Add agent framework/throttling/hidden model/OS assistant and update conversational search documentation (#6354)
* Add agent framework documentation Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Add hidden model and API updates Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Vale error Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Updated field names Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Add updating credentials Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Added tools table Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Add OpenSearch forum thread for OS Assistant Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Add tech review for conv search Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Fix links Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Add tools Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Add links to tools Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* More info about tools Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Tool parameters Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Update cat-index-tool.md Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Parameter clarification Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Tech review feedback Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Typo fix Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* More tech review feedback: RAG tool Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Tech review feedback: memory APIs Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Update _ml-commons-plugin/agents-tools/index.md Co-authored-by: Melissa Vagi <vagimeli@amazon.com> Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Update _ml-commons-plugin/agents-tools/tools/neural-sparse-tool.md Co-authored-by: Melissa Vagi <vagimeli@amazon.com> Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Update _ml-commons-plugin/agents-tools/tools/neural-sparse-tool.md Co-authored-by: Melissa Vagi <vagimeli@amazon.com> Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Update _ml-commons-plugin/agents-tools/tools/neural-sparse-tool.md Co-authored-by: Melissa Vagi <vagimeli@amazon.com> Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Update _ml-commons-plugin/opensearch-assistant.md Co-authored-by: Melissa Vagi <vagimeli@amazon.com> Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Update _ml-commons-plugin/agents-tools/tools/ppl-tool.md Co-authored-by: Melissa Vagi <vagimeli@amazon.com> Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Apply suggestions from code review Co-authored-by: Melissa Vagi <vagimeli@amazon.com> Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Separated search and get APIs and add conversational flow agent Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* More parameters for PPL tool Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Added more parameters Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Tech review feedback: PPL tool Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Apply suggestions from code review Co-authored-by: Nathan Bower <nbower@amazon.com> Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Rename to automating configurations Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Editorial comments on the new text Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Add parameter to PPL tool Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Changed link to configurations Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Rate limiter feedback and added warning Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
parent d706d2e0ac
commit 3f7468b504
@@ -59,6 +59,8 @@ Open Distro
OpenAI
OpenID Connect
OpenSearch
+OpenSearch Assistant
+OpenSearch Assistant Toolkit
OpenSearch Benchmark
OpenSearch Dashboards
OpenSearch Playground
@@ -17,7 +17,7 @@ Creating a workflow adds the content of a workflow template to the flow framewor
* Workflow step fields with invalid values.
* Workflow graph (node/edge) configurations containing cycles or with duplicate IDs.

-To obtain the validation template for workflow steps, call the [Get Workflow Steps API]({{site.url}}{{site.baseurl}}/automating-workflows/api/get-workflow-steps/).
+To obtain the validation template for workflow steps, call the [Get Workflow Steps API]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-steps/).

Once a workflow is created, provide its `workflow_id` to other APIs.
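As an editorial aside, once you have the `workflow_id`, retrieving the stored workflow might look like the following sketch (the ID here is a placeholder borrowed from the example later in this diff):

```json
GET /_plugins/_flow_framework/workflow/8xL8bowB8y25Tqfenm50
```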
@@ -50,7 +50,7 @@ POST /_plugins/_flow_framework/workflow?provision=true
```
{% include copy-curl.html %}

-When set to `true`, the [Provision Workflow API]({{site.url}}{{site.baseurl}}/automating-workflows/api/provision-workflow/) is executed immediately following creation.
+When set to `true`, the [Provision Workflow API]({{site.url}}{{site.baseurl}}/automating-configurations/api/provision-workflow/) is executed immediately following creation.

By default, workflows are validated when they are created to ensure that the syntax is valid and that the graph does not contain cycles. This behavior can be controlled with the `validation` query parameter. If `validation` is set to `all`, OpenSearch performs a complete template validation. Any other value of the `validation` parameter suppresses validation, allowing an incomplete/work-in-progress template to be saved. To disable template validation, set `validation` to `none`:
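For illustration, a create request that skips validation might look like the following sketch (the template body is omitted here):

```json
POST /_plugins/_flow_framework/workflow?validation=none
```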
@@ -10,7 +10,7 @@ nav_order: 70
This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/flow-framework/issues/475).
{: .warning}

-When you no longer need a workflow, you can deprovision its resources. Most workflow steps that create a resource have corresponding workflow steps to reverse that action. To retrieve all resources currently created for a workflow, call the [Get Workflow Status API]({{site.url}}{{site.baseurl}}/automating-workflows/api/get-workflow-status/). When you call the Deprovision Workflow API, resources included in the `resources_created` field of the Get Workflow Status API response will be removed using a workflow step corresponding to the one that provisioned them.
+When you no longer need a workflow, you can deprovision its resources. Most workflow steps that create a resource have corresponding workflow steps to reverse that action. To retrieve all resources currently created for a workflow, call the [Get Workflow Status API]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-status/). When you call the Deprovision Workflow API, resources included in the `resources_created` field of the Get Workflow Status API response will be removed using a workflow step corresponding to the one that provisioned them.

The workflow executes the provisioning workflow steps in reverse order. If failures occur because of resource dependencies, such as preventing deletion of a registered model if it is still deployed, the workflow attempts retries.
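A deprovisioning call might look like the following sketch, assuming the standard flow framework path and a placeholder workflow ID:

```json
POST /_plugins/_flow_framework/workflow/8xL8bowB8y25Tqfenm50/_deprovision
```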
@@ -56,6 +56,6 @@ If deprovisioning did not completely remove all resources, OpenSearch responds w
In some cases, the failure happens because of another dependent resource that took some time to be removed. In this case, you can attempt to send the same request again.
{: .tip}

-To obtain a more detailed deprovisioning status than is provided by the summary in the error response, query the [Get Workflow Status API]({{site.url}}{{site.baseurl}}/automating-workflows/api/get-workflow-status/).
+To obtain a more detailed deprovisioning status than is provided by the summary in the error response, query the [Get Workflow Status API]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-status/).

On success, the workflow returns to a `NOT_STARTED` state. If some resources have not yet been removed, they are provided in the response.
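For example, a detailed status query might look like the following sketch (the `all=true` query parameter for returning the full state is an assumption here; verify it against the API reference):

```json
GET /_plugins/_flow_framework/workflow/8xL8bowB8y25Tqfenm50/_status?all=true
```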
@@ -10,7 +10,7 @@ nav_order: 40
This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/flow-framework/issues/475).
{: .warning}

-[Provisioning a workflow]({{site.url}}{{site.baseurl}}/automating-workflows/api/provision-workflow/) may take a significant amount of time, particularly when the action is associated with OpenSearch indexing operations. The Get Workflow State API permits monitoring of the provisioning deployment status until it is complete.
+[Provisioning a workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/provision-workflow/) may take a significant amount of time, particularly when the action is associated with OpenSearch indexing operations. The Get Workflow State API permits monitoring of the provisioning deployment status until it is complete.

## Path and HTTP methods
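As a hedged sketch, the path likely takes the following form, with `<workflow_id>` as a placeholder; confirm against the published API reference:

```json
GET /_plugins/_flow_framework/workflow/<workflow_id>/_status
```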
@@ -47,4 +47,4 @@ To retrieve a template in JSON format, specify `Content-Type: application/json`
curl -XGET "http://localhost:9200/_plugins/_flow_framework/workflow/8xL8bowB8y25Tqfenm50" -H 'Content-Type: application/json'
```

-OpenSearch responds with the stored template containing the same content as the body of the [create workflow]({{site.url}}{{site.baseurl}}/automating-workflows/api/create-workflow/) request. The order of fields in the returned template may not exactly match the original template but will function identically.
+OpenSearch responds with the stored template containing the same content as the body of the [create workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/create-workflow/) request. The order of fields in the returned template may not exactly match the original template but will function identically.
@@ -13,11 +13,11 @@ This is an experimental feature and is not recommended for use in a production e

OpenSearch supports the following workflow APIs:

-* [Create or update workflow]({{site.url}}{{site.baseurl}}/automating-workflows/api/create-workflow/)
-* [Get workflow]({{site.url}}{{site.baseurl}}/automating-workflows/api/get-workflow/)
-* [Provision workflow]({{site.url}}{{site.baseurl}}/automating-workflows/api/provision-workflow/)
-* [Get workflow status]({{site.url}}{{site.baseurl}}/automating-workflows/api/get-workflow-status/)
-* [Get workflow steps]({{site.url}}{{site.baseurl}}/automating-workflows/api/get-workflow-steps/)
-* [Search workflow]({{site.url}}{{site.baseurl}}/automating-workflows/api/search-workflow/)
-* [Deprovision workflow]({{site.url}}{{site.baseurl}}/automating-workflows/api/deprovision-workflow/)
-* [Delete workflow]({{site.url}}{{site.baseurl}}/automating-workflows/api/delete-workflow/)
+* [Create or update workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/create-workflow/)
+* [Get workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow/)
+* [Provision workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/provision-workflow/)
+* [Get workflow status]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-status/)
+* [Get workflow steps]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-steps/)
+* [Search workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/search-workflow/)
+* [Deprovision workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/deprovision-workflow/)
+* [Delete workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/delete-workflow/)
|
@ -12,7 +12,7 @@ This is an experimental feature and is not recommended for use in a production e
|
|||
|
||||
Provisioning a workflow is a one-time setup process usually performed by a cluster administrator to create resources that will be used by end users.
|
||||
|
||||
The `workflows` template field may contain multiple workflows. The workflow with the `provision` key can be executed with this API. This API is also executed when the [Create or Update Workflow API]({{site.url}}{{site.baseurl}}/automating-workflows/api/create-workflow/) is called with the `provision` parameter set to `true`.
|
||||
The `workflows` template field may contain multiple workflows. The workflow with the `provision` key can be executed with this API. This API is also executed when the [Create or Update Workflow API]({{site.url}}{{site.baseurl}}/automating-configurations/api/create-workflow/) is called with the `provision` parameter set to `true`.
|
||||
|
||||
You can only provision a workflow if it has not yet been provisioned. Deprovision the workflow if you need to repeat provisioning.
|
||||
{: .note}
|
||||
|
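A provisioning request might look like the following sketch (placeholder workflow ID; the template itself was supplied at creation time):

```json
POST /_plugins/_flow_framework/workflow/8xL8bowB8y25Tqfenm50/_provision
```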
@@ -48,4 +48,4 @@ OpenSearch responds with the same `workflow_id` that was used in the request:
}
```

-To obtain the provisioning status, query the [Get Workflow State API]({{site.url}}{{site.baseurl}}/automating-workflows/api/get-workflow-status/).
+To obtain the provisioning status, query the [Get Workflow State API]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-status/).
@@ -1,13 +1,13 @@
---
layout: default
-title: Automating workflows
+title: Automating configurations
nav_order: 1
has_children: false
nav_exclude: true
-redirect_from: /automating-workflows/
+redirect_from: /automating-configurations/
---

-# Automating workflows
+# Automating configurations
**Introduced 2.12**
{: .label .label-purple }
@@ -16,7 +16,7 @@ This is an experimental feature and is not recommended for use in a production e

You can automate complex OpenSearch setup and preprocessing tasks by providing templates for common use cases. For example, automating machine learning (ML) setup tasks streamlines the use of OpenSearch ML offerings.

-In OpenSearch 2.12, workflow automation is limited to ML tasks.
+In OpenSearch 2.12, configuration automation is limited to ML tasks.
{: .info}

OpenSearch use case templates provide a compact description of the setup process in a JSON or YAML document. These templates describe automated workflow configurations for conversational chat or query generation, AI connectors, tools, agents, and other components that prepare OpenSearch as a backend for generative models. For template examples, see [Sample templates](https://github.com/opensearch-project/flow-framework/tree/main/sample-templates).
@@ -38,12 +38,12 @@ Workflow automation provides the following benefits:
* **Workflows**: One or more workflows containing the following elements:
  * **User input**: Parameters expected from the user that are specific to the steps in this workflow.
  * **Workflow Steps**: The workflow steps described as a directed acyclic graph (DAG):
-    * ***Nodes*** describe steps of the process, which may be executed in parallel. For the syntax of workflow steps, see [Workflow steps]({{site.url}}{{site.baseurl}}/automating-workflows/workflow-steps/).
+    * ***Nodes*** describe steps of the process, which may be executed in parallel. For the syntax of workflow steps, see [Workflow steps]({{site.url}}{{site.baseurl}}/automating-configurations/workflow-steps/).
    * ***Edges*** sequence nodes to be executed after the previous step is complete and may use the output fields of previous steps. When a node includes a key in the `previous_node_inputs` map referring to a previous node's workflow step, a corresponding edge is automatically added to the template during parsing and may be omitted for the sake of simplicity (see the sketch following this list).
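To make the node/edge structure concrete, here is a minimal sketch of a `workflows` section; the step types and IDs are illustrative, and the `source`/`dest` edge field names follow the sample templates but should be verified:

```json
"workflows": {
  "provision": {
    "nodes": [
      {
        "id": "register_model",
        "type": "register_remote_model",
        "user_inputs": { "name": "my-model" }
      },
      {
        "id": "deploy_model",
        "type": "deploy_model",
        "previous_node_inputs": { "register_model": "model_id" }
      }
    ],
    "edges": [
      { "source": "register_model", "dest": "deploy_model" }
    ]
  }
}
```

Note that the explicit `edges` entry is redundant in this sketch because the `previous_node_inputs` reference already implies it.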

## Next steps

-- For supported APIs, see [Workflow APIs]({{site.url}}{{site.baseurl}}/automating-workflows/api/index/).
-- For the workflow step syntax, see [Workflow steps]({{site.url}}{{site.baseurl}}/automating-workflows/workflow-steps/).
-- For a complete example, see [Workflow tutorial]({{site.url}}{{site.baseurl}}/automating-workflows/workflow-tutorial/).
-- For configurable settings, see [Workflow settings]({{site.url}}{{site.baseurl}}/automating-workflows/workflow-settings/).
+- For supported APIs, see [Workflow APIs]({{site.url}}{{site.baseurl}}/automating-configurations/api/index/).
+- For the workflow step syntax, see [Workflow steps]({{site.url}}{{site.baseurl}}/automating-configurations/workflow-steps/).
+- For a complete example, see [Workflow tutorial]({{site.url}}{{site.baseurl}}/automating-configurations/workflow-tutorial/).
+- For configurable settings, see [Workflow settings]({{site.url}}{{site.baseurl}}/automating-configurations/workflow-settings/).
@@ -20,7 +20,7 @@ Workflow steps are actively being developed to expand automation capabilities. W
|`id` |String |Required | A user-provided ID for the step. The ID must be unique within a given workflow and is useful for identifying resources created by the step. For example, a `register_agent` step may return an `agent_id` that has been registered. Using this ID, you can determine which step produced which resource. |
|`type` |String |Required |The type of action to take, such as `deploy_model`, which corresponds to the API for which the step is used. Multiple steps may share the same type but must each have their own unique ID. For a list of supported types, see [Workflow step types](#workflow-step-types). |
|`previous_node_inputs` |Object |Optional | A key-value map specifying user inputs that are produced by a previous step in the workflow. For each key-value pair, the key is the previous step's `id` and the value is an API body field name (such as `model_id`) that will be produced as an output of a previous step in the workflow. For example, `register_remote_model` (key) may produce a `model_id` (value) that is required for a subsequent `deploy_model` step. <br> A graph edge is automatically added to the workflow connecting the previous step's key as the source and the current node as the destination. <br>In some cases, you can include [additional inputs](#additional-fields) in this field. |
-|`user_inputs` |Object |Optional | A key-value map of inputs supported by the corresponding API for this specific step. Some inputs are required for an API, while others are optional. Required inputs may be specified here, if known, or in the `previous_node_inputs` field. The [Get Workflow Steps API]({{site.url}}{{site.baseurl}}/automating-workflows/api/get-workflow-steps/) identifies required inputs and step outputs. <br> Substitutions are supported in string values, lists of strings, and maps with string values. The pattern `{% raw %}${{previous_step_id.output_key}}{% endraw %}` will be replaced by the value in the previous step's output with the given key. For example, if a parameter map in the user inputs includes a key `embedding_model_id` with a value `{% raw %}${{deploy_embedding_model.model_id}}{% endraw %}`, then the `model_id` output of the `deploy_embedding_model` step will be substituted here. This performs a similar function to the `previous_node_input` map but is not validated and does not automatically infer edges. <br>In some cases, you can include [additional inputs](#additional-fields) in this field. |
+|`user_inputs` |Object |Optional | A key-value map of inputs supported by the corresponding API for this specific step. Some inputs are required for an API, while others are optional. Required inputs may be specified here, if known, or in the `previous_node_inputs` field. The [Get Workflow Steps API]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-steps/) identifies required inputs and step outputs. <br> Substitutions are supported in string values, lists of strings, and maps with string values. The pattern `{% raw %}${{previous_step_id.output_key}}{% endraw %}` will be replaced by the value in the previous step's output with the given key. For example, if a parameter map in the user inputs includes a key `embedding_model_id` with a value `{% raw %}${{deploy_embedding_model.model_id}}{% endraw %}`, then the `model_id` output of the `deploy_embedding_model` step will be substituted here. This performs a similar function to the `previous_node_inputs` map but is not validated and does not automatically infer edges. <br>In some cases, you can include [additional inputs](#additional-fields) in this field. |
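As an illustration of the table above, a single workflow step that consumes a previous step's output might look like the following sketch (the step IDs are hypothetical):

```json
{
  "id": "deploy_embedding_model",
  "type": "deploy_model",
  "previous_node_inputs": {
    "register_embedding_model": "model_id"
  }
}
```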

## Workflow step types
@@ -61,4 +61,4 @@ You can include the following additional fields in the `previous_node_inputs` fi

## Example workflow steps

-For example workflow step implementations, see the [Workflow tutorial]({{site.url}}{{site.baseurl}}/automating-workflows/workflow-tutorial/).
+For example workflow step implementations, see the [Workflow tutorial]({{site.url}}{{site.baseurl}}/automating-configurations/workflow-tutorial/).
@@ -112,7 +112,7 @@ collections:
  about:
    permalink: /:collection/:path/
    output: true
-  automating-workflows:
+  automating-configurations:
    permalink: /:collection/:path/
    output: true
  dashboards-assistant:
@@ -172,11 +172,8 @@ opensearch_collection:
  ml-commons-plugin:
    name: Machine learning
    nav_fold: true
-  dashboards-assistant:
-    name: Dashboards assistant
-    nav_fold: true
-  automating-workflows:
-    name: Automating workflows
+  automating-configurations:
+    name: Automating configurations
    nav_fold: true
  monitoring-your-cluster:
    name: Monitoring your cluster
@@ -9,8 +9,8 @@ has_toc: false
This is an experimental feature and is not recommended for use in a production environment. For updates on the feature's progress or to leave feedback, go to the [`dashboards-assistant` repository](https://github.com/opensearch-project/dashboards-assistant) on GitHub or the associated [OpenSearch forum thread](https://forum.opensearch.org/t/feedback-opensearch-assistant/16741).
{: .warning}

-For more information about ways to enable experimental features, see [Experimental feature flags]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/experimental/).
-{: .note}
+Note that machine learning models are probabilistic and that some may perform better than others, so the OpenSearch Assistant may occasionally produce inaccurate information. We recommend evaluating outputs for accuracy as appropriate to your use case, including reviewing the output or combining it with other verification factors.
+{: .important}

# OpenSearch Assistant for OpenSearch Dashboards
Introduced 2.12
@@ -49,11 +49,14 @@ A screenshot of the interface is shown in the following image.

<img width="700" src="{{site.url}}{{site.baseurl}}/images/dashboards/opensearch-assistant-full-frame.png" alt="OpenSearch Assistant interface">

+For more information about ways to enable experimental features, see [Experimental feature flags]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/experimental/).
+{: .note}
+
## Configuring OpenSearch Assistant

You can use the OpenSearch Dashboards interface to configure OpenSearch Assistant. Go to the [Getting started guide](https://github.com/opensearch-project/dashboards-assistant/blob/main/GETTING_STARTED_GUIDE.md) for step-by-step instructions. For the chatbot template, go to the [Flow Framework plugin](https://github.com/opensearch-project/flow-framework) documentation. You can modify this template to use your own model and customize the chatbot tools.

-For information about configuring OpenSearch Assistant through the REST API, see OpenSearch Assistant toolkit.
+For information about configuring OpenSearch Assistant through the REST API, see [OpenSearch Assistant Toolkit]({{site.url}}{{site.baseurl}}/ml-commons-plugin/opensearch-assistant/).

## Using OpenSearch Assistant in OpenSearch Dashboards
@@ -124,4 +127,4 @@ The following screenshot shows a saved conversation, along with actions you can
## Related articles

- [Getting started guide for OpenSearch Assistant in OpenSearch Dashboards](https://github.com/opensearch-project/dashboards-assistant/blob/main/GETTING_STARTED_GUIDE.md)
-- OpenSearch Assistant configuration through the REST API
+- [OpenSearch Assistant configuration through the REST API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/opensearch-assistant/)
@@ -21,7 +21,8 @@ Here's a glance at the view you see when you open the **Dashboard** or **Discove

{::nomarkdown}<img src="{{site.url}}{{site.baseurl}}/images/icons/alert-icon.png" class="inline-icon" alt="alert icon"/>{:/} **Note**<br>Before you get started, make sure you've installed OpenSearch and OpenSearch Dashboards. For information about installation and configuration, see [Install and configure OpenSearch]({{site.url}}{{site.baseurl}}/install-and-configure/install-opensearch/index/) and [Install and configure OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/install-and-configure/install-dashboards/index/).
+{: .note}

-# Adding sample data
+## Adding sample data

Sample datasets come with visualizations, dashboards, and other tools to help you explore Dashboards before you add your own data. To add sample data, perform the following steps:
@@ -31,7 +32,7 @@ Sample datasets come with visualizations, dashboards, and other tools to help yo

<img src="{{site.url}}{{site.baseurl}}/images/dashboards/add-sample-data.png" alt="Sample datasets" width="700">

-# Exploring and inspecting data
+## Exploring and inspecting data

In [**Discover**]({{site.url}}{{site.baseurl}}/dashboards/discover/index-discover/), you can:
@@ -39,7 +40,7 @@ In [**Discover**]({{site.url}}{{site.baseurl}}/dashboards/discover/index-discove
- Explore the data, view individual documents, and create tables summarizing the data.
- Visualize your findings.

-## Try it: Getting familiar with Discover
+### Try it: Getting familiar with Discover

1. On the OpenSearch Dashboards **Home** page, choose **Discover**.
1. Change the [time filter]({{site.url}}{{site.baseurl}}/dashboards/discover/time-filter/) to **Last 7 days**, as shown in the following image.
@@ -54,7 +55,7 @@ In [**Discover**]({{site.url}}{{site.baseurl}}/dashboards/discover/index-discove

<img src="{{site.url}}{{site.baseurl}}/images/dashboards/filter-data-discover.png" alt="Filter data by FlightDelayType field" width="250"/>

-# Visualizing data
+## Visualizing data

Raw data can be difficult to comprehend and use. Data visualizations help you prepare and present data in a visual form. In **Dashboard** you can:
@@ -63,7 +64,7 @@ Raw data can be difficult to comprehend and use. Data visualizations help you pr
- Create and share reports.
- Embed analytics to differentiate your applications.

-## Try it: Getting familiar with Dashboard
+### Try it: Getting familiar with Dashboard

1. On the OpenSearch Dashboards **Home** page, choose **Dashboard**.
1. Choose **[Flights] Global Flight Data** in the **Dashboards** window, as shown in the following image.
@@ -77,7 +78,7 @@ Raw data can be difficult to comprehend and use. Data visualizations help you pr

<img src="{{site.url}}{{site.baseurl}}/images/dashboards/add-panel.png" alt="Add panel to dashboard" width="700"/>

-## Try it: Creating a visualization panel
+### Try it: Creating a visualization panel

Continuing with the preceding dashboard, you'll create a bar chart comparing the number of canceled flights and delayed flights to delay type and then add the panel to the dashboard:
@@ -92,11 +93,11 @@ Continuing with the preceding dashboard, you'll create a bar chart comparing the

<img src="{{site.url}}{{site.baseurl}}/images/dashboards/viz-panel-quickstart.png" alt="Creating a visualization panel" width="700"/>

-# Interacting with data
+## Interacting with data

Interactive dashboards allow you to analyze data in more depth and filter it in several ways. In Dashboards, you can interact directly with data on a dashboard by using dashboard-level filters. For example, continuing with the preceding dashboard, you can filter to show delays and cancellations for a specific airline.

-## Try it: Interacting with the sample flight data
+### Try it: Interacting with the sample flight data

1. On the **[Flights] Airline Carrier** panel, choose **OpenSearch-Air**. The dashboard updates automatically.
1. Choose **Save** to save the customized dashboard.
@@ -112,7 +113,7 @@ Alternatively, you can apply filters using the dashboard toolbar:

<img src="{{site.url}}{{site.baseurl}}/images/interact-filter-dashboard.png" alt="Dashboard view after applying Carrier filter" width="700"/>

-# Next steps
+## Next steps

- **Visualize data**. To learn more about data visualizations in OpenSearch Dashboards, see [**Building data visualizations**]({{site.url}}{{site.baseurl}}/dashboards/visualize/viz-index/).
- **Create dashboards**. To learn more about creating dashboards in OpenSearch Dashboards, see [**Creating dashboards**]({{site.url}}{{site.baseurl}}/dashboards/quickstart-dashboards/).
@@ -27,7 +27,7 @@ For information about cross-cluster replication settings, see [Replication setti

## Flow Framework plugin settings

-For information about automatic workflow settings, see [Workflow settings]({{site.url}}{{site.baseurl}}/automating-workflows/workflow-settings/).
+For information about automatic workflow settings, see [Workflow settings]({{site.url}}{{site.baseurl}}/automating-configurations/workflow-settings/).

## Geospatial plugin settings
@@ -0,0 +1,357 @@
---
layout: default
title: Agents and tools tutorial
parent: Agents and tools
grand_parent: ML Commons APIs
nav_order: 10
---

# Agents and tools tutorial
**Introduced 2.12**
{: .label .label-purple }

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

The following tutorial illustrates creating a flow agent for retrieval-augmented generation (RAG). A flow agent runs its configured tools sequentially, in the order specified. In this example, you'll create an agent with two tools:

1. `VectorDBTool`: The agent will use this tool to retrieve OpenSearch documents relevant to the user question. You'll ingest supplementary information into an OpenSearch index. To facilitate vector search, you'll deploy a text embedding model that translates text into vector embeddings. OpenSearch will translate the ingested documents into embeddings and store them in the index. When you provide a user question to the agent, the agent will construct a query from the question, run vector search on the OpenSearch index, and pass the relevant retrieved documents to the `MLModelTool`.
1. `MLModelTool`: The agent will run this tool to connect to a large language model (LLM) and send the user query augmented with OpenSearch documents to the model. In this example, you'll use the [Anthropic Claude model hosted on Amazon Bedrock](https://aws.amazon.com/bedrock/claude/). The LLM will then answer the question based on its knowledge and the provided documents.

## Prerequisites

To use the memory feature, first configure the following cluster settings. This tutorial assumes that you have no dedicated machine learning (ML) nodes:

```json
PUT _cluster/settings
{
  "persistent": {
    "plugins.ml_commons.only_run_on_ml_node": "false",
    "plugins.ml_commons.memory_feature_enabled": "true"
  }
}
```
{% include copy-curl.html %}

For more information, see [ML Commons cluster settings]({{site.url}}{{site.baseurl}}/ml-commons-plugin/cluster-settings/).

## Step 1: Register and deploy a text embedding model

You need a text embedding model to facilitate vector search. For this tutorial, you'll use one of the OpenSearch-provided pretrained models. When selecting a model, note its dimensionality because you'll need to provide it when creating an index.

In this tutorial, you'll use the `huggingface/sentence-transformers/all-MiniLM-L12-v2` model, which generates 384-dimensional dense vector embeddings. To register and deploy the model, send the following request:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "huggingface/sentence-transformers/all-MiniLM-L12-v2",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
```
{% include copy-curl.html %}

Registering a model is an asynchronous task. OpenSearch returns a task ID for this task:

```json
{
  "task_id": "aFeif4oB5Vm0Tdw8yoN7",
  "status": "CREATED"
}
```

You can check the status of the task by calling the Tasks API:

```json
GET /_plugins/_ml/tasks/aFeif4oB5Vm0Tdw8yoN7
```
{% include copy-curl.html %}

Once the task is complete, the task state changes to `COMPLETED` and the Tasks API response includes a model ID for the deployed model:

```json
{
  "model_id": "aVeif4oB5Vm0Tdw8zYO2",
  "task_type": "REGISTER_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "COMPLETED",
  "worker_node": [
    "4p6FVOmJRtu3wehDD74hzQ"
  ],
  "create_time": 1694358489722,
  "last_update_time": 1694358499139,
  "is_async": true
}
```

## Step 2: Create an ingest pipeline

To translate text into vector embeddings, you'll set up an ingest pipeline. The pipeline translates the `text` field and writes the resulting vector embeddings into the `embedding` field. Create the pipeline by specifying the `model_id` from the previous step in the following request:

```json
PUT /_ingest/pipeline/test-pipeline-local-model
{
  "description": "text embedding pipeline",
  "processors": [
    {
      "text_embedding": {
        "model_id": "aVeif4oB5Vm0Tdw8zYO2",
        "field_map": {
          "text": "embedding"
        }
      }
    }
  ]
}
```
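As an optional aside (not part of the original tutorial), you can sanity-check the pipeline with the ingest Simulate API before indexing any documents:

```json
POST /_ingest/pipeline/test-pipeline-local-model/_simulate
{
  "docs": [
    {
      "_source": {
        "text": "hello world"
      }
    }
  ]
}
```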

## Step 3: Create a k-NN index and ingest data

Now you'll ingest supplementary data into an OpenSearch index. In OpenSearch, vectors are stored in a k-NN index. You can create a [k-NN index]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/) by sending the following request:

```json
PUT my_test_data
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text"
      },
      "embedding": {
        "type": "knn_vector",
        "dimension": 384
      }
    }
  },
  "settings": {
    "index": {
      "knn.space_type": "cosinesimil",
      "default_pipeline": "test-pipeline-local-model",
      "knn": "true"
    }
  }
}
```
{% include copy-curl.html %}

Then, ingest data into the index by using a bulk request:

```json
POST _bulk
{"index": {"_index": "my_test_data", "_id": "1"}}
{"text": "Chart and table of population level and growth rate for the Ogden-Layton metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\nThe current metro area population of Ogden-Layton in 2023 is 750,000, a 1.63% increase from 2022.\nThe metro area population of Ogden-Layton in 2022 was 738,000, a 1.79% increase from 2021.\nThe metro area population of Ogden-Layton in 2021 was 725,000, a 1.97% increase from 2020.\nThe metro area population of Ogden-Layton in 2020 was 711,000, a 2.16% increase from 2019."}
{"index": {"_index": "my_test_data", "_id": "2"}}
{"text": "Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."}
{"index": {"_index": "my_test_data", "_id": "3"}}
{"text": "Chart and table of population level and growth rate for the Chicago metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Chicago in 2023 is 8,937,000, a 0.4% increase from 2022.\\nThe metro area population of Chicago in 2022 was 8,901,000, a 0.27% increase from 2021.\\nThe metro area population of Chicago in 2021 was 8,877,000, a 0.14% increase from 2020.\\nThe metro area population of Chicago in 2020 was 8,865,000, a 0.03% increase from 2019."}
{"index": {"_index": "my_test_data", "_id": "4"}}
{"text": "Chart and table of population level and growth rate for the Miami metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Miami in 2023 is 6,265,000, a 0.8% increase from 2022.\\nThe metro area population of Miami in 2022 was 6,215,000, a 0.78% increase from 2021.\\nThe metro area population of Miami in 2021 was 6,167,000, a 0.74% increase from 2020.\\nThe metro area population of Miami in 2020 was 6,122,000, a 0.71% increase from 2019."}
{"index": {"_index": "my_test_data", "_id": "5"}}
{"text": "Chart and table of population level and growth rate for the Austin metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Austin in 2023 is 2,228,000, a 2.39% increase from 2022.\\nThe metro area population of Austin in 2022 was 2,176,000, a 2.79% increase from 2021.\\nThe metro area population of Austin in 2021 was 2,117,000, a 3.12% increase from 2020.\\nThe metro area population of Austin in 2020 was 2,053,000, a 3.43% increase from 2019."}
{"index": {"_index": "my_test_data", "_id": "6"}}
{"text": "Chart and table of population level and growth rate for the Seattle metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Seattle in 2023 is 3,519,000, a 0.86% increase from 2022.\\nThe metro area population of Seattle in 2022 was 3,489,000, a 0.81% increase from 2021.\\nThe metro area population of Seattle in 2021 was 3,461,000, a 0.82% increase from 2020.\\nThe metro area population of Seattle in 2020 was 3,433,000, a 0.79% increase from 2019."}
```
{% include copy-curl.html %}

## Step 4: Create a connector to an externally hosted model

You'll need an LLM to generate responses to user questions. An LLM is too large for an OpenSearch cluster, so you'll create a connection to an externally hosted LLM. For this example, you'll create a connector to the Anthropic Claude model hosted on Amazon Bedrock:

```json
POST /_plugins/_ml/connectors/_create
{
  "name": "BedRock test claude Connector",
  "description": "The connector to BedRock service for claude model",
  "version": 1,
  "protocol": "aws_sigv4",
  "parameters": {
    "region": "us-east-1",
    "service_name": "bedrock",
    "anthropic_version": "bedrock-2023-05-31",
    "endpoint": "bedrock.us-east-1.amazonaws.com",
    "auth": "Sig_V4",
    "content_type": "application/json",
    "max_tokens_to_sample": 8000,
    "temperature": 0.0001,
    "response_filter": "$.completion"
  },
  "credential": {
    "access_key": "<bedrock_access_key>",
    "secret_key": "<bedrock_secret_key>",
    "session_token": "<bedrock_session_token>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://bedrock-runtime.us-east-1.amazonaws.com/model/anthropic.claude-v2/invoke",
      "headers": {
        "content-type": "application/json",
        "x-amz-content-sha256": "required"
      },
      "request_body": "{\"prompt\":\"${parameters.prompt}\", \"max_tokens_to_sample\":${parameters.max_tokens_to_sample}, \"temperature\":${parameters.temperature}, \"anthropic_version\":\"${parameters.anthropic_version}\" }"
    }
  ]
}
```
{% include copy-curl.html %}

The response contains the connector ID for the newly created connector:

```json
{
  "connector_id": "a1eMb4kBJ1eYAeTMAljY"
}
```

## Step 5: Register and deploy the externally hosted model

Like the text embedding model, an LLM needs to be registered and deployed to OpenSearch. To set up the externally hosted model, first create a model group for this model:

```json
POST /_plugins/_ml/model_groups/_register
{
  "name": "test_model_group_bedrock",
  "description": "This is a public model group"
}
```
{% include copy-curl.html %}

The response contains the model group ID that you'll use to register a model to this model group:

```json
{
  "model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
  "status": "CREATED"
}
```

Next, register and deploy the externally hosted Claude model:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "Bedrock Claude V2 model",
  "function_name": "remote",
  "model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
  "description": "test model",
  "connector_id": "a1eMb4kBJ1eYAeTMAljY"
}
```
{% include copy-curl.html %}

Similarly to [Step 1](#step-1-register-and-deploy-a-text-embedding-model), the response contains a task ID that you can use to check the status of the deployment. Once the model is deployed, the status changes to `COMPLETED` and the response includes the model ID for the Claude model:

```json
{
  "model_id": "NWR9YIsBUysqmzBdifVJ",
  "task_type": "REGISTER_MODEL",
  "function_name": "remote",
  "state": "COMPLETED",
  "worker_node": [
    "4p6FVOmJRtu3wehDD74hzQ"
  ],
  "create_time": 1694358489722,
  "last_update_time": 1694358499139,
  "is_async": true
}
```

To test the LLM, send the following predict request:

```json
POST /_plugins/_ml/models/NWR9YIsBUysqmzBdifVJ/_predict
{
  "parameters": {
    "prompt": "\n\nHuman:hello\n\nAssistant:"
  }
}
```
{% include copy-curl.html %}

## Step 6: Register and execute an agent

Finally, you'll use the text embedding model created in Step 1 and the Claude model created in Step 5 to create a flow agent. This flow agent will run a `VectorDBTool` and then an `MLModelTool`. The `VectorDBTool` is configured with the model ID for the text embedding model created in Step 1 for vector search. The `MLModelTool` is configured with the Claude model created in Step 5:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_RAG",
  "type": "flow",
  "description": "this is a test agent",
  "tools": [
    {
      "type": "VectorDBTool",
      "parameters": {
        "model_id": "aVeif4oB5Vm0Tdw8zYO2",
        "index": "my_test_data",
        "embedding_field": "embedding",
        "source_field": ["text"],
        "input": "${parameters.question}"
      }
    },
    {
      "type": "MLModelTool",
      "description": "A general tool to answer any question",
      "parameters": {
        "model_id": "NWR9YIsBUysqmzBdifVJ",
        "prompt": "\n\nHuman:You are a professional data analyst. You will always answer a question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say you don't know. \n\n Context:\n${parameters.VectorDBTool.output}\n\nHuman:${parameters.question}\n\nAssistant:"
      }
    }
  ]
}
```
{% include copy-curl.html %}

OpenSearch returns an agent ID for the newly created agent:

```json
{
  "agent_id": "879v9YwBjWKCe6Kg12Tx"
}
```

You can inspect the agent by sending a request to the `agents` endpoint and providing the agent ID:

```json
GET /_plugins/_ml/agents/879v9YwBjWKCe6Kg12Tx
```
{% include copy-curl.html %}

To execute the agent, send the following request. When registering the agent, you configured it to take in `parameters.question`, so you need to provide this parameter in the request. This parameter represents a human-generated user question:

```json
POST /_plugins/_ml/agents/879v9YwBjWKCe6Kg12Tx/_execute
{
  "parameters": {
    "question": "what's the population increase of Seattle from 2021 to 2023"
  }
}
```
{% include copy-curl.html %}

The LLM does not have the recent information in its knowledge base, so it infers the response to the question based on the ingested data, demonstrating RAG:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "result": """ Based on the given context, the key information is:

The metro area population of Seattle in 2021 was 3,461,000.
The metro area population of Seattle in 2023 is 3,519,000.

To calculate the population increase from 2021 to 2023:

Population in 2023 (3,519,000) - Population in 2021 (3,461,000) = 58,000

Therefore, the population increase of Seattle from 2021 to 2023 is 58,000."""
        }
      ]
    }
  ]
}
```
@@ -0,0 +1,181 @@
---
layout: default
title: Agents and tools
has_children: true
has_toc: false
nav_order: 27
---

# Agents and tools
**Introduced 2.12**
{: .label .label-purple }

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

You can automate machine learning (ML) tasks using agents and tools. An _agent_ orchestrates and runs ML models and tools. A _tool_ performs a set of specific tasks. Some examples of tools are the `VectorDBTool`, which supports vector search, and the `CatIndexTool`, which executes the `cat indices` operation. For a list of supported tools, see [Tools]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/index/).

## Agents

An _agent_ is a coordinator that uses a large language model (LLM) to solve a problem. After the LLM reasons and decides what action to take, the agent coordinates the action execution. OpenSearch supports the following agent types:

- [_Flow agent_](#flow-agents): Runs tools sequentially, in the order specified in its configuration. The workflow of a flow agent is fixed. Useful for retrieval-augmented generation (RAG).
- [_Conversational flow agent_](#conversational-flow-agents): Runs tools sequentially, in the order specified in its configuration. The workflow of a conversational flow agent is fixed. Stores conversation history so that users can ask follow-up questions. Useful for creating a chatbot.
- [_Conversational agent_](#conversational-agents): Reasons in order to provide a response based on the available knowledge, including the LLM knowledge base and a set of tools provided to the LLM. Stores conversation history so that users can ask follow-up questions. The workflow of a conversational agent is variable, based on follow-up questions. For specific questions, uses the Chain-of-Thought (CoT) process to select the best tool from the configured tools for providing a response to the question. Useful for creating a chatbot that employs RAG.

### Flow agents

A flow agent is configured with a set of tools that it runs in order. For example, the following agent runs the `VectorDBTool` and then the `MLModelTool`. The agent coordinates the tools so that one tool's output can become another tool's input. In this example, the `VectorDBTool` queries the k-NN index and the agent passes its output `${parameters.VectorDBTool.output}` to the `MLModelTool` as context, along with the `${parameters.question}` (see the `prompt` parameter):

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_RAG",
  "type": "flow",
  "description": "this is a test agent",
  "tools": [
    {
      "type": "VectorDBTool",
      "parameters": {
        "model_id": "YOUR_TEXT_EMBEDDING_MODEL_ID",
        "index": "my_test_data",
        "embedding_field": "embedding",
        "source_field": ["text"],
        "input": "${parameters.question}"
      }
    },
    {
      "type": "MLModelTool",
      "description": "A general tool to answer any question",
      "parameters": {
        "model_id": "YOUR_LLM_MODEL_ID",
        "prompt": "\n\nHuman:You are a professional data analyst. You will always answer a question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say you don't know. \n\n Context:\n${parameters.VectorDBTool.output}\n\nHuman:${parameters.question}\n\nAssistant:"
      }
    }
  ]
}
```

### Conversational flow agents

Similarly to a flow agent, a conversational flow agent is configured with a set of tools that it runs in order. The difference between them is that a conversational flow agent stores the conversation in an index (in the following example, the `conversation_index`). The following agent runs the `VectorDBTool` and then the `MLModelTool`:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "population data analysis agent",
  "type": "conversational_flow",
  "description": "This is a demo agent for population data analysis",
  "app_type": "rag",
  "memory": {
    "type": "conversation_index"
  },
  "tools": [
    {
      "type": "VectorDBTool",
      "name": "population_knowledge_base",
      "parameters": {
        "model_id": "YOUR_TEXT_EMBEDDING_MODEL_ID",
        "index": "test_population_data",
        "embedding_field": "population_description_embedding",
        "source_field": [
          "population_description"
        ],
        "input": "${parameters.question}"
      }
    },
    {
      "type": "MLModelTool",
      "name": "bedrock_claude_model",
      "description": "A general tool to answer any question",
      "parameters": {
        "model_id": "YOUR_LLM_MODEL_ID",
        "prompt": """

Human:You are a professional data analyst. You will always answer a question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say you don't know.

Context:
${parameters.population_knowledge_base.output:-}

${parameters.chat_history:-}

Human:${parameters.question}

Assistant:"""
      }
    }
  ]
}
```

### Conversational agents

Similarly to a conversational flow agent, a conversational agent stores the conversation in an index (in the following example, the `conversation_index`). A conversational agent can be configured with an LLM and a set of supplementary tools that perform specific jobs. For example, you can set up an LLM and a `CatIndexTool` when configuring an agent. When you send a question to the model, the agent also includes the `CatIndexTool` as context. The LLM then decides whether it needs to use the `CatIndexTool` to answer questions like "How many indexes are in my cluster?" The context allows an LLM to answer specific questions that are outside of its knowledge base. For example, the following agent is configured with an LLM and a `CatIndexTool` that retrieves information about your OpenSearch indexes:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_ReAct_ClaudeV2",
  "type": "conversational",
  "description": "this is a test agent",
  "llm": {
    "model_id": "YOUR_LLM_MODEL_ID",
    "parameters": {
      "max_iteration": 5,
      "stop_when_no_tool_found": true,
      "response_filter": "$.completion"
    }
  },
  "memory": {
    "type": "conversation_index"
  },
  "tools": [
    {
      "type": "VectorDBTool",
      "name": "VectorDBTool",
      "description": "A tool to search the OpenSearch index with a natural language question. If you don't know the answer to a question, you should always try to search data with this tool. Action Input: <natural language question>",
      "parameters": {
        "model_id": "YOUR_TEXT_EMBEDDING_MODEL_ID",
        "index": "my_test_data",
        "embedding_field": "embedding",
        "source_field": [ "text" ],
        "input": "${parameters.question}"
      }
    },
    {
      "type": "CatIndexTool",
      "name": "RetrieveIndexMetaTool",
      "description": "Use this tool to get OpenSearch index information: (health, status, index, uuid, primary count, replica count, docs.count, docs.deleted, store.size, primary.store.size)."
    }
  ],
  "app_type": "my app"
}
```

It is important to provide thorough descriptions of the tools so that the LLM can decide in which situations to use those tools.
{: .tip}
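For follow-up questions, conversational agents reuse the stored conversation. A hedged sketch of a follow-up execute call is shown below; the `memory_id` value is returned by the first execute response, and the parameter name is assumed from the memory APIs:

```json
POST /_plugins/_ml/agents/YOUR_AGENT_ID/_execute
{
  "parameters": {
    "question": "What was the population of Seattle in 2020?",
    "memory_id": "YOUR_MEMORY_ID"
  }
}
```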
|
||||
|
||||
## Enabling the feature
|
||||
|
||||
To enable agents and tools, configure the following setting:
|
||||
|
||||
```yaml
|
||||
plugins.ml_commons.agent_framework_enabled: true
|
||||
```
|
||||
{% include copy.html %}
|
||||
|
||||
For conversational agents, you also need to enable RAG for use in conversational search. To enable RAG, configure the following setting:
|
||||
|
||||
```yaml
|
||||
plugins.ml_commons.rag_pipeline_feature_enabled: true
|
||||
```
|
||||
{% include copy.html %}
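
As a sketch, you can also update both settings on a running cluster by using the Cluster Settings API, assuming both settings are dynamic in your OpenSearch version:

```json
PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.agent_framework_enabled": true,
    "plugins.ml_commons.rag_pipeline_feature_enabled": true
  }
}
```
{% include copy-curl.html %}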

For more information about ways to enable experimental features, see [Experimental feature flags]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/experimental/).

## Next steps

- For a list of supported tools, see [Tools]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/index/).
- For a step-by-step tutorial, see [Agents and tools tutorial]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/agents-tools-tutorial/).
- For supported APIs, see [Agent APIs]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/agent-apis/).
- To use agents and tools in configuration automation, see [Automating configurations]({{site.url}}{{site.baseurl}}/automating-configurations/index/).

---
layout: default
title: Agent tool
has_children: false
has_toc: false
nav_order: 10
parent: Tools
grand_parent: Agents and tools
---

<!-- vale off -->
# Agent tool
**Introduced 2.12**
{: .label .label-purple }
<!-- vale on -->

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

The `AgentTool` runs any agent.

## Step 1: Set up an agent for AgentTool to run

Set up any agent. For example, set up a flow agent that runs an `MLModelTool` by following the steps in the [ML Model Tool documentation]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/ml-model-tool/) and obtain its agent ID from [Step 3]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/ml-model-tool/#step-3-register-a-flow-agent-that-will-run-the-mlmodeltool):

```json
{
  "agent_id": "9X7xWI0Bpc3sThaJdY9i"
}
```

## Step 2: Register a flow agent that will run the AgentTool

A flow agent runs a sequence of tools in order and returns the last tool's output. To create a flow agent, send the following register agent request, providing the agent ID from the previous step:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test agent tool",
  "type": "flow",
  "description": "this is a test agent",
  "tools": [
    {
      "type": "AgentTool",
      "description": "A general agent to answer any question",
      "parameters": {
        "agent_id": "9X7xWI0Bpc3sThaJdY9i"
      }
    }
  ]
}
```
{% include copy-curl.html %}

For parameter descriptions, see [Register parameters](#register-parameters).

OpenSearch responds with an agent ID:

```json
{
  "agent_id": "EQyyZ40BT2tRrkdmhT7_"
}
```
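
To verify the registration, you can optionally retrieve the agent by its ID; this quick check assumes the Get Agent API is available in your version:

```json
GET /_plugins/_ml/agents/EQyyZ40BT2tRrkdmhT7_
```
{% include copy-curl.html %}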

## Step 3: Run the agent

Run the agent by sending the following request:

```json
POST /_plugins/_ml/agents/EQyyZ40BT2tRrkdmhT7_/_execute
{
  "parameters": {
    "question": "what's the population increase of Seattle from 2021 to 2023"
  }
}
```
{% include copy-curl.html %}

OpenSearch returns the inference results:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": " I do not have direct data on the population increase of Seattle from 2021 to 2023 in the context provided. As a data analyst, I would need to research population statistics from credible sources like the US Census Bureau to analyze population trends and make an informed estimate. Without looking up actual data, I don't have enough information to provide a specific answer to the question."
        }
      ]
    }
  ]
}
```

## Register parameters

The following table lists all tool parameters that are available when registering an agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`agent_id` | String | Required | The agent ID of the agent to run.

## Execute parameters

The following table lists all tool parameters that are available when running the agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`question` | String | Required | The natural language question to send to the LLM.

---
layout: default
title: CAT Index tool
has_children: false
has_toc: false
nav_order: 20
parent: Tools
grand_parent: Agents and tools
---

<!-- vale off -->
# CAT Index tool
**Introduced 2.12**
{: .label .label-purple }
<!-- vale on -->

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

The `CatIndexTool` retrieves index information for the OpenSearch cluster, similarly to the [CAT Indices API]({{site.url}}{{site.baseurl}}/api-reference/cat/cat-indices/).

## Step 1: Register a flow agent that will run the CatIndexTool

A flow agent runs a sequence of tools in order and returns the last tool's output. To create a flow agent, send the following register agent request:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_CatIndex_tool",
  "type": "flow",
  "description": "this is a test agent for the CatIndexTool",
  "tools": [
    {
      "type": "CatIndexTool",
      "name": "DemoCatIndexTool",
      "parameters": {
        "input": "${parameters.question}"
      }
    }
  ]
}
```
{% include copy-curl.html %}

For parameter descriptions, see [Register parameters](#register-parameters).

OpenSearch responds with an agent ID:

```json
{
  "agent_id": "9X7xWI0Bpc3sThaJdY9i"
}
```

## Step 2: Run the agent

Before you run the agent, make sure that you add the sample OpenSearch Dashboards `Sample eCommerce orders` dataset. To learn more, see [Adding sample data]({{site.url}}{{site.baseurl}}/dashboards/quickstart#adding-sample-data).

Then, run the agent by sending the following request:

```json
POST /_plugins/_ml/agents/9X7xWI0Bpc3sThaJdY9i/_execute
{
  "parameters": {
    "question": "How many indices do I have?"
  }
}
```
{% include copy-curl.html %}

OpenSearch returns the index information:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": """health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open .plugins-ml-model-group lHgGEgJhT_mpADyOZoXl2g 1 1 9 2 33.4kb 16.7kb
green open .plugins-ml-memory-meta b2LEpv0QS8K60QBjXtRm6g 1 1 13 0 95.1kb 47.5kb
green open .ql-datasources 9NXm_tMXQc6s_4uRToSNkQ 1 1 0 0 416b 208b
green open sample-ecommerce UPYOQcAfRGqFAlSxcZlRjw 1 1 40320 0 4.1mb 2mb
green open .plugins-ml-task xYTlprYCQnaaYici69SOjA 1 1 117 0 115.5kb 57.6kb
green open .opendistro_security 7DAqhm9QQmeEsQYhA40cJg 1 1 10 0 117kb 58.5kb
green open sample-host-health Na5tq6UiTt6r_qYME1vV-w 1 1 40320 0 2.6mb 1.3mb
green open .opensearch-observability 6PthtLluSKyYCdZR3Mw0iw 1 1 0 0 416b 208b
green open .plugins-ml-model WYcjBHcnRuSDHeVWPVupoA 1 1 191 45 4.2gb 2.1gb
green open index_for_neural_sparse GQswGabQRIazM_trnqaDrw 1 1 5 0 28.4kb 14.2kb
green open security-auditlog-2024.01.30 BhXR7Nd3QVOVGxJNpR0-jw 1 1 27768 0 13.8mb 7mb
green open sample-http-responses 0gmYYYdOTiCbVUvl_uDL0w 1 1 40320 0 2.5mb 1.2mb
green open security-auditlog-2024.02.01 2VD1ieDGS5m-TfjIdfT8Eg 1 1 39305 0 39mb 18.6mb
green open opensearch_dashboards_sample_data_ecommerce wnE6r7OvSPqc5YHj8wHSLA 1 1 4675 0 8.8mb 4.4mb
green open security-auditlog-2024.01.31 cNRK5-2eTwes0SRlXTl0RQ 1 1 34520 0 20.5mb 9.8mb
green open .plugins-ml-memory-message wTNBU4BBQVSFcFhNlUdfBQ 1 1 93 0 358.2kb 181.9kb
green open .plugins-flow-framework-state dJUNDv9MSJ2jjwKbzXPlrw 1 1 39 0 114.1kb 57kb
green open .plugins-ml-agent 7X1IzoLuSGmIujOh9i5mmg 1 1 30 0 170.7kb 85.3kb
green open .plugins-flow-framework-templates _ecC0KahTlmG_3tFUst7Uw 1 1 18 0 175.8kb 87.9kb
green open .plugins-ml-connector q45iJfVjQ5KgxeNC65DLSw 1 1 11 0 313.1kb 156.5kb
green open .kibana_1 vRjXK4bHSUueB_4iXiQ8yw 1 1 257 0 264kb 132kb
green open .plugins-ml-config G7gxGQB7TZeQzBasHd5PUg 1 1 1 0 7.8kb 3.9kb
green open .plugins-ml-controller NQTZPREZRhWoDdjCglRLFg 1 1 0 0 50.1kb 49.9kb
green open opensearch_dashboards_sample_data_logs 9gpOTB3rRgqBLvqis_k5LQ 1 1 14074 0 18mb 9mb
green open .plugins-flow-framework-config JlKPsCh6SEq-Jh6rPL_x9Q 1 1 1 0 7.8kb 3.9kb
green open opensearch_dashboards_sample_data_flights pJde0irnTce4-uobHwYmMQ 1 1 13059 0 11.9mb 5.9mb
green open my_test_data T4hwNs7CTJGIfw2QpCqQ_Q 1 1 6 0 91.7kb 45.8kb
green open .opendistro-job-scheduler-lock XjgmXAVKQ4e8Y-ac54VBzg 1 1 3 3 36.2kb 21.3kb
"""
        }
      ]
    }
  ]
}
```
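
Because the tool mirrors the CAT Indices API, you can cross-check its output by calling the API directly:

```json
GET _cat/indices?v
```
{% include copy-curl.html %}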

## Register parameters

The following table lists all tool parameters that are available when registering an agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`input` | String | Required | The user input used to return index information.
`index` | String | Optional | A comma-delimited list of one or more indexes on which to run the CAT operation. Default is an empty list, which means all indexes.
`local` | Boolean | Optional | Whether to return information from the local node only instead of the cluster manager node. Default is `false`.

## Execute parameters

The following table lists all tool parameters that are available when running the agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`question` | String | Required | The natural language question to send to the LLM.

---
layout: default
title: Index Mapping tool
has_children: false
has_toc: false
nav_order: 30
parent: Tools
grand_parent: Agents and tools
---

<!-- vale off -->
# Index Mapping tool
**Introduced 2.12**
{: .label .label-purple }
<!-- vale on -->

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

The `IndexMappingTool` retrieves mapping and setting information for indexes in your cluster.

## Step 1: Register a flow agent that will run the IndexMappingTool

A flow agent runs a sequence of tools in order and returns the last tool's output. To create a flow agent, send the following register agent request:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_IndexMapping_tool",
  "type": "flow",
  "description": "this is a test agent for the IndexMappingTool",
  "tools": [
    {
      "type": "IndexMappingTool",
      "name": "DemoIndexMappingTool",
      "parameters": {
        "index": "${parameters.index}",
        "input": "${parameters.question}"
      }
    }
  ]
}
```
{% include copy-curl.html %}

For parameter descriptions, see [Register parameters](#register-parameters).

OpenSearch responds with an agent ID:

```json
{
  "agent_id": "9X7xWI0Bpc3sThaJdY9i"
}
```

## Step 2: Run the agent

Before you run the agent, make sure that you add the sample OpenSearch Dashboards `Sample eCommerce orders` dataset. To learn more, see [Adding sample data]({{site.url}}{{site.baseurl}}/dashboards/quickstart#adding-sample-data).

Then, run the agent by sending the following request and providing the index name and the question:

```json
POST /_plugins/_ml/agents/9X7xWI0Bpc3sThaJdY9i/_execute
{
  "parameters": {
    "index": [ "sample-ecommerce" ],
    "question": "What fields are in the sample-ecommerce index?"
  }
}
```
{% include copy-curl.html %}

OpenSearch returns the mappings and settings for the specified index:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": """index: sample-ecommerce

mappings:
properties={items_purchased_failure={type=integer}, items_purchased_success={type=integer}, order_id={type=integer}, timestamp={type=date}, total_revenue_usd={type=integer}}


settings:
index.creation_date=1706752839713
index.number_of_replicas=1
index.number_of_shards=1
index.provided_name=sample-ecommerce
index.replication.type=DOCUMENT
index.uuid=UPYOQcAfRGqFAlSxcZlRjw
index.version.created=137217827


"""
        }
      ]
    }
  ]
}
```
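
The tool surfaces the same information as the standard mapping and settings APIs, so you can verify its output directly, for example:

```json
GET /sample-ecommerce/_mapping
```
{% include copy-curl.html %}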

## Register parameters

The following table lists all tool parameters that are available when registering an agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`input` | String | Required | The user input used to return index information.
`index` | Array | Required | A comma-delimited list of one or more indexes for which to obtain mapping and setting information. Default is an empty list, which means all indexes.
`local` | Boolean | Optional | Whether to return information from the local node only instead of the cluster manager node. Default is `false`.

## Execute parameters

The following table lists all tool parameters that are available when running the agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`question` | String | Required | The natural language question to send to the LLM.
`index` | Array | Optional | A comma-delimited list of one or more indexes for which to obtain mapping and setting information. Default is an empty list, which means all indexes.

---
layout: default
title: Tools
parent: Agents and tools
has_children: true
has_toc: false
nav_order: 20
redirect_from:
  - /ml-commons-plugin/extensibility/index/
---

# Tools
**Introduced 2.12**
{: .label .label-purple }

A _tool_ performs a set of specific tasks. Specify a tool by providing its `type`, `parameters`, and, optionally, a `description`. For example, you can specify an `AgentTool` as follows:

```json
{
  "type": "AgentTool",
  "description": "A general agent to answer any question",
  "parameters": {
    "agent_id": "9X7xWI0Bpc3sThaJdY9i"
  }
}
```

Each tool takes a list of parameters specific to that tool. In the preceding example, the `AgentTool` takes the `agent_id` of the agent it will run. For a list of parameters, see each tool's documentation. The following table lists all tools that OpenSearch supports.

|Tool | Description |
|:--- |:--- |
|[`AgentTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/agent-tool/) |Runs any agent. |
|[`CatIndexTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/cat-index-tool/) |Retrieves index information for the OpenSearch cluster. |
|[`IndexMappingTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/index-mapping-tool/) |Retrieves index mapping and setting information for an index. |
|[`MLModelTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/ml-model-tool/) |Runs machine learning models. |
|[`NeuralSparseSearchTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/neural-sparse-tool/) |Performs sparse vector retrieval. |
|[`PPLTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/ppl-tool/) |Translates natural language into a Piped Processing Language (PPL) query. |
|[`RAGTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/rag-tool/) |Uses neural search or neural sparse search to retrieve documents and integrates a large language model to summarize the answers. |
|[`SearchAlertsTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/search-alerts-tool/) |Searches for alerts. |
|[`SearchAnomalyDetectorsTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/search-anomaly-detectors/) |Searches for anomaly detectors. |
|[`SearchAnomalyResultsTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/search-anomaly-results/) |Searches anomaly detection results generated by anomaly detectors. |
|[`SearchIndexTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/search-index-tool/) |Searches an index using a query written in query domain-specific language (DSL). |
|[`SearchMonitorsTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/search-monitors-tool/) |Searches for alerting monitors. |
|[`VectorDBTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/vector-db-tool/) |Performs dense vector retrieval. |
|[`VisualizationTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/visualization-tool/) |Finds visualizations in OpenSearch Dashboards. |

## Developer information

The agents and tools framework is flexible and extensible. You can find the list of tools provided by OpenSearch in the [Tools library](https://github.com/opensearch-project/skills/tree/main/src/main/java/org/opensearch/agent/tools). For a different use case, you can build your own tool by implementing the [_Tool_ interface](https://github.com/opensearch-project/ml-commons/blob/2.x/spi/src/main/java/org/opensearch/ml/common/spi/tools/Tool.java).

---
layout: default
title: ML Model tool
has_children: false
has_toc: false
nav_order: 40
parent: Tools
grand_parent: Agents and tools
---

<!-- vale off -->
# ML Model tool
**Introduced 2.12**
{: .label .label-purple }
<!-- vale on -->

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

The `MLModelTool` runs a machine learning (ML) model and returns inference results.

## Step 1: Create a connector for a model

The following example request creates a connector for a model hosted on [Amazon SageMaker](https://aws.amazon.com/pm/sagemaker/):

```json
POST /_plugins/_ml/connectors/_create
{
  "name": "sagemaker model",
  "description": "Test connector for Sagemaker model",
  "version": 1,
  "protocol": "aws_sigv4",
  "credential": {
    "access_key": "<YOUR ACCESS KEY>",
    "secret_key": "<YOUR SECRET KEY>"
  },
  "parameters": {
    "region": "us-east-1",
    "service_name": "sagemaker"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "content-type": "application/json"
      },
      "url": "<YOUR SAGEMAKER ENDPOINT>",
      "request_body": """{"prompt":"${parameters.prompt}"}"""
    }
  ]
}
```
{% include copy-curl.html %}

OpenSearch responds with a connector ID:

```json
{
  "connector_id": "eJATWo0BkIylWTeYToTn"
}
```

## Step 2: Register and deploy the model

To register and deploy the model to OpenSearch, send the following request, providing the connector ID from the previous step:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "remote-inference",
  "function_name": "remote",
  "description": "test model",
  "connector_id": "eJATWo0BkIylWTeYToTn"
}
```
{% include copy-curl.html %}

OpenSearch responds with a model ID:

```json
{
  "task_id": "7X7pWI0Bpc3sThaJ4I8R",
  "status": "CREATED",
  "model_id": "h5AUWo0BkIylWTeYT4SU"
}
```
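
Registration and deployment run asynchronously. To confirm that deployment has completed, you can query the Tasks API using the returned task ID:

```json
GET /_plugins/_ml/tasks/7X7pWI0Bpc3sThaJ4I8R
```
{% include copy-curl.html %}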

## Step 3: Register a flow agent that will run the MLModelTool

A flow agent runs a sequence of tools in order and returns the last tool's output. To create a flow agent, send the following register agent request, providing the model ID in the `model_id` parameter:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test agent for embedding model",
  "type": "flow",
  "description": "this is a test agent",
  "tools": [
    {
      "type": "MLModelTool",
      "description": "A general tool to answer any question",
      "parameters": {
        "model_id": "h5AUWo0BkIylWTeYT4SU",
        "prompt": "\n\nHuman:You are a professional data analyst. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say don't know. \n\nHuman:${parameters.question}\n\nAssistant:"
      }
    }
  ]
}
```
{% include copy-curl.html %}

For parameter descriptions, see [Register parameters](#register-parameters).

OpenSearch responds with an agent ID:

```json
{
  "agent_id": "9X7xWI0Bpc3sThaJdY9i"
}
```

## Step 4: Run the agent

Run the agent by sending the following request:

```json
POST /_plugins/_ml/agents/9X7xWI0Bpc3sThaJdY9i/_execute
{
  "parameters": {
    "question": "what's the population increase of Seattle from 2021 to 2023"
  }
}
```
{% include copy-curl.html %}

OpenSearch returns the inference results:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": " I do not have direct data on the population increase of Seattle from 2021 to 2023 in the context provided. As a data analyst, I would need to research population statistics from credible sources like the US Census Bureau to analyze population trends and make an informed estimate. Without looking up actual data, I don't have enough information to provide a specific answer to the question."
        }
      ]
    }
  ]
}
```

## Register parameters

The following table lists all tool parameters that are available when registering an agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`model_id` | String | Required | The model ID of the large language model (LLM) to use for generating the response.
`prompt` | String | Optional | The prompt to provide to the LLM.
`response_field` | String | Optional | The name of the response field. Default is `response`.

## Execute parameters

The following table lists all tool parameters that are available when running the agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`question` | String | Required | The natural language question to send to the LLM.

---
layout: default
title: Neural Sparse Search tool
has_children: false
has_toc: false
nav_order: 50
parent: Tools
grand_parent: Agents and tools
---

<!-- vale off -->
# Neural Sparse Search tool
**Introduced 2.12**
{: .label .label-purple }
<!-- vale on -->

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

The `NeuralSparseSearchTool` performs sparse vector retrieval. For more information about neural sparse search, see [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/).

## Step 1: Register and deploy a sparse encoding model

OpenSearch supports several pretrained sparse encoding models. You can either use one of those models or your own custom model. For a list of supported pretrained models, see [Sparse encoding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models). For more information, see [OpenSearch-provided pretrained models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/) and [Custom local models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/).

In this example, you'll use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` pretrained model for both ingestion and search. To register and deploy the model to OpenSearch, send the following request:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v1",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
```
{% include copy-curl.html %}

OpenSearch responds with a task ID for the model registration and deployment task:

```json
{
  "task_id": "M_9KY40Bk4MTqirc5lP8",
  "status": "CREATED"
}
```

You can monitor the status of the task by calling the Tasks API:

```json
GET _plugins/_ml/tasks/M_9KY40Bk4MTqirc5lP8
```
{% include copy-curl.html %}

Once the model is registered and deployed, the task `state` changes to `COMPLETED` and OpenSearch returns a model ID for the model:

```json
{
  "model_id": "Nf9KY40Bk4MTqirc6FO7",
  "task_type": "REGISTER_MODEL",
  "function_name": "SPARSE_ENCODING",
  "state": "COMPLETED",
  "worker_node": [
    "UyQSTQ3nTFa3IP6IdFKoug"
  ],
  "create_time": 1706767869692,
  "last_update_time": 1706767935556,
  "is_async": true
}
```

## Step 2: Ingest data into an index

First, you'll set up an ingest pipeline to encode documents using the sparse encoding model set up in the previous step:

```json
PUT /_ingest/pipeline/pipeline-sparse
{
  "description": "A sparse encoding ingest pipeline",
  "processors": [
    {
      "sparse_encoding": {
        "model_id": "Nf9KY40Bk4MTqirc6FO7",
        "field_map": {
          "passage_text": "passage_embedding"
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}

Next, create an index specifying the pipeline as the default pipeline:

```json
PUT index_for_neural_sparse
{
  "settings": {
    "default_pipeline": "pipeline-sparse"
  },
  "mappings": {
    "properties": {
      "passage_embedding": {
        "type": "rank_features"
      },
      "passage_text": {
        "type": "text"
      }
    }
  }
}
```
{% include copy-curl.html %}

Last, ingest data into the index by sending a bulk request:

```json
POST _bulk
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "1" } }
{ "passage_text" : "company AAA has a history of 123 years" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "2" } }
{ "passage_text" : "company AAA has over 7000 employees" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "3" } }
{ "passage_text" : "Jack and Mark established company AAA" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "4" } }
{ "passage_text" : "company AAA has a net profit of 13 millions in 2022" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "5" } }
{ "passage_text" : "company AAA focus on the large language models domain" }
```
{% include copy-curl.html %}

## Step 3: Register a flow agent that will run the NeuralSparseSearchTool

A flow agent runs a sequence of tools in order and returns the last tool's output. To create a flow agent, send the following request, providing the model ID for the model set up in Step 1. This model will encode your queries into sparse vector embeddings:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Neural_Sparse_Agent_For_RAG",
  "type": "flow",
  "tools": [
    {
      "type": "NeuralSparseSearchTool",
      "parameters": {
        "description": "use this tool to search data from the knowledge base of company AAA",
        "model_id": "Nf9KY40Bk4MTqirc6FO7",
        "index": "index_for_neural_sparse",
        "embedding_field": "passage_embedding",
        "source_field": ["passage_text"],
        "input": "${parameters.question}",
        "doc_size": 2
      }
    }
  ]
}
```
{% include copy-curl.html %}

For parameter descriptions, see [Register parameters](#register-parameters).

OpenSearch responds with an agent ID:

```json
{
  "agent_id": "9X7xWI0Bpc3sThaJdY9i"
}
```

## Step 4: Run the agent

Run the agent by sending the following request:

```json
POST /_plugins/_ml/agents/9X7xWI0Bpc3sThaJdY9i/_execute
{
  "parameters": {
    "question": "how many employees does AAA have?"
  }
}
```
{% include copy-curl.html %}

OpenSearch returns the inference results:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": """{"_index":"index_for_neural_sparse","_source":{"passage_text":"company AAA has over 7000 employees"},"_id":"2","_score":30.586042}
{"_index":"index_for_neural_sparse","_source":{"passage_text":"company AAA has a history of 123 years"},"_id":"1","_score":16.088133}
"""
        }
      ]
    }
  ]
}
```
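
Internally, the tool's retrieval is similar to the following `neural_sparse` query. This sketch is for illustration only; the exact query that the tool builds may differ:

```json
GET /index_for_neural_sparse/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "how many employees does AAA have?",
        "model_id": "Nf9KY40Bk4MTqirc6FO7"
      }
    }
  },
  "size": 2
}
```
{% include copy-curl.html %}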

## Register parameters

The following table lists all tool parameters that are available when registering an agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`model_id` | String | Required | The model ID of the sparse encoding model to use at search time.
`index` | String | Required | The index to search.
`embedding_field` | String | Required | When the neural sparse model encodes raw text documents, the encoding result is saved in a field. Specify this field as the `embedding_field`. Neural sparse search matches documents to the query by calculating the similarity score between the query text and the text in the document's `embedding_field`.
`source_field` | String | Required | The document field or fields to return. You can provide a list of multiple fields as an array of strings, for example, `["field1", "field2"]`.
`input` | String | Required for flow agent | Runtime input sourced from flow agent parameters. If using a large language model (LLM), this field is populated with the LLM response.
`name` | String | Optional | The tool name. Useful when an LLM needs to select an appropriate tool for a task.
`description` | String | Optional | A description of the tool. Useful when an LLM needs to select an appropriate tool for a task.
`doc_size` | Integer | Optional | The number of documents to fetch. Default is `2`.

## Execute parameters

The following table lists all tool parameters that are available when running the agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`question` | String | Required | The natural language question to send to the LLM.

---
layout: default
title: PPL tool
has_children: false
has_toc: false
nav_order: 60
parent: Tools
grand_parent: Agents and tools
---

# PPL tool
**Introduced 2.12**
{: .label .label-purple }

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

The `PPLTool` translates natural language into a PPL query. The tool provides an `execute` flag to specify whether to run the query. If you set the flag to `true`, the `PPLTool` runs the query and returns the query and the results.

## Prerequisite

To create a PPL tool, you need a fine-tuned model that translates natural language into PPL queries. Alternatively, you can use large language models for prompt-based translation. The PPL tool supports the Anthropic Claude and OpenAI models.

## Step 1: Create a connector for a model

The following example request creates a connector for a model hosted on Amazon SageMaker:

```json
POST /_plugins/_ml/connectors/_create
{
  "name": "sagemaker: t2ppl",
  "description": "Test connector for Sagemaker t2ppl model",
  "version": 1,
  "protocol": "aws_sigv4",
  "credential": {
    "access_key": "<YOUR ACCESS KEY>",
    "secret_key": "<YOUR SECRET KEY>"
  },
  "parameters": {
    "region": "us-east-1",
    "service_name": "sagemaker"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "content-type": "application/json"
      },
      "url": "<YOUR SAGEMAKER ENDPOINT>",
      "request_body": """{"prompt":"${parameters.prompt}"}"""
    }
  ]
}
```
{% include copy-curl.html %}

OpenSearch responds with a connector ID:

```json
{
  "connector_id": "eJATWo0BkIylWTeYToTn"
}
```

For information about connecting to an Anthropic Claude model or OpenAI models, see [Connectors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/connectors/).

## Step 2: Register and deploy the model

To register and deploy the model to OpenSearch, send the following request, providing the connector ID from the previous step:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "remote-inference",
  "function_name": "remote",
  "description": "test model",
  "connector_id": "eJATWo0BkIylWTeYToTn"
}
```
{% include copy-curl.html %}

OpenSearch responds with a model ID:

```json
{
  "task_id": "7X7pWI0Bpc3sThaJ4I8R",
  "status": "CREATED",
  "model_id": "h5AUWo0BkIylWTeYT4SU"
}
```

<!-- vale off -->
## Step 3: Register a flow agent that will run the PPLTool
<!-- vale on -->

A flow agent runs a sequence of tools in order and returns the last tool's output. To create a flow agent, send the following register agent request, providing the model ID in the `model_id` parameter. To run the generated query, set `execute` to `true`:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_PPL",
  "type": "flow",
  "description": "this is a test agent",
  "memory": {
    "type": "demo"
  },
  "tools": [
    {
      "type": "PPLTool",
      "name": "TransferQuestionToPPLAndExecuteTool",
      "description": "Use this tool to transfer natural language to generate PPL and execute PPL to query inside. Use this tool after you know the index name, otherwise, call IndexRoutingTool first. The input parameters are: {index:IndexName, question:UserQuestion}",
      "parameters": {
        "model_id": "h5AUWo0BkIylWTeYT4SU",
        "model_type": "FINETUNE",
        "execute": true
      }
    }
  ]
}
```
{% include copy-curl.html %}

For parameter descriptions, see [Register parameters](#register-parameters).

OpenSearch responds with an agent ID:

```json
{
  "agent_id": "9X7xWI0Bpc3sThaJdY9i"
}
```

## Step 4: Run the agent

Before you run the agent, make sure that you add the sample OpenSearch Dashboards `Sample web logs` dataset. To learn more, see [Adding sample data]({{site.url}}{{site.baseurl}}/dashboards/quickstart#adding-sample-data).

Then, run the agent by sending the following request:

```json
POST /_plugins/_ml/agents/9X7xWI0Bpc3sThaJdY9i/_execute
{
  "parameters": {
    "verbose": true,
    "question": "what is the error rate yesterday",
    "index": "opensearch_dashboards_sample_data_logs"
  }
}
```
{% include copy-curl.html %}

OpenSearch returns the PPL query and the query results:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": "{\"ppl\":\"source=opensearch_dashboards_sample_data_logs| where timestamp > DATE_SUB(NOW(), INTERVAL 1 DAY) AND timestamp < NOW() | eval is_error=IF(response='200', 0, 1.0) | stats AVG(is_error) as error_rate\",\"executionResult\":\"{\\n \\\"schema\\\": [\\n {\\n \\\"name\\\": \\\"error_rate\\\",\\n \\\"type\\\": \\\"double\\\"\\n }\\n ],\\n \\\"datarows\\\": [\\n [\\n null\\n ]\\n ],\\n \\\"total\\\": 1,\\n \\\"size\\\": 1\\n}\"}"
        }
      ]
    }
  ]
}
```

If you set `execute` to `false`, OpenSearch only returns the query but does not run it:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": "source=opensearch_dashboards_sample_data_logs| where timestamp > DATE_SUB(NOW(), INTERVAL 1 DAY) AND timestamp < NOW() | eval is_error=IF(response='200', 0, 1.0) | stats AVG(is_error) as error_rate"
        }
      ]
    }
  ]
}
```
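
You can then run the returned query yourself by using the PPL API:

```json
POST /_plugins/_ppl
{
  "query": "source=opensearch_dashboards_sample_data_logs | where timestamp > DATE_SUB(NOW(), INTERVAL 1 DAY) AND timestamp < NOW() | eval is_error=IF(response='200', 0, 1.0) | stats AVG(is_error) as error_rate"
}
```
{% include copy-curl.html %}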

## Register parameters

The following table lists all tool parameters that are available when registering an agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`model_id` | String | Required | The model ID of the large language model (LLM) to use for translating text into a PPL query.
`model_type` | String | Optional | The model type. Valid values are `CLAUDE` (Anthropic Claude model), `OPENAI` (OpenAI models), and `FINETUNE` (custom fine-tuned model).
`prompt` | String | Optional | The prompt to provide to the LLM.
`execute` | Boolean | Optional | Specifies whether to run the PPL query. Default is `true`.
`input` | Object | Optional | Contains two parameters that specify the index to search and the question for the LLM. For example, `"input": "{\"index\": \"${parameters.index}\", \"question\": ${parameters.question} }"`.
`head` | Integer | Optional | Limits the number of returned execution results if `execute` is set to `true`. Default is `-1` (no limit).

## Execute parameters

The following table lists all tool parameters that are available when running the agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`index` | String | Required | The index on which to run the PPL query.
`question` | String | Required | The natural language question to send to the LLM.
`verbose` | Boolean | Optional | Whether to provide verbose output. Default is `false`.

---
layout: default
title: RAG tool
has_children: false
has_toc: false
nav_order: 65
parent: Tools
grand_parent: Agents and tools
---

<!-- vale off -->
# RAG tool
**Introduced 2.12**
{: .label .label-purple }
<!-- vale on -->

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

The `RAGTool` performs retrieval-augmented generation (RAG). For more information about RAG, see [Conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/).

RAG calls a large language model (LLM) and supplements its knowledge by providing relevant OpenSearch documents along with the user question. To retrieve relevant documents from an OpenSearch index, you'll need a text embedding model that facilitates vector search.

The RAG tool supports the following search methods:

- [Neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/): Dense vector retrieval, which uses a text embedding model.
- [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/): Sparse vector retrieval, which uses a sparse encoding model.

## Before you start

To register and deploy a text embedding model and an LLM and ingest data into an index, perform Steps 1--5 of the [Agents and tools tutorial]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/agents-tools-tutorial/).

The following example uses neural search. To configure neural sparse search and deploy a sparse encoding model, see [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/).
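
As a sketch, a `RAGTool` configured for sparse retrieval might look like the following. The model IDs are placeholders, and the index and field names assume the setup described in [Neural Sparse Search tool]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/neural-sparse-tool/):

```json
{
  "type": "RAGTool",
  "description": "A tool that performs sparse retrieval over the company knowledge base",
  "parameters": {
    "embedding_model_id": "YOUR_SPARSE_ENCODING_MODEL_ID",
    "inference_model_id": "YOUR_LLM_MODEL_ID",
    "index": "index_for_neural_sparse",
    "embedding_field": "passage_embedding",
    "query_type": "neural_sparse",
    "source_field": ["passage_text"],
    "input": "${parameters.question}"
  }
}
```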

<!-- vale off -->
## Step 1: Register a flow agent that will run the RAGTool
<!-- vale on -->

A flow agent runs a sequence of tools in order and returns the last tool's output. To create a flow agent, send the following request, providing the text embedding model ID in the `embedding_model_id` parameter and the LLM model ID in the `inference_model_id` parameter:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_RagTool",
  "type": "flow",
  "description": "this is a test flow agent",
  "tools": [
    {
      "type": "RAGTool",
      "description": "A description of the tool",
      "parameters": {
        "embedding_model_id": "Hv_PY40Bk4MTqircAVmm",
        "inference_model_id": "SNzSY40B_1JGmyB0WbfI",
        "index": "my_test_data",
        "embedding_field": "embedding",
        "query_type": "neural",
        "source_field": [
          "text"
        ],
        "input": "${parameters.question}",
        "prompt": "\n\nHuman:You are a professional data analyst. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say don't know. \n\n Context:\n${parameters.output_field}\n\nHuman:${parameters.question}\n\nAssistant:"
      }
    }
  ]
}
```
{% include copy-curl.html %}

For parameter descriptions, see [Register parameters](#register-parameters).

OpenSearch responds with an agent ID:

```json
{
  "agent_id": "9X7xWI0Bpc3sThaJdY9i"
}
```

To create a conversational agent containing a `RAGTool`, see [Conversational agents]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/#conversational-agents).

## Step 2: Run the agent

Run the agent by sending the following request:

```json
POST /_plugins/_ml/agents/9X7xWI0Bpc3sThaJdY9i/_execute
{
  "parameters": {
    "question": "what's the population increase of Seattle from 2021 to 2023"
  }
}
```
{% include copy-curl.html %}

OpenSearch performs vector search and returns the relevant documents:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": """{"_index":"my_test_data","_source":{"text":"Chart and table of population level and growth rate for the Seattle metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\n
The current metro area population of Seattle in 2023 is 3,519,000, a 0.86% increase from 2022.\\n
The metro area population of Seattle in 2022 was 3,489,000, a 0.81% increase from 2021.\\n
The metro area population of Seattle in 2021 was 3,461,000, a 0.82% increase from 2020.\\n
The metro area population of Seattle in 2020 was 3,433,000, a 0.79% increase from 2019."},"_id":"6","_score":0.8173238}
{"_index":"my_test_data","_source":{"text":"Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\n
The current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\n
The metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\n
The metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\n
The metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."},"_id":"2","_score":0.6641471}
"""
        }
      ]
    }
  ]
}
```

## Register parameters

The following table lists all tool parameters that are available when registering an agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`embedding_model_id` | String | Required | The model ID of the model to use for generating vector embeddings.
`inference_model_id` | String | Required | The model ID of the LLM to use for inference.
`index` | String | Required | The index from which to retrieve relevant documents to pass to the LLM.
`embedding_field` | String | Required | When the model encodes raw text documents, the encoding result is saved in a field. Specify this field as the `embedding_field`. Neural search matches documents to the query by calculating the similarity score between the query text and the text in the document's `embedding_field`.
`source_field` | String | Required | The document field or fields to return. You can provide a list of multiple fields as an array of strings, for example, `["field1", "field2"]`.
`input` | String | Required for flow agent | Runtime input sourced from flow agent parameters. If using an LLM, this field is populated with the LLM response.
`output_field` | String | Optional | The name of the output field. Default is `response`.
`query_type` | String | Optional | Specifies the type of query to run to perform neural search. Valid values are `neural` (for dense retrieval) and `neural_sparse` (for sparse retrieval). Default is `neural`.
`doc_size` | Integer | Optional | The number of documents to fetch. Default is `2`.
`prompt` | String | Optional | The prompt to provide to the LLM.
`k` | Integer | Optional | The number of nearest neighbors to search for when performing neural search. Default is `10`.
`enable_Content_Generation` | Boolean | Optional | If `true`, returns results generated by an LLM. If `false`, returns results directly without LLM-assisted content generation. Default is `true`.

## Execute parameters

The following table lists all tool parameters that are available when running the agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`question` | String | Required | The natural language question to send to the LLM.

---
layout: default
title: Search Alerts tool
has_children: false
has_toc: false
nav_order: 67
parent: Tools
grand_parent: Agents and tools
---

<!-- vale off -->
# Search Alerts tool
**Introduced 2.12**
{: .label .label-purple }
<!-- vale on -->

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

The `SearchAlertsTool` retrieves information about generated alerts. For more information about alerts, see [Alerting]({{site.url}}{{site.baseurl}}/observing-your-data/alerting/index/).

## Step 1: Register a flow agent that will run the SearchAlertsTool

A flow agent runs a sequence of tools in order and returns the last tool's output. To create a flow agent, send the following register agent request:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_Search_Alerts_Tool",
  "type": "flow",
  "description": "this is a test agent for the SearchAlertsTool",
  "memory": {
    "type": "demo"
  },
  "tools": [
    {
      "type": "SearchAlertsTool",
      "name": "DemoSearchAlertsTool",
      "parameters": {}
    }
  ]
}
```
{% include copy-curl.html %}

For parameter descriptions, see [Register parameters](#register-parameters).

OpenSearch responds with an agent ID:

```json
{
  "agent_id": "EuJYYo0B9RaBCvhuy1q8"
}
```

## Step 2: Run the agent

Run the agent by sending the following request:

```json
POST /_plugins/_ml/agents/EuJYYo0B9RaBCvhuy1q8/_execute
{
  "parameters": {
    "question": "Do I have any alerts?"
  }
}
```
{% include copy-curl.html %}

OpenSearch responds with a list of generated alerts and the total number of alerts:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": "Alerts=[Alert(id=rv9nYo0Bk4MTqirc_DkW, version=394, schemaVersion=5, monitorId=ZuJnYo0B9RaBCvhuEVux, workflowId=, workflowName=, monitorName=test-monitor-2, monitorVersion=1, monitorUser=User[name=admin, backend_roles=[admin], roles=[own_index, all_access], custom_attribute_names=[], user_requested_tenant=null], triggerId=ZeJnYo0B9RaBCvhuEVul, triggerName=t-1, findingIds=[], relatedDocIds=[], state=ACTIVE, startTime=2024-02-01T02:03:18.420Z, endTime=null, lastNotificationTime=2024-02-01T08:36:18.409Z, acknowledgedTime=null, errorMessage=null, errorHistory=[], severity=1, actionExecutionResults=[], aggregationResultBucket=null, executionId=ZuJnYo0B9RaBCvhuEVux_2024-02-01T02:03:18.404853331_51c18f2c-5923-47c3-b476-0f5a66c6319b, associatedAlertIds=[])]TotalAlerts=1"
        }
      ]
    }
  ]
}
```

If no alerts are found, OpenSearch responds with an empty array in the results:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": "Alerts=[]TotalAlerts=0"
        }
      ]
    }
  ]
}
```

## Register parameters

The following table lists all tool parameters that are available when registering an agent. All parameters are optional.

Parameter | Type | Description
:--- | :--- | :---
`alertIds` | Array | A list of alert IDs to search for.
`monitorId` | String | The ID of the monitor by which to filter the alerts.
`workflowIds` | Array | A list of workflow IDs by which to filter the alerts.
`alertState` | String | The alert state by which to filter the alerts. Valid values are `ALL`, `ACTIVE`, `ERROR`, `COMPLETED`, and `ACKNOWLEDGED`. Default is `ALL`.
`severityLevel` | String | The severity level by which to filter the alerts. Valid values are `ALL`, `1`, `2`, and `3`. Default is `ALL`.
`searchString` | String | The search string to use for searching for a specific alert.
`sortOrder` | String | The sort order of the results. Valid values are `asc` (ascending) and `desc` (descending). Default is `asc`.
`sortString` | String | Specifies the monitor field by which to sort the results. Default is `monitor_name.keyword`.
`size` | Integer | The number of results to return. Default is `20`.
`startIndex` | Integer | The paginated index of the alert to start from. Default is `0`.
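
For example, the following sketch registers the tool so that it returns only active, severity 1 alerts; the parameter values follow the preceding table:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_Active_Alerts",
  "type": "flow",
  "tools": [
    {
      "type": "SearchAlertsTool",
      "name": "DemoSearchActiveAlertsTool",
      "parameters": {
        "alertState": "ACTIVE",
        "severityLevel": "1"
      }
    }
  ]
}
```
{% include copy-curl.html %}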

## Execute parameters

The following table lists all tool parameters that are available when running the agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`question` | String | Required | The natural language question to send to the LLM.
@ -0,0 +1,112 @@
|
|||
---
layout: default
title: Search Anomaly Detectors tool
has_children: false
has_toc: false
nav_order: 70
parent: Tools
grand_parent: Agents and tools
---

<!-- vale off -->
# Search Anomaly Detectors tool
**Introduced 2.12**
{: .label .label-purple }
<!-- vale on -->

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

The `SearchAnomalyDetectorsTool` retrieves information about anomaly detectors set up on your cluster. For more information about anomaly detectors, see [Anomaly detection]({{site.url}}{{site.baseurl}}/observing-your-data/ad/index/).

## Step 1: Register a flow agent that will run the SearchAnomalyDetectorsTool

A flow agent runs a sequence of tools in order and returns the last tool's output. To create a flow agent, send the following register agent request:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_Search_Anomaly_Detectors_Tool",
  "type": "flow",
  "description": "this is a test agent for the SearchAnomalyDetectorsTool",
  "memory": {
    "type": "demo"
  },
  "tools": [
    {
      "type": "SearchAnomalyDetectorsTool",
      "name": "DemoSearchAnomalyDetectorsTool",
      "parameters": {}
    }
  ]
}
```
{% include copy-curl.html %}

For parameter descriptions, see [Register parameters](#register-parameters).

OpenSearch responds with an agent ID:

```json
{
  "agent_id": "EuJYYo0B9RaBCvhuy1q8"
}
```

## Step 2: Run the agent

Run the agent by sending the following request:

```json
POST /_plugins/_ml/agents/EuJYYo0B9RaBCvhuy1q8/_execute
{
  "parameters": {
    "question": "Do I have any anomaly detectors?"
  }
}
```
{% include copy-curl.html %}

OpenSearch responds with a list of anomaly detectors set up on your cluster and the total number of anomaly detectors:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": "AnomalyDetectors=[{id=y2M-Yo0B-yCFzT-N_XXU,name=sample-http-responses-detector,type=SINGLE_ENTITY,description=A sample detector to detect anomalies with HTTP response code logs.,index=[sample-http-responses],lastUpdateTime=1706750311891}]TotalAnomalyDetectors=1"
        }
      ]
    }
  ]
}
```

## Register parameters

The following table lists all tool parameters that are available when registering an agent. All parameters are optional.

Parameter | Type | Description
:--- | :--- | :---
`detectorName` | String | The name of the detector to search for.
`detectorNamePattern` | String | A wildcard query used to match the detector name to search for.
`indices` | String | The index name or index pattern of the indexes that the returned detectors are using as data sources.
`highCardinality` | Boolean | Whether to return information about high-cardinality detectors. Leave this parameter unset (or set it to `null`) to return information about both high-cardinality (multi-entity) and non-high-cardinality (single-entity) detectors. Set this parameter to `true` to only return information about high-cardinality detectors. Set this parameter to `false` to only return information about non-high-cardinality detectors.
`lastUpdateTime` | Long | Specifies the earliest last updated time of the detectors to return, in epoch milliseconds. Default is `null`.
`sortOrder` | String | The sort order for the results. Valid values are `asc` (ascending) and `desc` (descending). Default is `desc`.
`sortString` | String | Specifies the detector field by which to sort the results. Default is `name.keyword`.
`size` | Integer | The number of results to return. Default is `20`.
`startIndex` | Integer | The paginated index of the detector to start from. Default is `0`.
`running` | Boolean | Whether to return information about detectors that are currently running. Leave this parameter unset (or set it to `null`) to return both running and non-running detector information. Set this parameter to `true` to only return information about running detectors. Set this parameter to `false` to return only information about detectors that are not currently running. Default is `null`.
`disabled` | Boolean | Whether to return information about detectors that are currently disabled. Leave this parameter unset (or set it to `null`) to return information about both enabled and disabled detectors. Set this parameter to `true` to return only information about disabled detectors. Set this parameter to `false` to return only information about enabled detectors. Default is `null`.
`failed` | Boolean | Whether to return information about detectors that are currently failing. Leave this parameter unset (or set it to `null`) to return information about both failed and non-failed detectors. Set this parameter to `true` to return only information about failed detectors. Set this parameter to `false` to return only information about non-failed detectors. Default is `null`.
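
For example, to have the tool return only currently running single-entity detectors, set the corresponding Boolean parameters at registration time. The following request is an illustrative sketch; the agent and tool names are placeholders:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_Search_Anomaly_Detectors_Tool_Filtered",
  "type": "flow",
  "tools": [
    {
      "type": "SearchAnomalyDetectorsTool",
      "name": "DemoSearchRunningDetectorsTool",
      "parameters": {
        "running": true,
        "highCardinality": false
      }
    }
  ]
}
```
{% include copy-curl.html %}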

## Execute parameters

The following table lists all tool parameters that are available when running the agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`question` | String | Required | The natural language question to send to the LLM.
@ -0,0 +1,126 @@
---
layout: default
title: Search Anomaly Results tool
has_children: false
has_toc: false
nav_order: 80
parent: Tools
grand_parent: Agents and tools
---

<!-- vale off -->
# Search Anomaly Results tool
**Introduced 2.12**
{: .label .label-purple }
<!-- vale on -->

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

The `SearchAnomalyResultsTool` retrieves information about anomaly detector results. For more information about anomaly detectors, see [Anomaly detection]({{site.url}}{{site.baseurl}}/observing-your-data/ad/index/).

## Step 1: Register a flow agent that will run the SearchAnomalyResultsTool

A flow agent runs a sequence of tools in order and returns the last tool's output. To create a flow agent, send the following register agent request:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_Search_Anomaly_Results_Tool",
  "type": "flow",
  "description": "this is a test agent for the SearchAnomalyResultsTool",
  "memory": {
    "type": "demo"
  },
  "tools": [
    {
      "type": "SearchAnomalyResultsTool",
      "name": "DemoSearchAnomalyResultsTool",
      "parameters": {}
    }
  ]
}
```
{% include copy-curl.html %}

For parameter descriptions, see [Register parameters](#register-parameters).

OpenSearch responds with an agent ID:

```json
{
  "agent_id": "HuJZYo0B9RaBCvhuUlpy"
}
```

## Step 2: Run the agent

Run the agent by sending the following request:

```json
POST /_plugins/_ml/agents/HuJZYo0B9RaBCvhuUlpy/_execute
{
  "parameters": {
    "question": "Do I have any anomalies?"
  }
}
```
{% include copy-curl.html %}
OpenSearch responds with a list of individual anomaly detector results (where each result contains the detector ID, the anomaly grade, and the confidence level) and the total number of anomaly results found:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": "AnomalyResults=[{detectorId=ef9lYo0Bk4MTqircmjnm,grade=1.0,confidence=0.9403051246569198}{detectorId=E-JlYo0B9RaBCvhunFtw,grade=1.0,confidence=0.9163498216870274}]TotalAnomalyResults=2"
        }
      ]
    }
  ]
}
```

If no anomalies are found, OpenSearch responds with an empty array in the results:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": "AnomalyResults=[]TotalAnomalyResults=0"
        }
      ]
    }
  ]
}
```

## Register parameters

The following table lists all tool parameters that are available when registering an agent. All parameters are optional.

Parameter | Type | Description
:--- | :--- | :---
`detectorId` | String | The ID of the detector from which to return results.
`realTime` | Boolean | Whether to return real-time anomaly detector results. Set this parameter to `false` to return only historical analysis results.
`anomalyGradeThreshold` | Float | The minimum anomaly grade for the returned anomaly detector results. Anomaly grade is a number between 0 and 1 that indicates how anomalous a data point is.
`dataStartTime` | Long | The earliest time for which to return anomaly detector results, in epoch milliseconds.
`dataEndTime` | Long | The latest time for which to return anomaly detector results, in epoch milliseconds.
`sortOrder` | String | The sort order for the results. Valid values are `asc` (ascending) and `desc` (descending). Default is `desc`.
`sortString` | String | Specifies the detector field by which to sort the results. Default is `data_start_time`.
`size` | Integer | The number of results to return. Default is `20`.
`startIndex` | Integer | The paginated index of the result to start from. Default is `0`.
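
For example, to have the tool return only real-time results with an anomaly grade of at least 0.7, register it with the corresponding parameters. The following request is an illustrative sketch; the agent and tool names are placeholders:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_Search_Anomaly_Results_Tool_Filtered",
  "type": "flow",
  "tools": [
    {
      "type": "SearchAnomalyResultsTool",
      "name": "DemoSearchHighGradeResultsTool",
      "parameters": {
        "realTime": true,
        "anomalyGradeThreshold": 0.7
      }
    }
  ]
}
```
{% include copy-curl.html %}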

## Execute parameters

The following table lists all tool parameters that are available when running the agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`question` | String | Required | The natural language question to send to the LLM.
@ -0,0 +1,123 @@
---
layout: default
title: Search Index tool
has_children: false
has_toc: false
nav_order: 90
parent: Tools
grand_parent: Agents and tools
---

<!-- vale off -->
# Search Index tool
**Introduced 2.12**
{: .label .label-purple }
<!-- vale on -->

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

The `SearchIndexTool` searches an index using a query written in query domain-specific language (DSL) and returns the query results.

## Step 1: Register a flow agent that will run the SearchIndexTool

A flow agent runs a sequence of tools in order and returns the last tool's output. To create a flow agent, send the following register agent request:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_Search_Index_Tool",
  "type": "flow",
  "description": "this is a test for search index tool",
  "memory": {
    "type": "demo"
  },
  "tools": [
    {
      "type": "SearchIndexTool"
    }
  ]
}
```
{% include copy-curl.html %}

OpenSearch responds with an agent ID:

```json
{
  "agent_id": "9X7xWI0Bpc3sThaJdY9i"
}
```

## Step 2: Run the agent

Before you run the agent, make sure that you add the sample OpenSearch Dashboards `Sample eCommerce orders` dataset. To learn more, see [Adding sample data]({{site.url}}{{site.baseurl}}/dashboards/quickstart#adding-sample-data).

Then, run the agent by sending the following request. The `SearchIndexTool` takes one parameter named `input`. This parameter includes the index name and the query:

```json
POST /_plugins/_ml/agents/9X7xWI0Bpc3sThaJdY9i/_execute
{
  "parameters": {
    "input": "{\"index\": \"opensearch_dashboards_sample_data_ecommerce\", \"query\": {\"size\": 20, \"_source\": \"email\"}}"
  }
}
```
{% include copy-curl.html %}

For parameter descriptions, see [Execute parameters](#execute-parameters).

The query passed in the previous request is equivalent to the following query:

```json
GET opensearch_dashboards_sample_data_ecommerce/_search
{
  "size": 20,
  "_source": "email"
}
```

OpenSearch returns the query results:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": """{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"eddie@underwood-family.zzz"},"_id":"_bJVWY0BAehlDanXJnAJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"mary@bailey-family.zzz"},"_id":"_rJVWY0BAehlDanXJnAJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"gwen@butler-family.zzz"},"_id":"_7JVWY0BAehlDanXJnAJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"diane@chandler-family.zzz"},"_id":"ALJVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"eddie@weber-family.zzz"},"_id":"AbJVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"diane@goodwin-family.zzz"},"_id":"ArJVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"oliver@rios-family.zzz"},"_id":"A7JVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"abd@sutton-family.zzz"},"_id":"BLJVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"wilhemina st.@tran-family.zzz"},"_id":"BbJVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"rabbia al@baker-family.zzz"},"_id":"BrJVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"rabbia al@romero-family.zzz"},"_id":"B7JVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"eddie@gregory-family.zzz"},"_id":"CLJVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"sultan al@pratt-family.zzz"},"_id":"CbJVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"eddie@wolfe-family.zzz"},"_id":"CrJVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"sultan al@thompson-family.zzz"},"_id":"C7JVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"sultan al@boone-family.zzz"},"_id":"DLJVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"george@hubbard-family.zzz"},"_id":"DbJVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"boris@maldonado-family.zzz"},"_id":"DrJVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"yahya@rivera-family.zzz"},"_id":"D7JVWY0BAehlDanXJnEJ","_score":1.0}
{"_index":"opensearch_dashboards_sample_data_ecommerce","_source":{"email":"brigitte@morris-family.zzz"},"_id":"ELJVWY0BAehlDanXJnEJ","_score":1.0}
"""
        }
      ]
    }
  ]
}
```

## Execute parameters

The following table lists all tool parameters that are available when running the agent.

Parameter | Type | Description
:--- | :--- | :---
`input` | String | The index name and the query to use for search, in JSON format. The `index` parameter contains the name of the index and the `query` parameter contains the query formatted in Query DSL. For example, `"{\"index\": \"opensearch_dashboards_sample_data_ecommerce\", \"query\": {\"size\": 22, \"_source\": \"category\"}}"`. The `input` parameter and the `index` and `query` parameters it contains are required.
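
Because `input` accepts any Query DSL query as an escaped JSON string, you can pass more than the simple query shown previously. As an illustrative sketch, the following request runs a `match` query against the same sample index (the `customer_first_name` field assumes the `Sample eCommerce orders` dataset):

```json
POST /_plugins/_ml/agents/9X7xWI0Bpc3sThaJdY9i/_execute
{
  "parameters": {
    "input": "{\"index\": \"opensearch_dashboards_sample_data_ecommerce\", \"query\": {\"query\": {\"match\": {\"customer_first_name\": \"Eddie\"}}, \"_source\": \"email\"}}"
  }
}
```
{% include copy-curl.html %}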
@ -0,0 +1,127 @@
---
layout: default
title: Vector DB tool
has_children: false
has_toc: false
nav_order: 110
parent: Tools
grand_parent: Agents and tools
---

<!-- vale off -->
# Vector DB tool
**Introduced 2.12**
{: .label .label-purple }
<!-- vale on -->

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

The `VectorDBTool` performs dense vector retrieval. For more information about OpenSearch vector database capabilities, see [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/).

## Step 1: Register and deploy a text embedding model

OpenSearch supports several pretrained models. You can use one of those models, use your own custom model, or create a connector for an externally hosted model. For a list of supported pretrained models, see [OpenSearch-provided pretrained models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/). For more information about custom models, see [Custom local models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/). For information about integrating an externally hosted model, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/).

In this example, you'll use the `huggingface/sentence-transformers/all-MiniLM-L12-v2` pretrained model for both ingestion and search. To register and deploy the model to OpenSearch, send the following request:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "huggingface/sentence-transformers/all-MiniLM-L12-v2",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
```
{% include copy-curl.html %}

OpenSearch responds with a task ID for the model registration and deployment task:

```json
{
  "task_id": "M_9KY40Bk4MTqirc5lP8",
  "status": "CREATED"
}
```

You can monitor the status of the task by calling the Tasks API:

```json
GET _plugins/_ml/tasks/M_9KY40Bk4MTqirc5lP8
```
{% include copy-curl.html %}

Once the model is registered and deployed, the task `state` changes to `COMPLETED` and OpenSearch returns a model ID for the model:

```json
{
  "model_id": "Hv_PY40Bk4MTqircAVmm",
  "task_type": "REGISTER_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "COMPLETED",
  "worker_node": [
    "UyQSTQ3nTFa3IP6IdFKoug"
  ],
  "create_time": 1706767869692,
  "last_update_time": 1706767935556,
  "is_async": true
}
```

## Step 2: Ingest data into an index

First, you'll set up an ingest pipeline to encode documents using the text embedding model set up in the previous step:

```json
PUT /_ingest/pipeline/test-pipeline-local-model
{
  "description": "text embedding pipeline",
  "processors": [
    {
      "text_embedding": {
        "model_id": "Hv_PY40Bk4MTqircAVmm",
        "field_map": {
          "text": "embedding"
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}

Next, create a k-NN index specifying the pipeline as the default pipeline:

```json
PUT my_test_data
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text"
      },
      "embedding": {
        "type": "knn_vector",
        "dimension": 384
      }
    }
  },
  "settings": {
    "index": {
      "knn.space_type": "cosinesimil",
      "default_pipeline": "test-pipeline-local-model",
      "knn": "true"
    }
  }
}
```
{% include copy-curl.html %}

Last, ingest data into the index by sending a bulk request:

```json
POST _bulk
{"index": {"_index": "my_test_data", "_id": "1"}}
{"text": "Chart and table of population level and growth rate for the Ogden-Layton metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\nThe current metro area population of Ogden-Layton in 2023 is 750,000, a 1.63% increase from 2022.\nThe metro area population of Ogden-Layton in 2022 was 738,000, a 1.79% increase from 2021.\nThe metro area population of Ogden-Layton in 2021 was 725,000, a 1.97% increase from 2020.\nThe metro area population of Ogden-Layton in 2020 was 711,000, a 2.16% increase from 2019."}
{"index": {"_index": "my_test_data", "_id": "2"}}
{"text": "Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."}
{"index": {"_index": "my_test_data", "_id": "3"}}
{"text": "Chart and table of population level and growth rate for the Chicago metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Chicago in 2023 is 8,937,000, a 0.4% increase from 2022.\\nThe metro area population of Chicago in 2022 was 8,901,000, a 0.27% increase from 2021.\\nThe metro area population of Chicago in 2021 was 8,877,000, a 0.14% increase from 2020.\\nThe metro area population of Chicago in 2020 was 8,865,000, a 0.03% increase from 2019."}
{"index": {"_index": "my_test_data", "_id": "4"}}
{"text": "Chart and table of population level and growth rate for the Miami metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Miami in 2023 is 6,265,000, a 0.8% increase from 2022.\\nThe metro area population of Miami in 2022 was 6,215,000, a 0.78% increase from 2021.\\nThe metro area population of Miami in 2021 was 6,167,000, a 0.74% increase from 2020.\\nThe metro area population of Miami in 2020 was 6,122,000, a 0.71% increase from 2019."}
{"index": {"_index": "my_test_data", "_id": "5"}}
{"text": "Chart and table of population level and growth rate for the Austin metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Austin in 2023 is 2,228,000, a 2.39% increase from 2022.\\nThe metro area population of Austin in 2022 was 2,176,000, a 2.79% increase from 2021.\\nThe metro area population of Austin in 2021 was 2,117,000, a 3.12% increase from 2020.\\nThe metro area population of Austin in 2020 was 2,053,000, a 3.43% increase from 2019."}
{"index": {"_index": "my_test_data", "_id": "6"}}
{"text": "Chart and table of population level and growth rate for the Seattle metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Seattle in 2023 is 3,519,000, a 0.86% increase from 2022.\\nThe metro area population of Seattle in 2022 was 3,489,000, a 0.81% increase from 2021.\\nThe metro area population of Seattle in 2021 was 3,461,000, a 0.82% increase from 2020.\\nThe metro area population of Seattle in 2020 was 3,433,000, a 0.79% increase from 2019."}
```
{% include copy-curl.html %}

## Step 3: Register a flow agent that will run the VectorDBTool

A flow agent runs a sequence of tools in order and returns the last tool's output. To create a flow agent, send the following request, providing the model ID for the model set up in Step 1. This model will encode your queries into vector embeddings:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_VectorDB",
  "type": "flow",
  "description": "this is a test agent",
  "tools": [
    {
      "type": "VectorDBTool",
      "parameters": {
        "model_id": "Hv_PY40Bk4MTqircAVmm",
        "index": "my_test_data",
        "embedding_field": "embedding",
        "source_field": ["text"],
        "input": "${parameters.question}"
      }
    }
  ]
}
```
{% include copy-curl.html %}

For parameter descriptions, see [Register parameters](#register-parameters).

OpenSearch responds with an agent ID:

```json
{
  "agent_id": "9X7xWI0Bpc3sThaJdY9i"
}
```

## Step 4: Run the agent

Run the agent by sending the following request:

```json
POST /_plugins/_ml/agents/9X7xWI0Bpc3sThaJdY9i/_execute
{
  "parameters": {
    "question": "what's the population increase of Seattle from 2021 to 2023"
  }
}
```
{% include copy-curl.html %}

OpenSearch performs vector search and returns the relevant documents:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": """{"_index":"my_test_data","_source":{"text":"Chart and table of population level and growth rate for the Seattle metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\n
The current metro area population of Seattle in 2023 is 3,519,000, a 0.86% increase from 2022.\\n
The metro area population of Seattle in 2022 was 3,489,000, a 0.81% increase from 2021.\\n
The metro area population of Seattle in 2021 was 3,461,000, a 0.82% increase from 2020.\\n
The metro area population of Seattle in 2020 was 3,433,000, a 0.79% increase from 2019."},"_id":"6","_score":0.8173238}
{"_index":"my_test_data","_source":{"text":"Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\n
The current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\n
The metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\n
The metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\n
The metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."},"_id":"2","_score":0.6641471}
"""
        }
      ]
    }
  ]
}
```

## Register parameters

The following table lists all tool parameters that are available when registering an agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`model_id` | String | Required | The model ID of the model to use at search time.
`index` | String | Required | The index to search.
`embedding_field` | String | Required | When the model encodes raw text documents, the encoding result is saved in a field. Specify this field as the `embedding_field`. Neural search matches documents to the query by calculating the similarity score between the query text and the text in the document's `embedding_field`.
`source_field` | String | Required | The document field or fields to return. You can provide a list of multiple fields as an array of strings, for example, `["field1", "field2"]`.
`input` | String | Required for flow agent | Runtime input sourced from flow agent parameters. If using a large language model (LLM), this field is populated with the LLM response.
`doc_size` | Integer | Optional | The number of documents to fetch. Default is `2`.
`k` | Integer | Optional | The number of nearest neighbors to search for when performing neural search. Default is `10`.
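
For example, to have the tool return five documents instead of the default two and widen the nearest-neighbor search, set `doc_size` and `k` when registering the agent. The following request is a sketch that reuses the model ID and index from this example:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_VectorDB_Tuned",
  "type": "flow",
  "description": "this is a test agent",
  "tools": [
    {
      "type": "VectorDBTool",
      "parameters": {
        "model_id": "Hv_PY40Bk4MTqircAVmm",
        "index": "my_test_data",
        "embedding_field": "embedding",
        "source_field": ["text"],
        "doc_size": 5,
        "k": 20,
        "input": "${parameters.question}"
      }
    }
  ]
}
```
{% include copy-curl.html %}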

## Execute parameters

The following table lists all tool parameters that are available when running the agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`question` | String | Required | The natural language question to send to the LLM.
@ -0,0 +1,106 @@
---
layout: default
title: Visualization tool
has_children: false
has_toc: false
nav_order: 120
parent: Tools
grand_parent: Agents and tools
---

# Visualization tool
**Introduced 2.12**
{: .label .label-purple }

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

Use the `VisualizationTool` to find visualizations relevant to a question.

## Step 1: Register a flow agent that will run the VisualizationTool

A flow agent runs a sequence of tools in order and returns the last tool's output. To create a flow agent, send the following register agent request:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_Visualization_tool",
  "type": "flow",
"description": "this is a test agent for the VisuailizationTool",
|
||||
"tools": [
|
||||
{
|
||||
"type": "VisualizationTool",
|
||||
"name": "DemoVisualizationTool",
|
||||
"parameters": {
|
||||
"index": ".kibana",
|
||||
"input": "${parameters.question}",
|
||||
"size": 3
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
For parameter descriptions, see [Register parameters](#register-parameters).
|
||||
|
||||
OpenSearch responds with an agent ID:
|
||||
|
||||
```json
|
||||
{
|
||||
"agent_id": "9X7xWI0Bpc3sThaJdY9i"
|
||||
}
|
||||
```
|
||||
|
||||
## Step 2: Run the agent
|
||||
|
||||
Before you run the agent, make sure that you add the sample OpenSearch Dashboards `Sample eCommerce orders` dataset. To learn more, see [Adding sample data]({{site.url}}{{site.baseurl}}/dashboards/quickstart#adding-sample-data).
|
||||
|
||||
Then, run the agent by sending the following request:
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/agents/9X7xWI0Bpc3sThaJdY9i/_execute
|
||||
{
|
||||
"parameters": {
|
||||
"question": "what's the revenue for today?"
|
||||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
By default, OpenSearch returns the top three matching visualizations; you can use the `size` parameter to specify the number of results returned. The output is in CSV format and includes two columns: `Title` (the visualization title displayed in OpenSearch Dashboards) and `Id` (the unique ID of the visualization):

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": """Title,Id
[eCommerce] Total Revenue,10f1a240-b891-11e8-a6d9-e546fe2bba5f
"""
        }
      ]
    }
  ]
}
```

## Register parameters

The following table lists all tool parameters that are available when registering an agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`input` | String | Required | The user input used to match visualizations.
`index` | String | Optional | The index to search. Default is `.kibana` (the system index for OpenSearch Dashboards data).
`size` | Integer | Optional | The number of visualizations to return. Default is `3`.

## Execute parameters

The following table lists all tool parameters that are available when running the agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`question` | String | Required | The natural language question to send to the LLM.
@ -0,0 +1,47 @@
---
layout: default
title: Delete agent
parent: Agent APIs
grand_parent: ML Commons APIs
nav_order: 50
---

# Delete an agent
**Introduced 2.12**
{: .label .label-purple }

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

You can use this API to delete an agent based on the `agent_id`.

## Path and HTTP methods

```json
DELETE /_plugins/_ml/agents/<agent_id>
```

#### Example request

```json
DELETE /_plugins/_ml/agents/MzcIJX8BA7mbufL6DOwl
```
{% include copy-curl.html %}

#### Example response

```json
{
  "_index" : ".plugins-ml-agent",
  "_id" : "MzcIJX8BA7mbufL6DOwl",
  "_version" : 2,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 27,
  "_primary_term" : 18
}
```
@ -0,0 +1,68 @@
---
layout: default
title: Execute agent
parent: Agent APIs
grand_parent: ML Commons APIs
nav_order: 20
---

# Execute an agent
**Introduced 2.12**
{: .label .label-purple }

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

When an agent is executed, it runs the tools with which it is configured.

## Path and HTTP methods

```json
POST /_plugins/_ml/agents/<agent_id>/_execute
```

## Request fields

The following table lists the available request fields.

Field | Data type | Required/Optional | Description
:--- | :--- | :--- | :---
`parameters` | Object | Required | The parameters required by the agent.
`parameters.verbose` | Boolean | Optional | Provides verbose output.
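
For example, to request verbose output, include `verbose` in the `parameters` object. The following request is an illustrative sketch; the agent ID is a placeholder:

```json
POST /_plugins/_ml/agents/879v9YwBjWKCe6Kg12Tx/_execute
{
  "parameters": {
    "question": "what's the population increase of Seattle from 2021 to 2023",
    "verbose": true
  }
}
```
{% include copy-curl.html %}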

#### Example request

```json
POST /_plugins/_ml/agents/879v9YwBjWKCe6Kg12Tx/_execute
{
  "parameters": {
    "question": "what's the population increase of Seattle from 2021 to 2023"
  }
}
```
{% include copy-curl.html %}

#### Example response

```json
{
  "inference_results": [
    {
      "output": [
        {
          "result": """ Based on the given context, the key information is:

The metro area population of Seattle in 2021 was 3,461,000.
The metro area population of Seattle in 2023 is 3,519,000.

To calculate the population increase from 2021 to 2023:

Population in 2023 (3,519,000) - Population in 2021 (3,461,000) = 58,000

Therefore, the population increase of Seattle from 2021 to 2023 is 58,000."""
        }
      ]
    }
  ]
}
```
@ -0,0 +1,85 @@
---
layout: default
title: Get agent
parent: Agent APIs
grand_parent: ML Commons APIs
nav_order: 20
---

# Get an agent
**Introduced 2.12**
{: .label .label-purple }

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

You can retrieve agent information using the `agent_id`.

## Path and HTTP methods

```json
GET /_plugins/_ml/agents/<agent_id>
```

## Path parameters

The following table lists the available path parameters.

| Parameter | Data type | Description |
| :--- | :--- | :--- |
| `agent_id` | String | The agent ID of the agent to retrieve. |

#### Example request

```json
GET /_plugins/_ml/agents/N8AE1osB0jLkkocYjz7D
```
{% include copy-curl.html %}

#### Example response

```json
{
  "name": "Test_Agent_For_RAG",
  "type": "flow",
  "description": "this is a test agent",
  "tools": [
    {
      "type": "VectorDBTool",
      "parameters": {
        "input": "${parameters.question}",
        "source_field": """["text"]""",
        "embedding_field": "embedding",
        "index": "my_test_data",
        "model_id": "zBRyYIsBls05QaITo5ex"
      },
      "include_output_in_agent_response": false
    },
    {
      "type": "MLModelTool",
      "description": "A general tool to answer any question",
      "parameters": {
        "model_id": "ygAzT40Bdo8gePIqxk0H",
        "prompt": """

Human:You are a professional data analyst. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say don't know.

Context:
${parameters.VectorDBTool.output}

Human:${parameters.question}

Assistant:"""
      },
      "include_output_in_agent_response": false
    }
  ],
  "created_time": 1706821658743,
  "last_updated_time": 1706821658743
}
```

## Response fields

For response field descriptions, see [Register Agent API request fields]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/agent-apis/register-agent#request-fields).
@ -0,0 +1,25 @@
---
layout: default
title: Agent APIs
parent: ML Commons APIs
has_children: true
has_toc: false
nav_order: 27
redirect_from: /ml-commons-plugin/api/agent-apis/
---

# Agent APIs
**Introduced 2.12**
{: .label .label-purple }

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

You can automate machine learning (ML) tasks using agents and tools. An _agent_ orchestrates and runs ML models and tools. For more information, see [Agents and tools]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/).

ML Commons supports the following agent-level APIs:

- [Register agent]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/agent-apis/register-agent/)
- [Get agent]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/agent-apis/get-agent/)
- [Execute agent]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/agent-apis/execute-agent/)
- [Delete agent]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/agent-apis/delete-agent/)
@ -0,0 +1,196 @@
---
layout: default
title: Register agent
parent: Agent APIs
grand_parent: ML Commons APIs
nav_order: 10
---

# Register an agent
**Introduced 2.12**
{: .label .label-purple }

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

Use this API to register an agent.

Agents may be of the following types:

- Flow agent
- Conversational flow agent
- Conversational agent

For more information about agents, see [Agents and tools]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/).

## Path and HTTP methods

```json
POST /_plugins/_ml/agents/_register
```
{% include copy-curl.html %}

## Request fields

The following table lists the available request fields.

Field | Data type | Required/Optional | Agent type | Description
:--- | :--- | :--- | :--- | :---
`name` | String | Required | All | The agent name.
`type` | String | Required | All | The agent type. Valid values are `flow`, `conversational_flow`, and `conversational`. For more information, see [Agents]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/).
`description` | String | Optional | All | A description of the agent.
`tools` | Array | Optional | All | A list of tools for the agent to execute.
`app_type` | String | Optional | All | Specifies an optional agent category. You can then perform operations on all agents in the category. For example, you can delete all messages for RAG agents.
`memory.type` | String | Optional | `conversational_flow`, `conversational` | Specifies where to store the conversational memory. Currently, the only supported type is `conversation_index` (store the memory in a conversational system index).
`llm.model_id` | String | Required | `conversational` | The model ID of the LLM to which to send questions.
`llm.parameters.response_filter` | String | Required | `conversational` | The pattern for parsing the LLM response. For each LLM, you need to provide the field where the response is located. For example, for the Anthropic Claude model, the response is located in the `completion` field, so the pattern is `$.completion`. For OpenAI models, the pattern is `$.choices[0].message.content`.
`llm.parameters.max_iteration` | Integer | Optional | `conversational` | The maximum number of messages to send to the LLM. Default is `3`.

The `tools` array contains a list of tools for the agent. Each tool contains the following fields.

Field | Data type | Required/Optional | Description
:--- | :--- | :--- | :---
`name` | String | Optional | The tool name. The tool name defaults to the `type` parameter value. If you need to include multiple tools of the same type in an agent, specify different names for the tools, as shown in the example following this table.
`type` | String | Required | The tool type. For a list of supported tools, see [Tools]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/index/).
`parameters` | Object | Optional | The parameters for this tool. The parameters are highly dependent on the tool type. You can find information about specific tool types in [Tools]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/index/).
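
For example, an agent can include two `VectorDBTool` instances that search different indexes, each with a distinct name so that downstream tools can reference each tool's output separately. The following request is an illustrative sketch; the model ID, index names, and field names are placeholders:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_With_Two_VectorDB_Tools",
  "type": "flow",
  "description": "this is a test agent with two tools of the same type",
  "tools": [
    {
      "name": "population_data_tool",
      "type": "VectorDBTool",
      "parameters": {
        "model_id": "<your_embedding_model_id>",
        "index": "population_data",
        "embedding_field": "embedding",
        "source_field": ["text"],
        "input": "${parameters.question}"
      }
    },
    {
      "name": "weather_data_tool",
      "type": "VectorDBTool",
      "parameters": {
        "model_id": "<your_embedding_model_id>",
        "index": "weather_data",
        "embedding_field": "embedding",
        "source_field": ["text"],
        "input": "${parameters.question}"
      }
    }
  ]
}
```
{% include copy-curl.html %}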

#### Example request: Flow agent

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_RAG",
  "type": "flow",
  "description": "this is a test agent",
  "tools": [
    {
      "name": "vector_tool",
      "type": "VectorDBTool",
      "parameters": {
        "model_id": "zBRyYIsBls05QaITo5ex",
        "index": "my_test_data",
        "embedding_field": "embedding",
        "source_field": [
          "text"
        ],
        "input": "${parameters.question}"
      }
    },
    {
      "type": "MLModelTool",
      "description": "A general tool to answer any question",
      "parameters": {
        "model_id": "NWR9YIsBUysqmzBdifVJ",
        "prompt": "\n\nHuman:You are a professional data analyst. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say don't know. \n\n Context:\n${parameters.vector_tool.output}\n\nHuman:${parameters.question}\n\nAssistant:"
      }
    }
  ]
}
```
{% include copy-curl.html %}

#### Example request: Conversational flow agent

```json
POST /_plugins/_ml/agents/_register
{
  "name": "population data analysis agent",
  "type": "conversational_flow",
  "description": "This is a demo agent for population data analysis",
  "app_type": "rag",
  "memory": {
    "type": "conversation_index"
  },
  "tools": [
    {
      "type": "VectorDBTool",
      "name": "population_knowledge_base",
      "parameters": {
        "model_id": "your_text_embedding_model_id",
        "index": "test_population_data",
        "embedding_field": "population_description_embedding",
        "source_field": [
          "population_description"
        ],
        "input": "${parameters.question}"
      }
    },
    {
      "type": "MLModelTool",
      "name": "bedrock_claude_model",
      "description": "A general tool to answer any question",
      "parameters": {
        "model_id": "your_LLM_model_id",
        "prompt": """

Human:You are a professional data analyst. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say don't know.

Context:
${parameters.population_knowledge_base.output:-}

${parameters.chat_history:-}

Human:${parameters.question}

Assistant:"""
      }
    }
  ]
}
```
{% include copy-curl.html %}

#### Example request: Conversational agent

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_ReAct_ClaudeV2",
  "type": "conversational",
  "description": "this is a test agent",
  "app_type": "my chatbot",
  "llm": {
    "model_id": "<llm_model_id>",
    "parameters": {
      "max_iteration": 5,
      "stop_when_no_tool_found": true,
      "response_filter": "$.completion"
    }
  },
  "memory": {
    "type": "conversation_index"
  },
  "tools": [
    {
      "type": "VectorDBTool",
      "name": "VectorDBTool",
"description": "A tool to search opensearch index with natural language quesiotn. If you don't know answer for some question, you should always try to search data with this tool. Action Input: <natrual language question>",
|
||||
"parameters": {
|
||||
"model_id": "<embedding_model_id>",
|
||||
"index": "<your_knn_index>",
|
||||
"embedding_field": "<embedding_filed_name>",
|
||||
"source_field": [
|
||||
"<source_filed>"
|
||||
],
|
||||
"input": "${parameters.question}"
|
||||
      }
    },
    {
      "type": "CatIndexTool",
      "name": "RetrieveIndexMetaTool",
      "description": "Use this tool to get OpenSearch index information: (health, status, index, uuid, primary count, replica count, docs.count, docs.deleted, store.size, primary.store.size)."
    }
  ]
}
```
{% include copy-curl.html %}

#### Example response

OpenSearch responds with an agent ID that you can use to refer to the agent:

```json
{
  "agent_id": "bpV_Zo0BRhAwb9PZqGja"
}
```
@ -0,0 +1,142 @@
---
layout: default
title: Search agent
parent: Agent APIs
grand_parent: ML Commons APIs
nav_order: 30
---

# Search for an agent
**Introduced 2.12**
{: .label .label-purple }

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

Use this command to search for agents you've already created. You can provide any OpenSearch search query in the request body.

## Path and HTTP methods

```json
GET /_plugins/_ml/agents/_search
POST /_plugins/_ml/agents/_search
```

#### Example request: Searching for all agents

```json
POST /_plugins/_ml/agents/_search
{
  "query": {
    "match_all": {}
  },
  "size": 1000
}
```
{% include copy-curl.html %}

#### Example request: Searching for agents of a certain type

```json
POST /_plugins/_ml/agents/_search
{
  "query": {
    "term": {
      "type": {
        "value": "flow"
      }
    }
  }
}
```
{% include copy-curl.html %}

#### Example request: Searching for an agent by description

```json
GET _plugins/_ml/agents/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "description": "test agent"
          }
        }
      ]
    }
  },
  "size": 1000
}
```
{% include copy-curl.html %}

#### Example response

```json
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 6,
      "relation": "eq"
    },
    "max_score": 0.15019803,
    "hits": [
      {
        "_index": ".plugins-ml-agent",
        "_id": "8HXlkI0BfUsSoeNTP_0P",
        "_version": 1,
        "_seq_no": 17,
        "_primary_term": 2,
        "_score": 0.13904166,
        "_source": {
          "created_time": 1707532959502,
          "last_updated_time": 1707532959502,
          "name": "Test_Agent_For_RagTool",
          "description": "this is a test flow agent",
          "type": "flow",
          "tools": [
            {
              "description": "A description of the tool",
              "include_output_in_agent_response": false,
              "type": "RAGTool",
              "parameters": {
                "inference_model_id": "gnDIbI0BfUsSoeNT_jAw",
                "embedding_model_id": "Yg7HZo0B9ggZeh2gYjtu_2",
                "input": "${parameters.question}",
                "source_field": """["text"]""",
                "embedding_field": "embedding",
                "index": "my_test_data",
                "query_type": "neural",
                "prompt": """

Human:You are a professional data analyst. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say don't know.

Context:
${parameters.output_field}

Human:${parameters.question}

Assistant:"""
              }
            }
          ]
        }
      }
    ]
  }
}
```

## Response fields

For response field descriptions, see [Register Agent API request fields]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/agent-apis/register-agent#request-fields).
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Create connector
|
||||
parent: Connector APIs
|
||||
grand_parent: ML Commons API
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 10
|
||||
---
|
||||
|
||||
|
@ -16,6 +16,10 @@ Creates a standalone connector. For more information, see [Connectors]({{site.ur
|
|||
POST /_plugins/_ml/connectors/_create
|
||||
```
|
||||
|
||||
## Request fields
|
||||
|
||||
For a list of request fields, see [Blueprint configuration parameters]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/blueprints#configuration-parameters).
|
||||
|
||||
#### Example request
|
||||
|
||||
To create a standalone connector, send a request to the `connectors/_create` endpoint and provide all of the parameters described in [Connector blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/blueprints/):
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Delete connector
|
||||
parent: Connector APIs
|
||||
grand_parent: ML Commons API
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 30
|
||||
---
|
||||
|
||||
|
|
|
@ -2,21 +2,12 @@
|
|||
layout: default
|
||||
title: Get connector
|
||||
parent: Connector APIs
|
||||
grand_parent: ML Commons API
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 20
|
||||
---
|
||||
|
||||
# Get a connector
|
||||
|
||||
Use the `_search` endpoint to search for a connector.
|
||||
|
||||
To retrieve information about a connector, you can:
|
||||
|
||||
- [Get a connector by ID](#get-a-connector-by-id)
|
||||
- [Search for a connector](#search-for-a-connector)
|
||||
|
||||
## Get a connector by ID
|
||||
|
||||
This API retrieves a connector by its ID.
|
||||
|
||||
### Path and HTTP methods
|
||||
|
@ -62,160 +53,3 @@ GET /_plugins/_ml/connectors/N8AE1osB0jLkkocYjz7D
|
|||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Search for a connector
|
||||
|
||||
This API searches for matching connectors using a query.
|
||||
|
||||
### Path and HTTP methods
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/connectors/_search
|
||||
GET /_plugins/_ml/connectors/_search
|
||||
```
|
||||
|
||||
#### Example request
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/connectors/_search
|
||||
{
|
||||
"query": {
|
||||
"match_all": {}
|
||||
},
|
||||
"size": 1000
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"took" : 1,
|
||||
"timed_out" : false,
|
||||
"_shards" : {
|
||||
"total" : 1,
|
||||
"successful" : 1,
|
||||
"skipped" : 0,
|
||||
"failed" : 0
|
||||
},
|
||||
"hits" : {
|
||||
"total" : {
|
||||
"value" : 3,
|
||||
"relation" : "eq"
|
||||
},
|
||||
"max_score" : 1.0,
|
||||
"hits" : [
|
||||
{
|
||||
"_index" : ".plugins-ml-connector",
|
||||
"_id" : "7W-d74sBPD67W0wkEZdE",
|
||||
"_version" : 1,
|
||||
"_seq_no" : 2,
|
||||
"_primary_term" : 1,
|
||||
"_score" : 1.0,
|
||||
"_source" : {
|
||||
"protocol" : "aws_sigv4",
|
||||
"name" : "BedRock claude Connector",
|
||||
"description" : "The connector to BedRock service for claude model",
|
||||
"version" : "1",
|
||||
"parameters" : {
|
||||
"endpoint" : "bedrock.us-east-1.amazonaws.com",
|
||||
"content_type" : "application/json",
|
||||
"auth" : "Sig_V4",
|
||||
"max_tokens_to_sample" : "8000",
|
||||
"service_name" : "bedrock",
|
||||
"temperature" : "1.0E-4",
|
||||
"response_filter" : "$.completion",
|
||||
"region" : "us-east-1",
|
||||
"anthropic_version" : "bedrock-2023-05-31"
|
||||
},
|
||||
"actions" : [
|
||||
{
|
||||
"headers" : {
|
||||
"x-amz-content-sha256" : "required",
|
||||
"content-type" : "application/json"
|
||||
},
|
||||
"method" : "POST",
|
||||
"request_body" : "{\"prompt\":\"${parameters.prompt}\", \"max_tokens_to_sample\":${parameters.max_tokens_to_sample}, \"temperature\":${parameters.temperature}, \"anthropic_version\":\"${parameters.anthropic_version}\" }",
|
||||
"action_type" : "PREDICT",
|
||||
"url" : "https://bedrock.us-east-1.amazonaws.com/model/anthropic.claude-v2/invoke"
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : ".plugins-ml-connector",
|
||||
"_id" : "9W-d74sBPD67W0wk4pf_",
|
||||
"_version" : 1,
|
||||
"_seq_no" : 3,
|
||||
"_primary_term" : 1,
|
||||
"_score" : 1.0,
|
||||
"_source" : {
|
||||
"protocol" : "aws_sigv4",
|
||||
"name" : "BedRock claude Connector",
|
||||
"description" : "The connector to BedRock service for claude model",
|
||||
"version" : "1",
|
||||
"parameters" : {
|
||||
"endpoint" : "bedrock.us-east-1.amazonaws.com",
|
||||
"content_type" : "application/json",
|
||||
"auth" : "Sig_V4",
|
||||
"max_tokens_to_sample" : "8000",
|
||||
"service_name" : "bedrock",
|
||||
"temperature" : "1.0E-4",
|
||||
"response_filter" : "$.completion",
|
||||
"region" : "us-east-1",
|
||||
"anthropic_version" : "bedrock-2023-05-31"
|
||||
},
|
||||
"actions" : [
|
||||
{
|
||||
"headers" : {
|
||||
"x-amz-content-sha256" : "required",
|
||||
"content-type" : "application/json"
|
||||
},
|
||||
"method" : "POST",
|
||||
"request_body" : "{\"prompt\":\"${parameters.prompt}\", \"max_tokens_to_sample\":${parameters.max_tokens_to_sample}, \"temperature\":${parameters.temperature}, \"anthropic_version\":\"${parameters.anthropic_version}\" }",
|
||||
"action_type" : "PREDICT",
|
||||
"url" : "https://bedrock.us-east-1.amazonaws.com/model/anthropic.claude-v2/invoke"
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : ".plugins-ml-connector",
|
||||
"_id" : "rm_u8osBPD67W0wkCpsG",
|
||||
"_version" : 1,
|
||||
"_seq_no" : 4,
|
||||
"_primary_term" : 1,
|
||||
"_score" : 1.0,
|
||||
"_source" : {
|
||||
"protocol" : "aws_sigv4",
|
||||
"name" : "BedRock Claude-Instant v1",
|
||||
"description" : "Bedrock connector for Claude Instant testing",
|
||||
"version" : "1",
|
||||
"parameters" : {
|
||||
"endpoint" : "bedrock.us-east-1.amazonaws.com",
|
||||
"content_type" : "application/json",
|
||||
"auth" : "Sig_V4",
|
||||
"service_name" : "bedrock",
|
||||
"region" : "us-east-1",
|
||||
"anthropic_version" : "bedrock-2023-05-31"
|
||||
},
|
||||
"actions" : [
|
||||
{
|
||||
"headers" : {
|
||||
"x-amz-content-sha256" : "required",
|
||||
"content-type" : "application/json"
|
||||
},
|
||||
"method" : "POST",
|
||||
"request_body" : "{\"prompt\":\"${parameters.prompt}\", \"max_tokens_to_sample\":${parameters.max_tokens_to_sample}, \"temperature\":${parameters.temperature}, \"anthropic_version\":\"${parameters.anthropic_version}\" }",
|
||||
"action_type" : "PREDICT",
|
||||
"url" : "https://bedrock-runtime.us-east-1.amazonaws.com/model/anthropic.claude-instant-v1/invoke"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
|
|
|
@ -1,8 +1,9 @@
|
|||
---
|
||||
layout: default
|
||||
title: Connector APIs
|
||||
parent: ML Commons API
|
||||
parent: ML Commons APIs
|
||||
has_children: true
|
||||
has_toc: false
|
||||
nav_order: 25
|
||||
---
|
||||
|
||||
|
|
|
@ -0,0 +1,164 @@
|
|||
---
|
||||
layout: default
|
||||
title: Search connector
|
||||
parent: Connector APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 25
|
||||
---
|
||||
|
||||
# Search for a connector
|
||||
|
||||
Use the `_search` endpoint to search for a connector. This API uses a query to search for matching connectors.
|
||||
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/connectors/_search
|
||||
GET /_plugins/_ml/connectors/_search
|
||||
```
|
||||
|
||||
#### Example request
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/connectors/_search
|
||||
{
|
||||
"query": {
|
||||
"match_all": {}
|
||||
},
|
||||
"size": 1000
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"took" : 1,
|
||||
"timed_out" : false,
|
||||
"_shards" : {
|
||||
"total" : 1,
|
||||
"successful" : 1,
|
||||
"skipped" : 0,
|
||||
"failed" : 0
|
||||
},
|
||||
"hits" : {
|
||||
"total" : {
|
||||
"value" : 3,
|
||||
"relation" : "eq"
|
||||
},
|
||||
"max_score" : 1.0,
|
||||
"hits" : [
|
||||
{
|
||||
"_index" : ".plugins-ml-connector",
|
||||
"_id" : "7W-d74sBPD67W0wkEZdE",
|
||||
"_version" : 1,
|
||||
"_seq_no" : 2,
|
||||
"_primary_term" : 1,
|
||||
"_score" : 1.0,
|
||||
"_source" : {
|
||||
"protocol" : "aws_sigv4",
|
||||
"name" : "BedRock claude Connector",
|
||||
"description" : "The connector to BedRock service for claude model",
|
||||
"version" : "1",
|
||||
"parameters" : {
|
||||
"endpoint" : "bedrock.us-east-1.amazonaws.com",
|
||||
"content_type" : "application/json",
|
||||
"auth" : "Sig_V4",
|
||||
"max_tokens_to_sample" : "8000",
|
||||
"service_name" : "bedrock",
|
||||
"temperature" : "1.0E-4",
|
||||
"response_filter" : "$.completion",
|
||||
"region" : "us-east-1",
|
||||
"anthropic_version" : "bedrock-2023-05-31"
|
||||
},
|
||||
"actions" : [
|
||||
{
|
||||
"headers" : {
|
||||
"x-amz-content-sha256" : "required",
|
||||
"content-type" : "application/json"
|
||||
},
|
||||
"method" : "POST",
|
||||
"request_body" : "{\"prompt\":\"${parameters.prompt}\", \"max_tokens_to_sample\":${parameters.max_tokens_to_sample}, \"temperature\":${parameters.temperature}, \"anthropic_version\":\"${parameters.anthropic_version}\" }",
|
||||
"action_type" : "PREDICT",
|
||||
"url" : "https://bedrock.us-east-1.amazonaws.com/model/anthropic.claude-v2/invoke"
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : ".plugins-ml-connector",
|
||||
"_id" : "9W-d74sBPD67W0wk4pf_",
|
||||
"_version" : 1,
|
||||
"_seq_no" : 3,
|
||||
"_primary_term" : 1,
|
||||
"_score" : 1.0,
|
||||
"_source" : {
|
||||
"protocol" : "aws_sigv4",
|
||||
"name" : "BedRock claude Connector",
|
||||
"description" : "The connector to BedRock service for claude model",
|
||||
"version" : "1",
|
||||
"parameters" : {
|
||||
"endpoint" : "bedrock.us-east-1.amazonaws.com",
|
||||
"content_type" : "application/json",
|
||||
"auth" : "Sig_V4",
|
||||
"max_tokens_to_sample" : "8000",
|
||||
"service_name" : "bedrock",
|
||||
"temperature" : "1.0E-4",
|
||||
"response_filter" : "$.completion",
|
||||
"region" : "us-east-1",
|
||||
"anthropic_version" : "bedrock-2023-05-31"
|
||||
},
|
||||
"actions" : [
|
||||
{
|
||||
"headers" : {
|
||||
"x-amz-content-sha256" : "required",
|
||||
"content-type" : "application/json"
|
||||
},
|
||||
"method" : "POST",
|
||||
"request_body" : "{\"prompt\":\"${parameters.prompt}\", \"max_tokens_to_sample\":${parameters.max_tokens_to_sample}, \"temperature\":${parameters.temperature}, \"anthropic_version\":\"${parameters.anthropic_version}\" }",
|
||||
"action_type" : "PREDICT",
|
||||
"url" : "https://bedrock.us-east-1.amazonaws.com/model/anthropic.claude-v2/invoke"
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : ".plugins-ml-connector",
|
||||
"_id" : "rm_u8osBPD67W0wkCpsG",
|
||||
"_version" : 1,
|
||||
"_seq_no" : 4,
|
||||
"_primary_term" : 1,
|
||||
"_score" : 1.0,
|
||||
"_source" : {
|
||||
"protocol" : "aws_sigv4",
|
||||
"name" : "BedRock Claude-Instant v1",
|
||||
"description" : "Bedrock connector for Claude Instant testing",
|
||||
"version" : "1",
|
||||
"parameters" : {
|
||||
"endpoint" : "bedrock.us-east-1.amazonaws.com",
|
||||
"content_type" : "application/json",
|
||||
"auth" : "Sig_V4",
|
||||
"service_name" : "bedrock",
|
||||
"region" : "us-east-1",
|
||||
"anthropic_version" : "bedrock-2023-05-31"
|
||||
},
|
||||
"actions" : [
|
||||
{
|
||||
"headers" : {
|
||||
"x-amz-content-sha256" : "required",
|
||||
"content-type" : "application/json"
|
||||
},
|
||||
"method" : "POST",
|
||||
"request_body" : "{\"prompt\":\"${parameters.prompt}\", \"max_tokens_to_sample\":${parameters.max_tokens_to_sample}, \"temperature\":${parameters.temperature}, \"anthropic_version\":\"${parameters.anthropic_version}\" }",
|
||||
"action_type" : "PREDICT",
|
||||
"url" : "https://bedrock-runtime.us-east-1.amazonaws.com/model/anthropic.claude-instant-v1/invoke"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
|
@ -0,0 +1,70 @@
|
|||
---
|
||||
layout: default
|
||||
title: Update connector
|
||||
parent: Connector APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 30
|
||||
---
|
||||
|
||||
# Update a connector
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
Use this API to update a standalone connector based on the `connector_id`. To update a connector created within a specific model, use the [Update Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/update-model/).
|
||||
|
||||
Before updating a standalone connector, you must undeploy all models that use the connector. For information about undeploying a model, see [Undeploy Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/undeploy-model/).
|
||||
{: .note}
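
For example, you can undeploy a model that uses the connector before updating it. A minimal sketch, assuming `<model_id>` is the ID of a deployed model that uses this connector:

```json
POST /_plugins/_ml/models/<model_id>/_undeploy
```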
|
||||
|
||||
Using this API, you can update the connector fields listed in the [Request fields](#request-fields) section and add optional fields to your connector. You cannot delete fields from a connector using this API.
|
||||
|
||||
For information about user access for this API, see [Model access control considerations]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/#model-access-control-considerations).
|
||||
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
PUT /_plugins/_ml/connectors/<connector_id>
|
||||
```
|
||||
|
||||
## Request fields
|
||||
|
||||
The following table lists the updatable fields. For more information about all connector fields, see [Blueprint configuration parameters]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/blueprints#configuration-parameters).
|
||||
|
||||
| Field | Data type | Description |
|
||||
| :--- | :--- | :--- |
|
||||
| `name` | String | The name of the connector. |
|
||||
| `description` | String | A description of the connector. |
|
||||
| `version` | Integer | The version of the connector. |
|
||||
| `protocol` | String | The protocol for the connection. For AWS services, such as Amazon SageMaker and Amazon Bedrock, use `aws_sigv4`. For all other services, use `http`. |
|
||||
| `parameters` | JSON object | The default connector parameters, including `endpoint` and `model`. Any parameters included in this field can be overridden by parameters specified in a predict request (see the example following this table). |
|
||||
| `credential` | JSON object | Defines any credential variables required in order to connect to your chosen endpoint. ML Commons uses **AES/GCM/NoPadding** symmetric encryption to encrypt your credentials. When the connection to the cluster first starts, OpenSearch creates a random 32-byte encryption key that persists in OpenSearch's system index. Therefore, you do not need to manually set the encryption key. |
|
||||
| `actions` | JSON array | Defines which actions can run within the connector. If you're an administrator creating a connection, add the [blueprint]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/blueprints/) for your desired connection. |
|
||||
| `backend_roles` | JSON array | A list of OpenSearch backend roles. For more information about setting up backend roles, see [Assigning backend roles to users]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control#assigning-backend-roles-to-users). |
|
||||
| `access_mode` | String | Sets the access mode for the model, either `public`, `restricted`, or `private`. Default is `private`. For more information about `access_mode`, see [Model groups]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control#model-groups). |
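
For example, if the connector's `parameters` object defines a default `temperature`, you can override it at inference time by passing a different value in a Predict API call. The parameter name shown here is illustrative and depends on your connector blueprint:

```json
POST /_plugins/_ml/models/<model_id>/_predict
{
  "parameters": {
    "temperature": 0.5
  }
}
```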
|
||||
|
||||
#### Example request
|
||||
|
||||
```json
|
||||
PUT /_plugins/_ml/connectors/u3DEbI0BfUsSoeNTti-1
|
||||
{
|
||||
"description": "The connector to public OpenAI model service for GPT 3.5"
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"_index": ".plugins-ml-connector",
|
||||
"_id": "u3DEbI0BfUsSoeNTti-1",
|
||||
"_version": 2,
|
||||
"result": "updated",
|
||||
"_shards": {
|
||||
"total": 1,
|
||||
"successful": 1,
|
||||
"failed": 0
|
||||
},
|
||||
"_seq_no": 2,
|
||||
"_primary_term": 1
|
||||
}
|
||||
```
|
|
@ -0,0 +1,188 @@
|
|||
---
|
||||
layout: default
|
||||
title: Create controller
|
||||
parent: Controller APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 10
|
||||
---
|
||||
|
||||
# Create or update a controller
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
Use this API to create or update a controller for a model. A model may be shared by multiple users. A controller sets rate limits for the number of [Predict API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/) calls users can make on the model. A controller consists of a set of rate limiters for different users.
|
||||
|
||||
You can only create a controller for a model once you have registered the model and received a model ID.
|
||||
{: .tip}
|
||||
|
||||
The POST method creates a new controller. The PUT method updates an existing controller.
|
||||
|
||||
To learn how to set rate limits at the model level for all users, see [Update Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/update-model/). The rate limit is set to either the model-level limit or the user-level limit, whichever is more restrictive. For example, if the model-level limit is 2 requests per minute and the user-level limit is 4 requests per minute, the overall limit will be set to 2 requests per minute.
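
For example, a model-level limit of 2 requests per minute might be set as follows. This is a sketch based on the `rate_limiter` field described in the Update Model API documentation; replace `<model_id>` with your model ID:

```json
PUT /_plugins/_ml/models/<model_id>
{
  "rate_limiter": {
    "limit": "2",
    "unit": "MINUTES"
  }
}
```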
|
||||
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/controllers/<model_id>
|
||||
PUT /_plugins/_ml/controllers/<model_id>
|
||||
```
|
||||
|
||||
|
||||
## Path parameters
|
||||
|
||||
The following table lists the available path parameters.
|
||||
|
||||
Parameter | Data type | Description
|
||||
:--- | :--- | :---
|
||||
`model_id` | String | The model ID of the model for which you want to set rate limits. Required.
|
||||
|
||||
## Request fields
|
||||
|
||||
The following table lists the available request fields.
|
||||
|
||||
Field | Data type | Required/Optional | Description
|
||||
:--- | :--- | :--- | :---
|
||||
`user_rate_limiter` | Object | Required | Limits the number of times users can call the Predict API on the model. For more information, see [Rate limiting inference calls]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#rate-limiting-inference-calls).
|
||||
|
||||
The `user_rate_limiter` object contains an object for each user, specified by username. The user object contains the following fields.
|
||||
|
||||
Field | Data type | Description
|
||||
:--- | :--- | :---
|
||||
`limit` | Integer | The maximum number of times the user can call the Predict API on the model per `unit` of time. By default, there is no limit on the number of Predict API calls. Once you set a limit, you cannot reset it to no limit. As an alternative, you can specify a high limit value and a small time unit, for example, 1 request per nanosecond.
|
||||
`unit` | String | The unit of time for the rate limiter. Valid values are `DAYS`, `HOURS`, `MICROSECONDS`, `MILLISECONDS`, `MINUTES`, `NANOSECONDS`, and `SECONDS`.
|
||||
|
||||
|
||||
#### Example request: Create a controller
|
||||
|
||||
```json
|
||||
POST _plugins/_ml/controllers/mtw-ZI0B_1JGmyB068C0
|
||||
{
|
||||
"user_rate_limiter": {
|
||||
"user1": {
|
||||
"limit": 4,
|
||||
"unit": "MINUTES"
|
||||
},
|
||||
"user2": {
|
||||
"limit": 4,
|
||||
"unit": "MINUTES"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"model_id": "mtw-ZI0B_1JGmyB068C0",
|
||||
"status": "CREATED"
|
||||
}
|
||||
```
|
||||
|
||||
#### Example request: Update the rate limit for one user
|
||||
|
||||
To update the limit for `user1`, send a PUT request and specify the updated information:
|
||||
|
||||
```json
|
||||
PUT _plugins/_ml/controllers/mtw-ZI0B_1JGmyB068C0
|
||||
{
|
||||
"user_rate_limiter": {
|
||||
"user1": {
|
||||
"limit": 6,
|
||||
"unit": "MINUTES"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
This will update only the `user1` object, leaving all other user limits intact:
|
||||
|
||||
```json
|
||||
{
|
||||
"model_id": "mtw-ZI0B_1JGmyB068C0",
|
||||
"user_rate_limiter": {
|
||||
"user1": {
|
||||
"limit": "6",
|
||||
"unit": "MINUTES"
|
||||
},
|
||||
"user2": {
|
||||
"limit": "4",
|
||||
"unit": "MINUTES"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"_index": ".plugins-ml-controller",
|
||||
"_id": "mtw-ZI0B_1JGmyB068C0",
|
||||
"_version": 2,
|
||||
"result": "updated",
|
||||
"forced_refresh": true,
|
||||
"_shards": {
|
||||
"total": 2,
|
||||
"successful": 2,
|
||||
"failed": 0
|
||||
},
|
||||
"_seq_no": 1,
|
||||
"_primary_term": 1
|
||||
}
|
||||
```
|
||||
|
||||
#### Example request: Delete the rate limit for one user
|
||||
|
||||
To delete the limit for `user2`, send a POST request containing all other users' limits:
|
||||
|
||||
```json
|
||||
POST _plugins/_ml/controllers/mtw-ZI0B_1JGmyB068C0
|
||||
{
|
||||
"user_rate_limiter": {
|
||||
"user1": {
|
||||
"limit": 6,
|
||||
"unit": "MINUTES"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
This will overwrite the controller with the new information:
|
||||
|
||||
```json
|
||||
{
|
||||
"model_id": "mtw-ZI0B_1JGmyB068C0",
|
||||
"user_rate_limiter": {
|
||||
"user1": {
|
||||
"limit": "6",
|
||||
"unit": "MINUTES"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"_index": ".plugins-ml-controller",
|
||||
"_id": "mtw-ZI0B_1JGmyB068C0",
|
||||
"_version": 2,
|
||||
"result": "updated",
|
||||
"forced_refresh": true,
|
||||
"_shards": {
|
||||
"total": 2,
|
||||
"successful": 2,
|
||||
"failed": 0
|
||||
},
|
||||
"_seq_no": 1,
|
||||
"_primary_term": 1
|
||||
}
|
||||
```
|
||||
|
||||
## Required permissions
|
||||
|
||||
If you use the Security plugin, make sure you have the appropriate permissions: `cluster:admin/opensearch/ml/controllers/create` and `cluster:admin/opensearch/ml/controllers/update`.
|
|
@ -0,0 +1,56 @@
|
|||
---
|
||||
layout: default
|
||||
title: Delete controller
|
||||
parent: Controller APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 50
|
||||
---
|
||||
|
||||
# Delete a controller
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
Use this API to delete a controller for a model based on the `model_id`.
|
||||
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
DELETE /_plugins/_ml/controllers/<model_id>
|
||||
```
|
||||
|
||||
## Path parameters
|
||||
|
||||
The following table lists the available path parameters.
|
||||
|
||||
| Parameter | Data type | Description |
|
||||
| :--- | :--- | :--- |
|
||||
| `model_id` | String | The model ID of the model for which to delete the controller. |
|
||||
|
||||
#### Example request
|
||||
|
||||
```json
|
||||
DELETE /_plugins/_ml/controllers/MzcIJX8BA7mbufL6DOwl
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"_index" : ".plugins-ml-controller",
|
||||
"_id" : "MzcIJX8BA7mbufL6DOwl",
|
||||
"_version" : 2,
|
||||
"result" : "deleted",
|
||||
"_shards" : {
|
||||
"total" : 2,
|
||||
"successful" : 2,
|
||||
"failed" : 0
|
||||
},
|
||||
"_seq_no" : 27,
|
||||
"_primary_term" : 18
|
||||
}
|
||||
```
|
||||
|
||||
## Required permissions
|
||||
|
||||
If you use the Security plugin, make sure you have the appropriate permissions: `cluster:admin/opensearch/ml/controllers/delete`.
|
|
@ -0,0 +1,78 @@
|
|||
---
|
||||
layout: default
|
||||
title: Get controller
|
||||
parent: Controller APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 20
|
||||
---
|
||||
|
||||
# Get a controller
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
Use this API to retrieve information about a controller for a model by model ID.
|
||||
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/controllers/<model_id>
|
||||
```
|
||||
|
||||
## Path parameters
|
||||
|
||||
The following table lists the available path parameters.
|
||||
|
||||
| Parameter | Data type | Description |
|
||||
| :--- | :--- | :--- |
|
||||
| `model_id` | String | The model ID of the model for which to retrieve the controller. |
|
||||
|
||||
#### Example request
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/controllers/T_S-cY0BKCJ3ot9qr0aP
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"model_id": "T_S-cY0BKCJ3ot9qr0aP",
|
||||
"user_rate_limiter": {
|
||||
"user1": {
|
||||
"limit": "4",
|
||||
"unit": "MINUTES"
|
||||
},
|
||||
"user2": {
|
||||
"limit": "4",
|
||||
"unit": "MINUTES"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
If there is no controller defined for the model, OpenSearch returns an error:
|
||||
|
||||
```json
|
||||
{
|
||||
"error": {
|
||||
"root_cause": [
|
||||
{
|
||||
"type": "status_exception",
|
||||
"reason": "Failed to find model controller with the provided model ID: T_S-cY0BKCJ3ot9qr0aP"
|
||||
}
|
||||
],
|
||||
"type": "status_exception",
|
||||
"reason": "Failed to find model controller with the provided model ID: T_S-cY0BKCJ3ot9qr0aP"
|
||||
},
|
||||
"status": 404
|
||||
}
|
||||
```
|
||||
|
||||
## Response fields
|
||||
|
||||
For response field descriptions, see [Create Controller API request fields]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/controller-apis/create-controller#request-fields).
|
||||
|
||||
## Required permissions
|
||||
|
||||
If you use the Security plugin, make sure you have the appropriate permissions: `cluster:admin/opensearch/ml/controllers/get`.
|
|
@ -0,0 +1,25 @@
|
|||
---
|
||||
layout: default
|
||||
title: Controller APIs
|
||||
parent: ML Commons APIs
|
||||
has_children: true
|
||||
has_toc: false
|
||||
nav_order: 29
|
||||
redirect_from: /ml-commons-plugin/api/controller-apis/
|
||||
---
|
||||
|
||||
# Controller APIs
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
You can configure a rate limit for a specific user or users of a model by calling the Controller APIs.
|
||||
|
||||
ML Commons supports the following controller-level APIs:
|
||||
|
||||
- [Create or update controller]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/controller-apis/create-controller/)
|
||||
- [Get controller]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/controller-apis/get-controller/)
|
||||
- [Delete controller]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/controller-apis/delete-controller/)
|
||||
|
||||
## Required permissions
|
||||
|
||||
To call the Controller APIs, you must have the appropriate `cluster:admin/opensearch/ml/controllers/` permissions. The specific permission required by each Controller API is listed in the Required permissions section of its page, linked in the preceding section.
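
If you manage access through the Security plugin REST API, a role granting all of these permissions might look like the following. The role name is illustrative:

```json
PUT _plugins/_security/api/roles/ml_controller_access
{
  "cluster_permissions": [
    "cluster:admin/opensearch/ml/controllers/create",
    "cluster:admin/opensearch/ml/controllers/update",
    "cluster:admin/opensearch/ml/controllers/get",
    "cluster:admin/opensearch/ml/controllers/delete"
  ]
}
```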
|
|
@ -1,7 +1,7 @@
|
|||
---
|
||||
layout: default
|
||||
title: Execute algorithm
|
||||
parent: ML Commons API
|
||||
parent: ML Commons APIs
|
||||
nav_order: 30
|
||||
---
|
||||
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
---
|
||||
layout: default
|
||||
title: ML Commons API
|
||||
title: ML Commons APIs
|
||||
has_children: false
|
||||
nav_order: 130
|
||||
has_children: true
|
||||
|
@ -9,9 +9,9 @@ redirect_from:
|
|||
- /ml-commons-plugin/api/
|
||||
---
|
||||
|
||||
# ML Commons API
|
||||
# ML Commons APIs
|
||||
|
||||
ML Commons supports the following API types:
|
||||
ML Commons supports the following APIs:
|
||||
|
||||
- [Model APIs]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/)
|
||||
- [Model group APIs]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-group-apis/index/)
|
||||
|
|
|
@ -0,0 +1,61 @@
|
|||
---
|
||||
layout: default
|
||||
title: Create or update memory
|
||||
parent: Memory APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 10
|
||||
---
|
||||
|
||||
# Create or update a memory
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
Use this API to create or update a conversational memory for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/). A memory stores conversation history for the current conversation.
|
||||
|
||||
Once a memory is created, you'll provide its `memory_id` to other APIs.
|
||||
|
||||
The POST method creates a new memory. The PUT method updates an existing memory.
|
||||
|
||||
When the Security plugin is enabled, all memories exist in a `private` security mode. Only the user who created a memory can interact with that memory and its messages.
|
||||
{: .important}
|
||||
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/memory/
|
||||
PUT /_plugins/_ml/memory/<memory_id>
|
||||
```
|
||||
|
||||
## Path parameters
|
||||
|
||||
The following table lists the available path parameters.
|
||||
|
||||
Parameter | Data type | Description
|
||||
:--- | :--- | :---
|
||||
`memory_id` | String | The ID of the memory to be updated. Required for the PUT method.
|
||||
|
||||
## Request fields
|
||||
|
||||
The following table lists the available request fields.
|
||||
|
||||
Field | Data type | Required/Optional | Description
|
||||
:--- | :--- | :--- | :---
|
||||
`name` | String | Optional | The name of the memory.
|
||||
|
||||
#### Example request
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/memory/
|
||||
{
|
||||
"name": "Conversation for a RAG pipeline"
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"memory_id": "gW8Aa40BfUsSoeNTvOKI"
|
||||
}
|
||||
```
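
The PUT method works the same way. For example, to rename the memory created above (the memory ID is illustrative):

```json
PUT /_plugins/_ml/memory/gW8Aa40BfUsSoeNTvOKI
{
  "name": "Updated conversation name"
}
```
{% include copy-curl.html %}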
|
|
@ -0,0 +1,173 @@
|
|||
---
|
||||
layout: default
|
||||
title: Create or update message
|
||||
parent: Memory APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 40
|
||||
---
|
||||
|
||||
# Create or update a message
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
Use this API to create or update a message within a conversational memory for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/). A memory stores conversation history for the current conversation. A message represents one question/answer pair within a conversation.
|
||||
|
||||
Once a message is created, you'll provide its `message_id` to other APIs.
|
||||
|
||||
The POST method creates a new message. The PUT method updates an existing message.
|
||||
|
||||
You can only update the `additional_info` field of a message.
|
||||
{: .note}
|
||||
|
||||
When the Security plugin is enabled, all memories exist in a `private` security mode. Only the user who created a memory can interact with that memory and its messages.
|
||||
{: .important}
|
||||
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/memory/<memory_id>/messages
|
||||
PUT /_plugins/_ml/memory/message/<message_id>
|
||||
```
|
||||
|
||||
## Path parameters
|
||||
|
||||
The following table lists the available path parameters.
|
||||
|
||||
Parameter | Data type | Description
|
||||
:--- | :--- | :---
|
||||
`memory_id` | String | The ID of the memory to which to add the message. Required for the POST method.
|
||||
`message_id` | String | The ID of the message to be updated. Required for the PUT method.
|
||||
|
||||
## Request fields
|
||||
|
||||
The following table lists the available request fields.
|
||||
|
||||
Field | Data type | Required/Optional | Updatable | Description
|
||||
:--- | :--- | :--- | :--- | :---
|
||||
| `input` | String | Optional | No | The question (human input) in the message. |
|
||||
| `prompt_template` | String | Optional | No | The prompt template that was used for the message. The template may contain instructions or examples that were sent to the large language model. |
|
||||
| `response` | String | Optional | No | The answer (generative AI output) to the question. |
|
||||
| `origin` | String | Optional | No | The name of the AI or other system that generated the response. |
|
||||
| `additional_info` | Object | Optional | Yes | Any other information that was sent to the `origin`. |
|
||||
|
||||
#### Example request: Create a message
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/memory/SXA2cY0BfUsSoeNTz-8m/messages
|
||||
{
|
||||
"input": "How do I make an interaction?",
|
||||
"prompt_template": "Hello OpenAI, can you answer this question?",
|
||||
"response": "Hello, this is OpenAI. Here is the answer to your question.",
|
||||
"origin": "MyFirstOpenAIWrapper",
|
||||
"additional_info": {
|
||||
"suggestion": "api.openai.com"
|
||||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"memory_id": "WnA3cY0BfUsSoeNTI-_J"
|
||||
}
|
||||
```
|
||||
|
||||
#### Example request: Add a field to `additional_info`
|
||||
|
||||
```json
|
||||
PUT /_plugins/_ml/memory/message/WnA3cY0BfUsSoeNTI-_J
|
||||
{
|
||||
"additional_info": {
|
||||
"feedback": "positive"
|
||||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"_index": ".plugins-ml-memory-message",
|
||||
"_id": "WnA3cY0BfUsSoeNTI-_J",
|
||||
"_version": 2,
|
||||
"result": "updated",
|
||||
"forced_refresh": true,
|
||||
"_shards": {
|
||||
"total": 1,
|
||||
"successful": 1,
|
||||
"failed": 0
|
||||
},
|
||||
"_seq_no": 45,
|
||||
"_primary_term": 1
|
||||
}
|
||||
```
|
||||
|
||||
The updated message contains an additional `feedback` field:
|
||||
|
||||
```json
|
||||
{
|
||||
"memory_id": "SXA2cY0BfUsSoeNTz-8m",
|
||||
"message_id": "WnA3cY0BfUsSoeNTI-_J",
|
||||
"create_time": "2024-02-03T23:04:15.554370024Z",
|
||||
"input": "How do I make an interaction?",
|
||||
"prompt_template": "Hello OpenAI, can you answer this question?",
|
||||
"response": "Hello, this is OpenAI. Here is the answer to your question.",
|
||||
"origin": "MyFirstOpenAIWrapper",
|
||||
"additional_info": {
|
||||
"feedback": "positive",
|
||||
"suggestion": "api.openai.com"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### Example request: Change a field in `additional_info`
|
||||
|
||||
```json
|
||||
PUT /_plugins/_ml/memory/message/WnA3cY0BfUsSoeNTI-_J
|
||||
{
|
||||
"additional_info": {
|
||||
"feedback": "negative"
|
||||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"_index": ".plugins-ml-memory-message",
|
||||
"_id": "WnA3cY0BfUsSoeNTI-_J",
|
||||
"_version": 3,
|
||||
"result": "updated",
|
||||
"forced_refresh": true,
|
||||
"_shards": {
|
||||
"total": 1,
|
||||
"successful": 1,
|
||||
"failed": 0
|
||||
},
|
||||
"_seq_no": 46,
|
||||
"_primary_term": 1
|
||||
}
|
||||
```
|
||||
|
||||
The updated message contains the updated `feedback` field:
|
||||
|
||||
```json
|
||||
{
|
||||
"memory_id": "SXA2cY0BfUsSoeNTz-8m",
|
||||
"message_id": "WnA3cY0BfUsSoeNTI-_J",
|
||||
"create_time": "2024-02-03T23:04:15.554370024Z",
|
||||
"input": "How do I make an interaction?",
|
||||
"prompt_template": "Hello OpenAI, can you answer this question?",
|
||||
"response": "Hello, this is OpenAI. Here is the answer to your question.",
|
||||
"origin": "MyFirstOpenAIWrapper",
|
||||
"additional_info": {
|
||||
"feedback": "negative",
|
||||
"suggestion": "api.openai.com"
|
||||
}
|
||||
}
|
||||
```
|
|
@ -0,0 +1,45 @@
|
|||
---
|
||||
layout: default
|
||||
title: Delete memory
|
||||
parent: Memory APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 30
|
||||
---
|
||||
|
||||
# Delete a memory
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
Use this API to delete a memory based on the `memory_id`.
|
||||
|
||||
When the Security plugin is enabled, all memories exist in a `private` security mode. Only the user who created a memory can interact with that memory and its messages.
|
||||
{: .important}
|
||||
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
DELETE /_plugins/_ml/memory/<memory_id>
|
||||
```
|
||||
|
||||
## Path parameters
|
||||
|
||||
The following table lists the available path parameters.
|
||||
|
||||
Parameter | Data type | Description
|
||||
:--- | :--- | :---
|
||||
`memory_id` | String | The ID of the memory to be deleted.
|
||||
|
||||
#### Example request
|
||||
|
||||
```json
|
||||
DELETE /_plugins/_ml/memory/MzcIJX8BA7mbufL6DOwl
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"success": true
|
||||
}
|
||||
```
|
|
@ -0,0 +1,133 @@
|
|||
---
|
||||
layout: default
|
||||
title: Get memory
|
||||
parent: Memory APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 20
|
||||
---
|
||||
|
||||
# Get a memory
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
Use this API to retrieve a conversational memory for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/).
|
||||
|
||||
To retrieve memory information, you can:
|
||||
|
||||
- [Get a memory by ID](#get-a-memory-by-id).
|
||||
- [Get all memories](#get-all-memories).
|
||||
|
||||
To retrieve message information for a memory, you can:
|
||||
|
||||
- [Get all messages within a memory]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/get-message#get-all-messages-within-a-memory).
|
||||
- [Search for messages within a memory]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/search-message/).
|
||||
|
||||
When the Security plugin is enabled, all memories exist in a `private` security mode. Only the user who created a memory can interact with that memory and its messages.
|
||||
{: .important}
|
||||
|
||||
## Get a memory by ID
|
||||
|
||||
You can retrieve memory information by using the `memory_id`. The response contains the memory metadata only; to retrieve the messages stored in the memory, use the message APIs linked in the preceding section.
|
||||
|
||||
### Path and HTTP methods
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/memory/<memory_id>
|
||||
```
|
||||
### Path parameters
|
||||
|
||||
The following table lists the available path parameters.
|
||||
|
||||
Parameter | Data type | Description
|
||||
:--- | :--- | :---
|
||||
`memory_id` | String | The ID of the memory to retrieve.
|
||||
|
||||
#### Example request
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/memory/N8AE1osB0jLkkocYjz7D
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"memory_id": "gW8Aa40BfUsSoeNTvOKI",
|
||||
"create_time": "2024-02-02T18:07:06.887061463Z",
|
||||
"updated_time": "2024-02-02T19:01:32.121444968Z",
|
||||
"name": "Conversation for a RAG pipeline",
|
||||
"user": "admin"
|
||||
}
|
||||
```
|
||||
|
||||
## Get all memories
|
||||
|
||||
Use this command to get all memories.
|
||||
|
||||
### Path and HTTP methods
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/memory
|
||||
```
|
||||
|
||||
### Query parameters
|
||||
|
||||
Use the following query parameters to customize your results. All query parameters are optional.
|
||||
|
||||
Parameter | Data type | Description
|
||||
:--- | :--- | :---
|
||||
`max_results` | Integer | The maximum number of results to return. If there are fewer memories than the number set in `max_results`, the response returns only the number of memories that exist. Default is `10`.
|
||||
`next_token` | Integer | The index of the first memory in the sorted list of memories to return. Memories are ordered by `create_time`. For example, if memories A, B, and C exist, `next_token=1` returns memories B and C. Default is `0` (start at the beginning of the list).
|
||||
|
||||
### Paginating results
|
||||
|
||||
The `next_token` parameter provides the ordered position of the first memory to return within the sorted list of memories. When a memory is added between subsequent Get Memory calls, one of the listed memories will be duplicated in the results. For example, suppose the current ordered list of memories is `BCDEF`, where `B` is the memory created most recently. When you call the Get Memory API with `next_token=0` and `max_results=3`, the API returns `BCD`. Suppose you then create another memory A. The memory list now appears as `ABCDEF`. The next time you call the Get Memory API with `next_token=3` and `max_results=3`, you'll receive `DEF` in the results. Notice that `D` is returned in both the first and second batches of results. The following table illustrates the duplication.
|
||||
|
||||
Request | List of memories (returned memories are enclosed in brackets) | Results returned in the response
|
||||
:--- | :--- | :---
|
||||
Get Memory (next_token = 0, max_results = 3) | [BCD]EF | BCD
|
||||
Create Memory | ABCDEF | -
|
||||
Get Memory (next_token = 3, max_results = 3) | ABC[DEF] | DEF
|
||||
|
||||
|
||||
#### Example request: Get all memories
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/memory/
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example request: Paginating results
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/memory?max_results=2&next_token=1
|
||||
```
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"memories": [
|
||||
{
|
||||
"memory_id": "gW8Aa40BfUsSoeNTvOKI",
|
||||
"create_time": "2024-02-02T18:07:06.887061463Z",
|
||||
"updated_time": "2024-02-02T19:01:32.121444968Z",
|
||||
"name": "Conversation for a RAG pipeline",
|
||||
"user": "admin"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Response fields
|
||||
|
||||
The following table lists the available response fields.
|
||||
|
||||
| Field | Data type | Description |
|
||||
| :--- | :--- | :--- |
|
||||
| `memory_id` | String | The memory ID. |
|
||||
| `create_time` | String | The time at which the memory was created. |
|
||||
| `updated_time` | String | The time at which the memory was last updated. |
|
||||
| `name` | String | The memory name. |
|
||||
| `user` | String | The username of the user who created the memory. |
|
|
@ -0,0 +1,142 @@
|
|||
---
|
||||
layout: default
|
||||
title: Get message traces
|
||||
parent: Memory APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 70
|
||||
---
|
||||
|
||||
# Get message traces
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
Use this API to retrieve message trace information for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/). This can be useful for debugging.
|
||||
|
||||
For each message, an agent may need to run different tools. You can use the Get Traces API to retrieve all trace data for a message. The trace data includes the detailed steps of message execution.
|
||||
|
||||
When the Security plugin is enabled, all memories exist in a `private` security mode. Only the user who created a memory can interact with that memory and its messages.
|
||||
{: .important}
|
||||
|
||||
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/memory/message/<message_id>/traces
|
||||
```
|
||||
|
||||
## Path parameters
|
||||
|
||||
The following table lists the available path parameters.
|
||||
|
||||
Parameter | Data type | Description
|
||||
:--- | :--- | :---
|
||||
`message_id` | String | The ID of the message to trace.
|
||||
|
||||
#### Example request
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/memory/message/TAuCZY0BT2tRrkdmCPqZ/traces
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"traces": [
|
||||
{
|
||||
"memory_id": "7Qt4ZY0BT2tRrkdmSPlo",
|
||||
"message_id": "TQuCZY0BT2tRrkdmEvpp",
|
||||
"create_time": "2024-02-01T16:30:39.719968032Z",
|
||||
"input": "Which index has most documents",
|
||||
"prompt_template": null,
|
||||
"response": "Let me check the document counts of each index",
|
||||
"origin": null,
|
||||
"additional_info": {},
|
||||
"parent_message_id": "TAuCZY0BT2tRrkdmCPqZ",
|
||||
"trace_number": 1
|
||||
},
|
||||
{
|
||||
"memory_id": "7Qt4ZY0BT2tRrkdmSPlo",
|
||||
"message_id": "TguCZY0BT2tRrkdmEvp7",
|
||||
"create_time": "2024-02-01T16:30:39.732979687Z",
|
||||
"input": "",
|
||||
"prompt_template": null,
|
||||
"response": """health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
|
||||
green open .plugins-ml-model-group lHgGEgJhT_mpADyOZoXl2g 1 1 9 2 33.4kb 16.7kb
|
||||
green open .plugins-ml-memory-meta b2LEpv0QS8K60QBjXtRm6g 1 1 13 0 117.5kb 58.7kb
|
||||
green open .ql-datasources 9NXm_tMXQc6s_4uRToSNkQ 1 1 0 0 416b 208b
|
||||
green open sample-ecommerce UPYOQcAfRGqFAlSxcZlRjw 1 1 40320 0 4.1mb 2mb
|
||||
green open .plugins-ml-task xYTlprYCQnaaYici69SOjA 1 1 117 0 115.5kb 57.6kb
|
||||
green open .opendistro_security 7DAqhm9QQmeEsQYhA40cJg 1 1 10 0 117kb 58.5kb
|
||||
green open sample-host-health Na5tq6UiTt6r_qYME1vV-w 1 1 40320 0 2.6mb 1.3mb
|
||||
green open .opensearch-observability 6PthtLluSKyYCdZR3Mw0iw 1 1 0 0 416b 208b
|
||||
green open .plugins-ml-model WYcjBHcnRuSDHeVWPVupoA 1 1 191 45 4.2gb 2.1gb
|
||||
green open index_for_neural_sparse GQswGabQRIazM_trnqaDrw 1 1 5 0 28.4kb 14.2kb
|
||||
green open security-auditlog-2024.01.30 BhXR7Nd3QVOVGxJNpR0-jw 1 1 27768 0 13.8mb 7mb
|
||||
green open sample-http-responses 0gmYYYdOTiCbVUvl_uDL0w 1 1 40320 0 2.5mb 1.2mb
|
||||
green open security-auditlog-2024.02.01 2VD1ieDGS5m-TfjIdfT8Eg 1 1 36386 0 37mb 18.2mb
|
||||
green open opensearch_dashboards_sample_data_ecommerce wnE6r7OvSPqc5YHj8wHSLA 1 1 4675 0 8.8mb 4.4mb
|
||||
green open security-auditlog-2024.01.31 cNRK5-2eTwes0SRlXTl0RQ 1 1 34520 0 20.5mb 9.8mb
|
||||
green open .plugins-ml-memory-message wTNBU4BBQVSFcFhNlUdfBQ 1 1 88 1 399.7kb 205kb
|
||||
green open .plugins-flow-framework-state dJUNDv9MSJ2jjwKbzXPlrw 1 1 39 0 114.1kb 57kb
|
||||
green open .plugins-ml-agent 7X1IzoLuSGmIujOh9i5mmg 1 1 27 0 146.6kb 73.3kb
|
||||
green open .plugins-flow-framework-templates _ecC0KahTlmG_3tFUst7Uw 1 1 18 0 175.8kb 87.9kb
|
||||
green open .plugins-ml-connector q45iJfVjQ5KgxeNC65DLSw 1 1 11 0 313.1kb 156.5kb
|
||||
green open .kibana_1 vRjXK4bHSUueB_4iXiQ8yw 1 1 257 0 264kb 132kb
|
||||
green open .plugins-ml-config G7gxGQB7TZeQzBasHd5PUg 1 1 1 0 7.8kb 3.9kb
|
||||
green open .plugins-ml-controller NQTZPREZRhWoDdjCglRLFg 1 1 0 0 50.1kb 49.9kb
|
||||
green open opensearch_dashboards_sample_data_logs 9gpOTB3rRgqBLvqis_k5LQ 1 1 14074 0 18mb 9mb
|
||||
green open .plugins-flow-framework-config JlKPsCh6SEq-Jh6rPL_x9Q 1 1 1 0 7.8kb 3.9kb
|
||||
green open opensearch_dashboards_sample_data_flights pJde0irnTce4-uobHwYmMQ 1 1 13059 0 11.9mb 5.9mb
|
||||
green open my_test_data T4hwNs7CTJGIfw2QpCqQ_Q 1 1 6 0 91.7kb 45.8kb
|
||||
green open .opendistro-job-scheduler-lock XjgmXAVKQ4e8Y-ac54VBzg 1 1 3 0 38.7kb 19.4kb
|
||||
""",
|
||||
"origin": "CatIndexTool",
|
||||
"additional_info": {},
|
||||
"parent_message_id": "TAuCZY0BT2tRrkdmCPqZ",
|
||||
"trace_number": 2
|
||||
},
|
||||
{
|
||||
"memory_id": "7Qt4ZY0BT2tRrkdmSPlo",
|
||||
"message_id": "UwuCZY0BT2tRrkdmHPos",
|
||||
"create_time": "2024-02-01T16:30:42.217897656Z",
|
||||
"input": "Which index has most documents",
|
||||
"prompt_template": null,
|
||||
"response": "Based on the cluster health information provided, the index with the most documents is .plugins-ml-model with 191 documents",
|
||||
"origin": null,
|
||||
"additional_info": {},
|
||||
"parent_message_id": "TAuCZY0BT2tRrkdmCPqZ",
|
||||
"trace_number": 3
|
||||
},
|
||||
{
|
||||
"memory_id": "7Qt4ZY0BT2tRrkdmSPlo",
|
||||
"message_id": "UQuCZY0BT2tRrkdmHPos",
|
||||
"create_time": "2024-02-01T16:30:42.218120716Z",
|
||||
"input": "Which index has most documents",
|
||||
"prompt_template": null,
|
||||
"response": "The index with the most documents is the .plugins-ml-model index, which contains 191 documents based on the cluster health information provided.",
|
||||
"origin": null,
|
||||
"additional_info": {},
|
||||
"parent_message_id": "TAuCZY0BT2tRrkdmCPqZ",
|
||||
"trace_number": 4
|
||||
},
|
||||
{
|
||||
"memory_id": "7Qt4ZY0BT2tRrkdmSPlo",
|
||||
"message_id": "UguCZY0BT2tRrkdmHPos",
|
||||
"create_time": "2024-02-01T16:30:42.218240713Z",
|
||||
"input": "Which index has most documents",
|
||||
"prompt_template": null,
|
||||
"response": "The index with the most documents is the .plugins-ml-model index, which contains 191 documents based on the cluster health information provided.",
|
||||
"origin": null,
|
||||
"additional_info": {},
|
||||
"parent_message_id": "TAuCZY0BT2tRrkdmCPqZ",
|
||||
"trace_number": 5
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Response fields
|
||||
|
||||
For information about response fields, see [Create Message request fields]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/create-message#request-fields).
|
|
@ -0,0 +1,139 @@
|
|||
---
|
||||
layout: default
|
||||
title: Get message
|
||||
parent: Memory APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 50
|
||||
---
|
||||
|
||||
# Get a message
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
Use this API to retrieve message information for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/).
|
||||
|
||||
To retrieve message information, you can:
|
||||
|
||||
- [Get a message by ID](#get-a-message-by-id).
|
||||
- [Get all messages within a memory](#get-all-messages-within-a-memory).
|
||||
|
||||
When the Security plugin is enabled, all memories exist in a `private` security mode. Only the user who created a memory can interact with that memory and its messages.
|
||||
{: .important}
|
||||
|
||||
## Get a message by ID
|
||||
|
||||
You can retrieve message information by using the `message_id`.
|
||||
|
||||
### Path and HTTP methods
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/memory/message/<message_id>
|
||||
```
|
||||
|
||||
### Path parameters
|
||||
|
||||
The following table lists the available path parameters.
|
||||
|
||||
Parameter | Data type | Description
|
||||
:--- | :--- | :---
|
||||
`message_id` | String | The ID of the message to retrieve.
|
||||
|
||||
#### Example request
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/memory/message/0m8ya40BfUsSoeNTj-pU
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"memory_id": "gW8Aa40BfUsSoeNTvOKI",
|
||||
"message_id": "0m8ya40BfUsSoeNTj-pU",
|
||||
"create_time": "2024-02-02T19:01:32.113621539Z",
|
||||
"input": null,
|
||||
"prompt_template": null,
|
||||
"response": "Hello, this is OpenAI. Here is the answer to your question.",
|
||||
"origin": null,
|
||||
"additional_info": {
|
||||
"suggestion": "api.openai.com"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
For information about response fields, see [Create Message request fields]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/create-message#request-fields).
|
||||
|
||||
## Get all messages within a memory
|
||||
|
||||
Use this command to get a list of messages for a certain memory.
|
||||
|
||||
### Path and HTTP methods
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/memory/<memory_id>/messages
|
||||
```
|
||||
|
||||
### Path parameters
|
||||
|
||||
The following table lists the available path parameters.
|
||||
|
||||
Parameter | Data type | Description
|
||||
:--- | :--- | :---
|
||||
`memory_id` | String | The ID of the memory for which to retrieve messages.
|
||||
|
||||
#### Example request
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/memory/gW8Aa40BfUsSoeNTvOKI/messages
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"messages": [
|
||||
{
|
||||
"memory_id": "gW8Aa40BfUsSoeNTvOKI",
|
||||
"message_id": "BW8ha40BfUsSoeNT8-i3",
|
||||
"create_time": "2024-02-02T18:43:23.566994302Z",
|
||||
"input": "How do I make an interaction?",
|
||||
"prompt_template": "Hello OpenAI, can you answer this question?",
|
||||
"response": "Hello, this is OpenAI. Here is the answer to your question.",
|
||||
"origin": "MyFirstOpenAIWrapper",
|
||||
"additional_info": {
|
||||
"suggestion": "api.openai.com"
|
||||
}
|
||||
},
|
||||
{
|
||||
"memory_id": "gW8Aa40BfUsSoeNTvOKI",
|
||||
"message_id": "0m8ya40BfUsSoeNTj-pU",
|
||||
"create_time": "2024-02-02T19:01:32.113621539Z",
|
||||
"input": null,
|
||||
"prompt_template": null,
|
||||
"response": "Hello, this is OpenAI. Here is the answer to your question.",
|
||||
"origin": null,
|
||||
"additional_info": {
|
||||
"suggestion": "api.openai.com"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Response fields
|
||||
|
||||
For information about response fields, see [Create Message request fields]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/create-message#request-fields).
|
||||
|
|
@ -0,0 +1,29 @@
|
|||
---
|
||||
layout: default
|
||||
title: Memory APIs
|
||||
parent: ML Commons APIs
|
||||
has_children: true
|
||||
has_toc: false
|
||||
nav_order: 28
|
||||
redirect_from: /ml-commons-plugin/api/memory-apis/
|
||||
---
|
||||
|
||||
# Memory APIs
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
Memory APIs provide operations needed to implement [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/). A memory stores conversation history for the current conversation. A message represents one question/answer interaction between the user and a large language model. Messages are organized into memories.
|
||||
|
||||
ML Commons supports the following memory-level APIs:
|
||||
|
||||
- [Create or update memory]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/create-memory/)
|
||||
- [Get memory]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/get-memory/)
|
||||
- [Delete memory]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/delete-memory/)
|
||||
|
||||
ML Commons supports the following message-level APIs:
|
||||
|
||||
- [Create or update message]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/create-message/)
|
||||
- [Get message information]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/get-message/)
|
||||
|
||||
When the Security plugin is enabled, all memories exist in a `private` security mode. Only the user who created a memory can interact with that memory and its messages.
|
||||
{: .important}
|
|
@ -0,0 +1,133 @@
|
|||
---
|
||||
layout: default
|
||||
title: Search memory
|
||||
parent: Memory APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 25
|
||||
---
|
||||
|
||||
# Search for a memory
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
Use this API to search for conversational memories used in [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/). You can provide any OpenSearch search query in the request body.
|
||||
|
||||
When the Security plugin is enabled, all memories exist in a `private` security mode. Only the user who created a memory can interact with that memory and its messages.
|
||||
{: .important}
|
||||
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/memory/_search
|
||||
POST /_plugins/_ml/memory/_search
|
||||
```
|
||||
|
||||
#### Example request: Searching for all memories
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/memory/_search
|
||||
{
|
||||
"query": {
|
||||
"match_all": {}
|
||||
},
|
||||
"size": 1000
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example request: Searching for a memory by name
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/memory/_search
|
||||
{
|
||||
"query": {
|
||||
"term": {
|
||||
"name": {
|
||||
"value": "conversation"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"took": 1,
|
||||
"timed_out": false,
|
||||
"_shards": {
|
||||
"total": 1,
|
||||
"successful": 1,
|
||||
"skipped": 0,
|
||||
"failed": 0
|
||||
},
|
||||
"hits": {
|
||||
"total": {
|
||||
"value": 3,
|
||||
"relation": "eq"
|
||||
},
|
||||
"max_score": 0.2195382,
|
||||
"hits": [
|
||||
{
|
||||
"_index": ".plugins-ml-memory-meta",
|
||||
"_id": "znCqcI0BfUsSoeNTntd7",
|
||||
"_version": 3,
|
||||
"_seq_no": 39,
|
||||
"_primary_term": 1,
|
||||
"_score": 0.2195382,
|
||||
"_source": {
|
||||
"updated_time": "2024-02-03T20:36:10.252213029Z",
|
||||
"create_time": "2024-02-03T20:30:46.395829411Z",
|
||||
"application_type": null,
|
||||
"name": "Conversation about NYC population",
|
||||
"user": "admin"
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index": ".plugins-ml-memory-meta",
|
||||
"_id": "iXC4bI0BfUsSoeNTjS30",
|
||||
"_version": 4,
|
||||
"_seq_no": 11,
|
||||
"_primary_term": 1,
|
||||
"_score": 0.20763937,
|
||||
"_source": {
|
||||
"updated_time": "2024-02-03T02:59:39.862347093Z",
|
||||
"create_time": "2024-02-03T02:07:30.804554275Z",
|
||||
"application_type": null,
|
||||
"name": "Test conversation for RAG pipeline",
|
||||
"user": "admin"
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index": ".plugins-ml-memory-meta",
|
||||
"_id": "gW8Aa40BfUsSoeNTvOKI",
|
||||
"_version": 4,
|
||||
"_seq_no": 6,
|
||||
"_primary_term": 1,
|
||||
"_score": 0.19754036,
|
||||
"_source": {
|
||||
"updated_time": "2024-02-02T19:01:32.121444968Z",
|
||||
"create_time": "2024-02-02T18:07:06.887061463Z",
|
||||
"application_type": null,
|
||||
"name": "Conversation for a RAG pipeline",
|
||||
"user": "admin"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Response fields
|
||||
|
||||
The following table lists all response fields.
|
||||
|
||||
| Field | Data type | Description |
|
||||
| :--- | :--- | :--- |
|
||||
| `memory_id` | String | The memory ID. |
|
||||
| `create_time` | String | The time at which the memory was created. |
|
||||
| `updated_time` | String | The time at which the memory was last updated. |
|
||||
| `name` | String | The memory name. |
|
||||
| `user` | String | The username of the user who created the memory. |
|
|
@ -0,0 +1,94 @@
|
|||
---
|
||||
layout: default
|
||||
title: Search message
|
||||
parent: Memory APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 60
|
||||
---
|
||||
|
||||
# Search for a message
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
Retrieves message information for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/). You can send queries to the `_search` endpoint to search for matching messages within a memory.
|
||||
|
||||
When the Security plugin is enabled, all memories exist in a `private` security mode. Only the user who created a memory can interact with that memory and its messages.
|
||||
{: .important}
|
||||
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/memory/<memory_id>/_search
|
||||
GET /_plugins/_ml/memory/<memory_id>/_search
|
||||
```
|
||||
|
||||
### Path parameters
|
||||
|
||||
The following table lists the available path parameters.
|
||||
|
||||
Parameter | Data type | Description
|
||||
:--- | :--- | :---
|
||||
`memory_id` | String | The ID of the memory used to search for messages matching the query.
|
||||
|
||||
#### Example request
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/memory/gW8Aa40BfUsSoeNTvOKI/_search
|
||||
{
|
||||
"query": {
|
||||
"match": {
|
||||
"input": "interaction"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"took": 5,
|
||||
"timed_out": false,
|
||||
"_shards": {
|
||||
"total": 1,
|
||||
"successful": 1,
|
||||
"skipped": 0,
|
||||
"failed": 0
|
||||
},
|
||||
"hits": {
|
||||
"total": {
|
||||
"value": 1,
|
||||
"relation": "eq"
|
||||
},
|
||||
"max_score": 0.47000366,
|
||||
"hits": [
|
||||
{
|
||||
"_index": ".plugins-ml-memory-message",
|
||||
"_id": "BW8ha40BfUsSoeNT8-i3",
|
||||
"_version": 1,
|
||||
"_seq_no": 0,
|
||||
"_primary_term": 1,
|
||||
"_score": 0.47000366,
|
||||
"_source": {
|
||||
"input": "How do I make an interaction?",
|
||||
"memory_id": "gW8Aa40BfUsSoeNTvOKI",
|
||||
"trace_number": null,
|
||||
"create_time": "2024-02-02T18:43:23.566994302Z",
|
||||
"additional_info": {
|
||||
"suggestion": "api.openai.com"
|
||||
},
|
||||
"response": "Hello, this is OpenAI. Here is the answer to your question.",
|
||||
"origin": "MyFirstOpenAIWrapper",
|
||||
"parent_message_id": null,
|
||||
"prompt_template": "Hello OpenAI, can you answer this question?"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Response fields
|
||||
|
||||
For information about response fields, see [Create Message request fields]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/create-message#request-fields).
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Delete model
|
||||
parent: Model APIs
|
||||
grand_parent: ML Commons API
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 50
|
||||
---
|
||||
|
||||
|
|
|
@ -2,8 +2,8 @@
|
|||
layout: default
|
||||
title: Deploy model
|
||||
parent: Model APIs
|
||||
grand_parent: ML Commons API
|
||||
nav_order: 30
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 20
|
||||
---
|
||||
|
||||
# Deploy a model
|
||||
|
|
|
@ -2,29 +2,30 @@
|
|||
layout: default
|
||||
title: Get model
|
||||
parent: Model APIs
|
||||
grand_parent: ML Commons API
|
||||
nav_order: 20
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 30
|
||||
---
|
||||
|
||||
# Get a model
|
||||
|
||||
To retrieve information about a model, you can:
|
||||
|
||||
- [Get a model by ID](#get-a-model-by-id)
|
||||
- [Search for a model](#search-for-a-model)
|
||||
|
||||
## Get a model by ID
|
||||
|
||||
You can retrieve model information using the `model_id`.
|
||||
|
||||
For information about user access for this API, see [Model access control considerations]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/#model-access-control-considerations).
|
||||
|
||||
### Path and HTTP methods
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/models/<model-id>
|
||||
GET /_plugins/_ml/models/<model_id>
|
||||
```
|
||||
|
||||
## Path parameters
|
||||
|
||||
The following table lists the available path parameters.
|
||||
|
||||
| Parameter | Data type | Description |
|
||||
| :--- | :--- | :--- |
|
||||
| `model_id` | String | The model ID of the model to retrieve. |
|
||||
|
||||
#### Example request
|
||||
|
||||
```json
|
||||
|
@ -36,202 +37,22 @@ GET /_plugins/_ml/models/N8AE1osB0jLkkocYjz7D
|
|||
|
||||
```json
|
||||
{
|
||||
"name" : "all-MiniLM-L6-v2_onnx",
|
||||
"algorithm" : "TEXT_EMBEDDING",
|
||||
"version" : "1",
|
||||
"model_format" : "TORCH_SCRIPT",
|
||||
"model_state" : "LOADED",
|
||||
"model_content_size_in_bytes" : 83408741,
|
||||
"model_content_hash_value" : "9376c2ebd7c83f99ec2526323786c348d2382e6d86576f750c89ea544d6bbb14",
|
||||
"model_config" : {
|
||||
"model_type" : "bert",
|
||||
"embedding_dimension" : 384,
|
||||
"framework_type" : "SENTENCE_TRANSFORMERS",
|
||||
"all_config" : """{"_name_or_path":"nreimers/MiniLM-L6-H384-uncased","architectures":["BertModel"],"attention_probs_dropout_prob":0.1,"gradient_checkpointing":false,"hidden_act":"gelu","hidden_dropout_prob":0.1,"hidden_size":384,"initializer_range":0.02,"intermediate_size":1536,"layer_norm_eps":1e-12,"max_position_embeddings":512,"model_type":"bert","num_attention_heads":12,"num_hidden_layers":6,"pad_token_id":0,"position_embedding_type":"absolute","transformers_version":"4.8.2","type_vocab_size":2,"use_cache":true,"vocab_size":30522}"""
|
||||
},
|
||||
"created_time" : 1665961344044,
|
||||
"last_uploaded_time" : 1665961373000,
|
||||
"last_loaded_time" : 1665961815959,
|
||||
"total_chunks" : 9
|
||||
}
|
||||
```
|
||||
|
||||
## Search for a model
|
||||
|
||||
Use this command to search for models you've already created.
|
||||
|
||||
The response will contain only those model versions to which you have access. For example, if you send a match all query, model versions for the following model group types will be returned:
|
||||
|
||||
- All public model groups in the index.
|
||||
- Private model groups for which you are the model owner.
|
||||
- Model groups that share at least one backend role with your backend roles.
|
||||
|
||||
For more information, see [Model access control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/).
|
||||
|
||||
### Path and HTTP methods
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/models/_search
|
||||
POST /_plugins/_ml/models/_search
|
||||
```
|
||||
|
||||
#### Example request: Searching for all models
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/models/_search
|
||||
{
|
||||
"query": {
|
||||
"match_all": {}
|
||||
"name" : "all-MiniLM-L6-v2_onnx",
|
||||
"algorithm" : "TEXT_EMBEDDING",
|
||||
"version" : "1",
|
||||
"model_format" : "TORCH_SCRIPT",
|
||||
"model_state" : "LOADED",
|
||||
"model_content_size_in_bytes" : 83408741,
|
||||
"model_content_hash_value" : "9376c2ebd7c83f99ec2526323786c348d2382e6d86576f750c89ea544d6bbb14",
|
||||
"model_config" : {
|
||||
"model_type" : "bert",
|
||||
"embedding_dimension" : 384,
|
||||
"framework_type" : "SENTENCE_TRANSFORMERS",
|
||||
"all_config" : """{"_name_or_path":"nreimers/MiniLM-L6-H384-uncased","architectures":["BertModel"],"attention_probs_dropout_prob":0.1,"gradient_checkpointing":false,"hidden_act":"gelu","hidden_dropout_prob":0.1,"hidden_size":384,"initializer_range":0.02,"intermediate_size":1536,"layer_norm_eps":1e-12,"max_position_embeddings":512,"model_type":"bert","num_attention_heads":12,"num_hidden_layers":6,"pad_token_id":0,"position_embedding_type":"absolute","transformers_version":"4.8.2","type_vocab_size":2,"use_cache":true,"vocab_size":30522}"""
|
||||
},
|
||||
"size": 1000
|
||||
"created_time" : 1665961344044,
|
||||
"last_uploaded_time" : 1665961373000,
|
||||
"last_loaded_time" : 1665961815959,
|
||||
"total_chunks" : 9
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example request: Searching for models with algorithm "FIT_RCF"
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/models/_search
|
||||
{
|
||||
"query": {
|
||||
"term": {
|
||||
"algorithm": {
|
||||
"value": "FIT_RCF"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example: Excluding model chunks
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/models/_search
|
||||
{
|
||||
"query": {
|
||||
"bool": {
|
||||
"must_not": {
|
||||
"exists": {
|
||||
"field": "chunk_number"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"sort": [
|
||||
{
|
||||
"created_time": {
|
||||
"order": "desc"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example: Searching for all model chunks
|
||||
|
||||
The following query searches for all chunks of the model with the ID `979y9YwBjWKCe6KgNGTm` and sorts the chunks in ascending order:
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/models/_search
|
||||
{
|
||||
"query": {
|
||||
"bool": {
|
||||
"filter": [
|
||||
{
|
||||
"term": {
|
||||
"model_id": "9r9w9YwBjWKCe6KgyGST"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"sort": [
|
||||
{
|
||||
"chunk_number": {
|
||||
"order": "asc"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example: Searching for a model by description
|
||||
|
||||
```json
|
||||
GET _plugins/_ml/models/_search
|
||||
{
|
||||
"query": {
|
||||
"bool": {
|
||||
"should": [
|
||||
{
|
||||
"match": {
|
||||
"description": "sentence transformer"
|
||||
}
|
||||
}
|
||||
],
|
||||
"must_not": {
|
||||
"exists": {
|
||||
"field": "chunk_number"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"size": 1000
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"took" : 8,
|
||||
"timed_out" : false,
|
||||
"_shards" : {
|
||||
"total" : 1,
|
||||
"successful" : 1,
|
||||
"skipped" : 0,
|
||||
"failed" : 0
|
||||
},
|
||||
"hits" : {
|
||||
"total" : {
|
||||
"value" : 2,
|
||||
"relation" : "eq"
|
||||
},
|
||||
"max_score" : 2.4159138,
|
||||
"hits" : [
|
||||
{
|
||||
"_index" : ".plugins-ml-model",
|
||||
"_id" : "-QkKJX8BvytMh9aUeuLD",
|
||||
"_version" : 1,
|
||||
"_seq_no" : 12,
|
||||
"_primary_term" : 15,
|
||||
"_score" : 2.4159138,
|
||||
"_source" : {
|
||||
"name" : "FIT_RCF",
|
||||
"version" : 1,
|
||||
"content" : "xxx",
|
||||
"algorithm" : "FIT_RCF"
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : ".plugins-ml-model",
|
||||
"_id" : "OxkvHn8BNJ65KnIpck8x",
|
||||
"_version" : 1,
|
||||
"_seq_no" : 2,
|
||||
"_primary_term" : 8,
|
||||
"_score" : 2.4159138,
|
||||
"_source" : {
|
||||
"name" : "FIT_RCF",
|
||||
"version" : 1,
|
||||
"content" : "xxx",
|
||||
"algorithm" : "FIT_RCF"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
|
@ -1,9 +1,10 @@
|
|||
---
|
||||
layout: default
|
||||
title: Model APIs
|
||||
parent: ML Commons API
|
||||
parent: ML Commons APIs
|
||||
has_children: true
|
||||
nav_order: 10
|
||||
has_toc: false
|
||||
---
|
||||
|
||||
# Model APIs
|
||||
|
@ -11,10 +12,12 @@ nav_order: 10
|
|||
ML Commons supports the following model-level APIs:
|
||||
|
||||
- [Register model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/)
|
||||
- [Update model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/update-model/)
|
||||
- [Get model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/get-model/)
|
||||
- [Deploy model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/deploy-model/)
|
||||
- [Undeploy model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/undeploy-model/)
|
||||
- [Delete model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/delete-model/)
|
||||
- [Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/) (invokes a model)
|
||||
|
||||
## Model access control considerations
|
||||
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Register model
|
||||
parent: Model APIs
|
||||
grand_parent: ML Commons API
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 10
|
||||
---
|
||||
|
||||
|
@ -29,7 +29,14 @@ If the model is more than 10 MB in size, ML Commons splits it into smaller chunk
|
|||
```json
|
||||
POST /_plugins/_ml/models/_register
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
## Query parameters
|
||||
|
||||
The following table lists the available query parameters. All query parameters are optional.
|
||||
|
||||
| Parameter | Data type | Description |
|
||||
| :--- | :--- | :--- |
|
||||
| `deploy` | Boolean | Whether to deploy the model after registering it. The deploy operation is performed by calling the [Deploy Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/deploy-model/). Default is `false`. |
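
For example, the following minimal sketch registers an externally hosted model and deploys it in one call by setting `deploy=true`; the model name and connector ID are illustrative:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "openAI-gpt-3.5-turbo",
  "function_name": "remote",
  "description": "test model",
  "connector_id": "a1eMb4kBJ1eYAeTMAljY"
}
```
{% include copy-curl.html %}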
|
||||
|
||||
## Register an OpenSearch-provided pretrained model
|
||||
|
||||
|
@ -50,6 +57,7 @@ Field | Data type | Required/Optional | Description
|
|||
`model_format` | String | Required | The portable format of the model file. Valid values are `TORCH_SCRIPT` and `ONNX`. |
|
||||
`description` | String | Optional| The model description. |
|
||||
`model_group_id` | String | Optional | The model group ID of the model group to register this model to.
|
||||
`is_enabled`| Boolean | Optional | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
|
||||
|
||||
#### Example request: OpenSearch-provided text embedding model
|
||||
|
||||
|
@ -82,6 +90,7 @@ Field | Data type | Required/Optional | Description
|
|||
`url` | String | Required | The URL that contains the model. |
|
||||
`description` | String | Optional| The model description. |
|
||||
`model_group_id` | String | Optional | The model group ID of the model group to register this model to.
|
||||
`is_enabled`| Boolean | Optional | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
|
||||
|
||||
#### Example request: OpenSearch-provided sparse encoding model
|
||||
|
||||
|
@ -119,6 +128,7 @@ Field | Data type | Required/Optional | Description
|
|||
`url` | String | Required | The URL that contains the model. |
|
||||
`description` | String | Optional| The model description. |
|
||||
`model_group_id` | String | Optional | The model group ID of the model group to register this model to.
|
||||
`is_enabled`| Boolean | Optional | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
|
||||
|
||||
#### The `model_config` object
|
||||
|
||||
|
@ -176,6 +186,7 @@ Field | Data type | Required/Optional | Description
|
|||
`connector` | Object | Required | Contains specifications for a connector for a model hosted on a third-party platform. For more information, see [Creating a connector for a specific model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/connectors/#creating-a-connector-for-a-specific-model). You must provide either `connector_id` or `connector`.
|
||||
`description` | String | Optional| The model description. |
|
||||
`model_group_id` | String | Optional | The model group ID of the model group to register this model to.
|
||||
`is_enabled`| Boolean | Optional | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
|
||||
|
||||
#### Example request: Remote model with a standalone connector
|
||||
|
||||
|
|
|
@ -0,0 +1,187 @@
|
|||
---
|
||||
layout: default
|
||||
title: Search model
|
||||
parent: Model APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 35
|
||||
---
|
||||
|
||||
# Search for a model
|
||||
|
||||
You can use this command to search for models you've already created.
|
||||
|
||||
The response will contain only those model versions to which you have access. For example, if you send a `match_all` query, model versions for the following model group types will be returned:
|
||||
|
||||
- All public model groups in the index
|
||||
- Private model groups for which you are the model owner
|
||||
- Model groups that share at least one backend role with your backend roles
|
||||
|
||||
For information about user access for this API, see [Model access control considerations]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/#model-access-control-considerations).
|
||||
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/models/_search
|
||||
POST /_plugins/_ml/models/_search
|
||||
```
|
||||
|
||||
#### Example request: Searching for all models
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/models/_search
|
||||
{
|
||||
"query": {
|
||||
"match_all": {}
|
||||
},
|
||||
"size": 1000
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example request: Searching for models with the algorithm "FIT_RCF"
|
||||
|
||||
```json
|
||||
POST /_plugins/_ml/models/_search
|
||||
{
|
||||
"query": {
|
||||
"term": {
|
||||
"algorithm": {
|
||||
"value": "FIT_RCF"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example: Excluding model chunks
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/models/_search
|
||||
{
|
||||
"query": {
|
||||
"bool": {
|
||||
"must_not": {
|
||||
"exists": {
|
||||
"field": "chunk_number"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"sort": [
|
||||
{
|
||||
"created_time": {
|
||||
"order": "desc"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example: Searching for all model chunks
|
||||
|
||||
The following query searches for all chunks of the model with the ID `979y9YwBjWKCe6KgNGTm` and sorts the chunks in ascending order:
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/models/_search
|
||||
{
|
||||
"query": {
|
||||
"bool": {
|
||||
"filter": [
|
||||
{
|
||||
"term": {
|
||||
"model_id": "9r9w9YwBjWKCe6KgyGST"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"sort": [
|
||||
{
|
||||
"chunk_number": {
|
||||
"order": "asc"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example: Searching for a model by description
|
||||
|
||||
```json
|
||||
GET _plugins/_ml/models/_search
|
||||
{
|
||||
"query": {
|
||||
"bool": {
|
||||
"should": [
|
||||
{
|
||||
"match": {
|
||||
"description": "sentence transformer"
|
||||
}
|
||||
}
|
||||
],
|
||||
"must_not": {
|
||||
"exists": {
|
||||
"field": "chunk_number"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"size": 1000
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"took" : 8,
|
||||
"timed_out" : false,
|
||||
"_shards" : {
|
||||
"total" : 1,
|
||||
"successful" : 1,
|
||||
"skipped" : 0,
|
||||
"failed" : 0
|
||||
},
|
||||
"hits" : {
|
||||
"total" : {
|
||||
"value" : 2,
|
||||
"relation" : "eq"
|
||||
},
|
||||
"max_score" : 2.4159138,
|
||||
"hits" : [
|
||||
{
|
||||
"_index" : ".plugins-ml-model",
|
||||
"_id" : "-QkKJX8BvytMh9aUeuLD",
|
||||
"_version" : 1,
|
||||
"_seq_no" : 12,
|
||||
"_primary_term" : 15,
|
||||
"_score" : 2.4159138,
|
||||
"_source" : {
|
||||
"name" : "FIT_RCF",
|
||||
"version" : 1,
|
||||
"content" : "xxx",
|
||||
"algorithm" : "FIT_RCF"
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : ".plugins-ml-model",
|
||||
"_id" : "OxkvHn8BNJ65KnIpck8x",
|
||||
"_version" : 1,
|
||||
"_seq_no" : 2,
|
||||
"_primary_term" : 8,
|
||||
"_score" : 2.4159138,
|
||||
"_source" : {
|
||||
"name" : "FIT_RCF",
|
||||
"version" : 1,
|
||||
"content" : "xxx",
|
||||
"algorithm" : "FIT_RCF"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
|
@ -2,8 +2,8 @@
|
|||
layout: default
|
||||
title: Undeploy model
|
||||
parent: Model APIs
|
||||
grand_parent: ML Commons API
|
||||
nav_order: 40
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 45
|
||||
---
|
||||
|
||||
# Undeploy a model
|
||||
|
|
|
@ -0,0 +1,81 @@
|
|||
---
|
||||
layout: default
|
||||
title: Update model
|
||||
parent: Model APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 40
|
||||
---
|
||||
|
||||
# Update a model
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
Updates a model based on the `model_id`.
|
||||
|
||||
For information about user access for this API, see [Model access control considerations]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/#model-access-control-considerations).
|
||||
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
PUT /_plugins/_ml/models/<model_id>
|
||||
```
|
||||
|
||||
## Request fields
|
||||
|
||||
The following table lists the updatable fields. Not all request fields are applicable to all models. To determine whether the field is applicable to your model type, see [Register Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/).
|
||||
|
||||
Field | Data type | Description
|
||||
:--- | :--- | :---
|
||||
`connector` | Object | Contains specifications for a connector for a model hosted on a third-party platform. For more information, see [Creating a connector for a specific model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/connectors/#creating-a-connector-for-a-specific-model). For information about the updatable fields within a connector, see [Update Connector API request fields]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/connector-apis/update-connector/#request-fields).
|
||||
`connector_id` | String | The connector ID of a standalone connector for a model hosted on a third-party platform. For more information, see [Standalone connector]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/connectors/#creating-a-standalone-connector). To update a standalone connector, you must undeploy the model, update the connector, and then redeploy the model.
|
||||
`description` | String | The model description.
|
||||
`is_enabled`| Boolean | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
|
||||
`model_config` | Object | The model's configuration, including the `model_type`, `embedding_dimension`, and `framework_type`. `all_config` is an optional JSON string that contains all model configurations. For more information, see [The `model_config` object]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model#the-model_config-object). |
|
||||
`model_group_id` | String | The model group ID of the model group to which to register this model.
|
||||
`name`| String | The model name.
|
||||
`rate_limiter` | Object | Limits the number of times any user can call the Predict API on the model. For more information, see [Rate limiting inference calls]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#rate-limiting-inference-calls).
|
||||
`rate_limiter.limit` | Integer | The maximum number of times any user can call the Predict API on the model per `unit` of time. By default, there is no limit on the number of Predict API calls. Once you set a limit, you cannot reset it to no limit. As an alternative, you can specify a high limit value and a small time unit, for example, 1 request per nanosecond.
|
||||
`rate_limiter.unit` | String | The unit of time for the rate limiter. Valid values are `DAYS`, `HOURS`, `MICROSECONDS`, `MILLISECONDS`, `MINUTES`, `NANOSECONDS`, and `SECONDS`.
|
||||
|
||||
#### Example request: Disabling a model
|
||||
|
||||
```json
|
||||
PUT /_plugins/_ml/models/MzcIJX8BA7mbufL6DOwl
|
||||
{
|
||||
"is_enabled": false
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example request: Rate limiting inference calls for a model
|
||||
|
||||
The following request limits the number of times you can call the Predict API on the model to 4 Predict API calls per minute:
|
||||
|
||||
```json
|
||||
PUT /_plugins/_ml/models/T_S-cY0BKCJ3ot9qr0aP
|
||||
{
|
||||
"rate_limiter": {
|
||||
"limit": "4",
|
||||
"unit": "MINUTES"
|
||||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"_index": ".plugins-ml-model",
|
||||
"_id": "MzcIJX8BA7mbufL6DOwl",
|
||||
"_version": 10,
|
||||
"result": "updated",
|
||||
"_shards": {
|
||||
"total": 1,
|
||||
"successful": 1,
|
||||
"failed": 0
|
||||
},
|
||||
"_seq_no": 48,
|
||||
"_primary_term": 4
|
||||
}
|
||||
```
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Delete model group
|
||||
parent: Model group APIs
|
||||
grand_parent: ML Commons API
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 40
|
||||
---
|
||||
|
||||
|
|
|
@ -1,8 +1,9 @@
|
|||
---
|
||||
layout: default
|
||||
title: Model group APIs
|
||||
parent: ML Commons API
|
||||
parent: ML Commons APIs
|
||||
has_children: true
|
||||
has_toc: false
|
||||
nav_order: 20
|
||||
---
|
||||
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Register model group
|
||||
parent: Model group APIs
|
||||
grand_parent: ML Commons API
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 10
|
||||
---
|
||||
|
||||
|
|
|
@ -1,8 +1,8 @@
|
|||
---
|
||||
layout: default
|
||||
title: Search for a model group
|
||||
title: Search model group
|
||||
parent: Model group APIs
|
||||
grand_parent: ML Commons API
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 30
|
||||
---
|
||||
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Update model group
|
||||
parent: Model group APIs
|
||||
grand_parent: ML Commons API
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 20
|
||||
---
|
||||
|
||||
|
|
|
@ -1,7 +1,7 @@
|
|||
---
|
||||
layout: default
|
||||
title: Profile
|
||||
parent: ML Commons API
|
||||
parent: ML Commons APIs
|
||||
nav_order: 40
|
||||
---
|
||||
|
||||
|
|
|
@ -1,7 +1,7 @@
|
|||
---
|
||||
layout: default
|
||||
title: Stats
|
||||
parent: ML Commons API
|
||||
parent: ML Commons APIs
|
||||
nav_order: 50
|
||||
---
|
||||
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
layout: default
|
||||
title: Delete task
|
||||
parent: Tasks APIs
|
||||
grand_parent: ML Commons API
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 20
|
||||
---
|
||||
|
||||
|
|
|
@ -2,22 +2,15 @@
|
|||
layout: default
|
||||
title: Get task
|
||||
parent: Tasks APIs
|
||||
grand_parent: ML Commons API
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 10
|
||||
---
|
||||
|
||||
# Get task
|
||||
|
||||
To retrieve information about a task, you can:
|
||||
|
||||
- [Get a task by ID](#get-a-task-by-id)
|
||||
- [Search for a task](#search-for-a-task)
|
||||
|
||||
## Get a task by ID
|
||||
|
||||
You can retrieve information about a task using the `task_id`.
|
||||
|
||||
### Path and HTTP methods
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/tasks/<task_id>
|
||||
|
@ -30,6 +23,8 @@ GET /_plugins/_ml/tasks/MsBi1YsB0jLkkocYjD5f
|
|||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
The response includes information about the task.
|
||||
|
||||
```json
|
||||
|
@ -45,95 +40,3 @@ The response includes information about the task.
|
|||
"is_async" : true
|
||||
}
|
||||
```
|
||||
|
||||
## Search for a task
|
||||
|
||||
Searches tasks based on parameters indicated in the request body.
|
||||
|
||||
### Path and HTTP methods
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/tasks/_search
|
||||
```
|
||||
|
||||
#### Example request: Search for a task in which `function_name` is `KMEANS`
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/tasks/_search
|
||||
{
|
||||
"query": {
|
||||
"bool": {
|
||||
"filter": [
|
||||
{
|
||||
"term": {
|
||||
"function_name": "KMEANS"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"took" : 12,
|
||||
"timed_out" : false,
|
||||
"_shards" : {
|
||||
"total" : 1,
|
||||
"successful" : 1,
|
||||
"skipped" : 0,
|
||||
"failed" : 0
|
||||
},
|
||||
"hits" : {
|
||||
"total" : {
|
||||
"value" : 2,
|
||||
"relation" : "eq"
|
||||
},
|
||||
"max_score" : 0.0,
|
||||
"hits" : [
|
||||
{
|
||||
"_index" : ".plugins-ml-task",
|
||||
"_id" : "_wnLJ38BvytMh9aUi-Ia",
|
||||
"_version" : 4,
|
||||
"_seq_no" : 29,
|
||||
"_primary_term" : 4,
|
||||
"_score" : 0.0,
|
||||
"_source" : {
|
||||
"last_update_time" : 1645640125267,
|
||||
"create_time" : 1645640125209,
|
||||
"is_async" : true,
|
||||
"function_name" : "KMEANS",
|
||||
"input_type" : "SEARCH_QUERY",
|
||||
"worker_node" : "jjqFrlW7QWmni1tRnb_7Dg",
|
||||
"state" : "COMPLETED",
|
||||
"model_id" : "AAnLJ38BvytMh9aUi-M2",
|
||||
"task_type" : "TRAINING"
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : ".plugins-ml-task",
|
||||
"_id" : "wwRRLX8BydmmU1x6I-AI",
|
||||
"_version" : 3,
|
||||
"_seq_no" : 38,
|
||||
"_primary_term" : 7,
|
||||
"_score" : 0.0,
|
||||
"_source" : {
|
||||
"last_update_time" : 1645732766656,
|
||||
"create_time" : 1645732766472,
|
||||
"is_async" : true,
|
||||
"function_name" : "KMEANS",
|
||||
"input_type" : "SEARCH_QUERY",
|
||||
"worker_node" : "A_IiqoloTDK01uZvCjREaA",
|
||||
"state" : "COMPLETED",
|
||||
"model_id" : "xARRLX8BydmmU1x6I-CG",
|
||||
"task_type" : "TRAINING"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
|
@ -1,8 +1,9 @@
|
|||
---
|
||||
layout: default
|
||||
title: Tasks APIs
|
||||
parent: ML Commons API
|
||||
parent: ML Commons APIs
|
||||
has_children: true
|
||||
has_toc: false
|
||||
nav_order: 30
|
||||
---
|
||||
|
||||
|
|
|
@ -0,0 +1,99 @@
|
|||
---
|
||||
layout: default
|
||||
title: Search task
|
||||
parent: Tasks APIs
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 15
|
||||
---
|
||||
|
||||
# Search for a task
|
||||
|
||||
Searches for tasks based on parameters specified in the request body.
|
||||
|
||||
## Path and HTTP methods
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/tasks/_search
|
||||
```
|
||||
|
||||
#### Example request: Search for a task in which `function_name` is `KMEANS`
|
||||
|
||||
```json
|
||||
GET /_plugins/_ml/tasks/_search
|
||||
{
|
||||
"query": {
|
||||
"bool": {
|
||||
"filter": [
|
||||
{
|
||||
"term": {
|
||||
"function_name": "KMEANS"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
{
|
||||
"took" : 12,
|
||||
"timed_out" : false,
|
||||
"_shards" : {
|
||||
"total" : 1,
|
||||
"successful" : 1,
|
||||
"skipped" : 0,
|
||||
"failed" : 0
|
||||
},
|
||||
"hits" : {
|
||||
"total" : {
|
||||
"value" : 2,
|
||||
"relation" : "eq"
|
||||
},
|
||||
"max_score" : 0.0,
|
||||
"hits" : [
|
||||
{
|
||||
"_index" : ".plugins-ml-task",
|
||||
"_id" : "_wnLJ38BvytMh9aUi-Ia",
|
||||
"_version" : 4,
|
||||
"_seq_no" : 29,
|
||||
"_primary_term" : 4,
|
||||
"_score" : 0.0,
|
||||
"_source" : {
|
||||
"last_update_time" : 1645640125267,
|
||||
"create_time" : 1645640125209,
|
||||
"is_async" : true,
|
||||
"function_name" : "KMEANS",
|
||||
"input_type" : "SEARCH_QUERY",
|
||||
"worker_node" : "jjqFrlW7QWmni1tRnb_7Dg",
|
||||
"state" : "COMPLETED",
|
||||
"model_id" : "AAnLJ38BvytMh9aUi-M2",
|
||||
"task_type" : "TRAINING"
|
||||
}
|
||||
},
|
||||
{
|
||||
"_index" : ".plugins-ml-task",
|
||||
"_id" : "wwRRLX8BydmmU1x6I-AI",
|
||||
"_version" : 3,
|
||||
"_seq_no" : 38,
|
||||
"_primary_term" : 7,
|
||||
"_score" : 0.0,
|
||||
"_source" : {
|
||||
"last_update_time" : 1645732766656,
|
||||
"create_time" : 1645732766472,
|
||||
"is_async" : true,
|
||||
"function_name" : "KMEANS",
|
||||
"input_type" : "SEARCH_QUERY",
|
||||
"worker_node" : "A_IiqoloTDK01uZvCjREaA",
|
||||
"state" : "COMPLETED",
|
||||
"model_id" : "xARRLX8BydmmU1x6I-CG",
|
||||
"task_type" : "TRAINING"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
|
@ -1,8 +1,9 @@
|
|||
---
|
||||
layout: default
|
||||
title: Train and Predict APIs
|
||||
parent: ML Commons API
|
||||
parent: ML Commons APIs
|
||||
has_children: true
|
||||
has_toc: false
|
||||
nav_order: 30
|
||||
---
|
||||
|
||||
|
|
|
@ -2,8 +2,7 @@
|
|||
layout: default
|
||||
title: Predict
|
||||
parent: Train and Predict APIs
|
||||
grand_parent: ML Commons API
|
||||
has_children: true
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 20
|
||||
---
|
||||
|
||||
|
|
|
@ -2,8 +2,7 @@
|
|||
layout: default
|
||||
title: Train and predict
|
||||
parent: Train and Predict APIs
|
||||
grand_parent: ML Commons API
|
||||
has_children: true
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 10
|
||||
---
|
||||
|
||||
|
|
|
@ -2,8 +2,7 @@
|
|||
layout: default
|
||||
title: Train
|
||||
parent: Train and Predict APIs
|
||||
grand_parent: ML Commons API
|
||||
has_children: true
|
||||
grand_parent: ML Commons APIs
|
||||
nav_order: 10
|
||||
---
|
||||
|
||||
|
|
|
@ -460,4 +460,4 @@ Document text | Score
|
|||
|
||||
The document that contains the same text as the query is scored the highest, and the remaining documents are scored based on the text similarity.
|
||||
|
||||
To learn how to use the model for reranking, see [Reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/).
|
||||
To learn how to use the model for reranking, see [Reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/).
|
||||
|
|
|
@ -45,15 +45,30 @@ For a step-by-step tutorial, see [Neural search tutorial]({{site.url}}{{site.bas
|
|||
|
||||
You can use an ML model in one of the following ways:
|
||||
|
||||
- [Make predictions](#making-predictions).
|
||||
- [Invoke a model for inference](#invoking-a-model-for-inference).
|
||||
- [Use a model for search](#using-a-model-for-search).
|
||||
|
||||
### Making predictions
|
||||
### Invoking a model for inference
|
||||
|
||||
[Models trained]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/train/) through the ML Commons plugin support model-based algorithms, such as k-means. After you've trained a model to your precision requirements, use the model to [make predictions]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/).
|
||||
You can invoke your model by calling the [Predict API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/). For example, testing text embedding models lets you see the vector embeddings they generate.
|
||||
|
||||
If you don't want to use a model, you can use the [Train and Predict API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/train-and-predict/) to test your model without having to evaluate the model's performance.
|
||||
[Models trained]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/train/) through the ML Commons plugin support model-based algorithms, such as k-means. After you've trained a model to your precision requirements, you can use such a model for inference. Alternatively, you can use the [Train and Predict API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/train-and-predict/) to test your model without having to evaluate the model's performance.
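
For example, the following is a minimal sketch of a Predict API call on a deployed text embedding model; the model ID is illustrative, and the exact request body format depends on the model type:

```json
POST /_plugins/_ml/_predict/text_embedding/cleMb4kBJ1eYAeTMFFg4
{
  "text_docs": ["today is sunny"],
  "return_number": true,
  "target_response": ["sentence_embedding"]
}
```
{% include copy-curl.html %}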
|
||||
|
||||
### Using a model for search
|
||||
|
||||
OpenSearch supports multiple search methods that integrate with ML models. For more information, see [Search methods]({{site.url}}{{site.baseurl}}/search-plugins/index/#search-methods).
|
||||
OpenSearch supports multiple search methods that integrate with ML models. For more information, see [Search methods]({{site.url}}{{site.baseurl}}/search-plugins/index/#search-methods).
|
||||
|
||||
## Disabling a model
|
||||
|
||||
You can temporarily disable a model when you don't want to undeploy or delete it. Disable a model by calling the [Update Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/update-model/) and setting `is_enabled` to `false`. When you disable a model, it becomes unavailable for [Predict API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/) requests. If you disable a model that is undeployed, the model remains disabled after deployment. You'll need to enable it in order to use it for inference.
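
For example, the following request disables a model; the model ID is illustrative:

```json
PUT /_plugins/_ml/models/MzcIJX8BA7mbufL6DOwl
{
  "is_enabled": false
}
```
{% include copy-curl.html %}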
|
||||
|
||||
## Rate limiting inference calls
|
||||
|
||||
Setting a rate limit for Predict API calls on your ML models allows you to reduce your model inference costs. You can set a rate limit for the number of Predict API calls at the following levels:
|
||||
|
||||
- **Model level**: Configure a rate limit for all users of the model by calling the Update Model API and specifying a `rate_limiter`. For more information, see [Update Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/update-model/).
|
||||
- **User level**: Configure a rate limit for a specific user or users of the model by creating a controller. A model may be shared by multiple users; you can configure the controller to set different rate limits for different users. For more information, see [Create Controller API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/controller-apis/create-controller/).
|
||||
|
||||
Model-level rate limiting applies to all users of the model. If you specify both a model-level rate limit and a user-level rate limit, the overall rate limit is set to the more restrictive of the two. For example, if the model-level limit is 2 requests per minute and the user-level limit is 4 requests per minute, the overall limit will be set to 2 requests per minute.
|
||||
|
||||
To set the rate limit, you must provide two inputs: the maximum number of requests and the time frame. OpenSearch uses these inputs to calculate the rate limit as the maximum number of requests divided by the time frame. For example, if you set the limit to 4 requests per minute, the rate limit is `4 requests / 1 minute`, which is `1 request / 0.25 minutes`, or `1 request / 15 seconds`. OpenSearch processes predict requests sequentially, on a first-come, first-served basis, and limits those requests to 1 request per 15 seconds.

Imagine two users, Alice and Bob, calling the Predict API for the same model, which has a rate limit of 1 request per 15 seconds. If Alice calls the Predict API and Bob calls it immediately afterward, OpenSearch processes Alice's predict request and rejects Bob's request. Once 15 seconds have passed since Alice's request, Bob can send a request again, and this request will be processed.
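
For a user-level limit, the following is a minimal sketch of a Create Controller API request; the username and limit values are illustrative:

```json
POST /_plugins/_ml/controllers/<model_id>
{
  "user_rate_limiter": {
    "bob": {
      "limit": "4",
      "unit": "MINUTES"
    }
  }
}
```
{% include copy-curl.html %}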
|
|
@ -114,4 +114,65 @@ PUT _cluster/settings
|
|||
|
||||
Model access control is achieved through the Model Group APIs. These APIs include the register, search, update, and delete model group operations.
|
||||
|
||||
For information about model access control API, see [Model group APIs]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-group-apis/index/).
|
||||
For information about APIs related to model access control, see [Model Group APIs]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-group-apis/index/).
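
For example, the following minimal sketch registers a restricted model group; the name and backend roles are illustrative:

```json
POST /_plugins/_ml/model_groups/_register
{
  "name": "model_group_test",
  "description": "A model group for demonstration purposes",
  "access_mode": "restricted",
  "backend_roles": ["IT"]
}
```
{% include copy-curl.html %}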
|
||||
|
||||
## Hidden models
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
To hide model details from end users, including the cluster admin, you can register a _hidden_ model. If a model is hidden, non-superadmin users don't have permission to call any [Model APIs]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/) on the model except the [Predict API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/).
|
||||
|
||||
Only superadmin users can register a hidden model. A hidden model can be one of the OpenSearch-provided pretrained models, your own custom model, or an externally hosted model. To register a hidden model, you first need to authenticate with an [admin certificate]({{site.url}}{{site.baseurl}}/security/configuration/tls/#configuring-admin-certificates):
|
||||
|
||||
```bash
|
||||
curl -k --cert ./kirk.pem --key ./kirk-key.pem -XGET 'https://localhost:9200/.opendistro_security/_search'
|
||||
```
|
||||
|
||||
All models created by a superadmin user are automatically registered as hidden. To register a hidden model, send a request to the `_register` endpoint:
|
||||
|
||||
```bash
|
||||
curl -k --cert ./kirk.pem --key ./kirk-key.pem -X POST 'https://localhost:9200/_plugins/_ml/models/_register' -H 'Content-Type: application/json' -d '
|
||||
{
|
||||
"name": "OPENSEARCH_ASSISTANT_MODEL",
|
||||
"function_name": "remote",
|
||||
"description": "OpenSearch Assistant Model",
|
||||
"connector": {
|
||||
"name": "Bedrock Claude Connector",
|
||||
"description": "The connector to Bedrock Claude",
|
||||
"version": 1,
|
||||
"protocol": "aws_sigv4",
|
||||
"parameters": {
|
||||
"region": "us-east-1",
|
||||
"service_name": "bedrock"
|
||||
},
|
||||
"credential": {
|
||||
"access_key": "<YOUR_ACCESS_KEY>",
|
||||
"secret_key": "<YOUR_SECRET_KEY>",
|
||||
"session_token": "<YOUR_SESSION_TOKEN>"
|
||||
},
|
||||
"actions": [
|
||||
{
|
||||
"action_type": "predict",
|
||||
"method": "POST",
|
||||
"headers": {
|
||||
"content-type": "application/json"
|
||||
},
|
||||
"url": "https://bedrock-runtime.us-east-1.amazonaws.com/model/anthropic.claude-v2/invoke",
|
||||
"request_body": "{\"prompt\":\"\\n\\nHuman: ${parameters.inputs}\\n\\nAssistant:\",\"max_tokens_to_sample\":300,\"temperature\":0.5,\"top_k\":250,\"top_p\":1,\"stop_sequences\":[\"\\\\n\\\\nHuman:\"]}"
|
||||
}
|
||||
]
|
||||
}
|
||||
}'
|
||||
```
|
||||
{% include copy.html %}
|
||||
|
||||
Once a hidden model is registered, only a superadmin can invoke operations on the model, including the deploy, undeploy, delete, and get API operations. For example, to deploy a hidden model, send the following request. In this request, `q7wLt4sBaDRBsUkl9BJV` is the model ID:
|
||||
|
||||
```json
|
||||
curl -k --cert ./kirk.pem --key ./kirk-key.pem -X POST 'https://localhost:9200/_plugins/_ml/models/q7wLt4sBaDRBsUkl9BJV/_deploy'
|
||||
```
|
||||
{% include copy.html %}
|
||||
|
||||
The `model_id` of a hidden model is the model `name`. A hidden model includes an `is_hidden` parameter that is set to `true`. You cannot change a hidden model's `is_hidden` parameter.
|
||||
|
||||
Admin users can change access to a model by updating its backend roles.
|
|
@ -0,0 +1,43 @@
|
|||
---
|
||||
layout: default
|
||||
title: OpenSearch Assistant Toolkit
|
||||
has_children: false
|
||||
has_toc: false
|
||||
nav_order: 28
|
||||
---
|
||||
|
||||
# OpenSearch Assistant Toolkit
|
||||
**Introduced 2.12**
|
||||
{: .label .label-purple }
|
||||
|
||||
This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [OpenSearch forum thread](https://forum.opensearch.org/t/feedback-opensearch-assistant/16741).
|
||||
{: .warning}
|
||||
|
||||
The OpenSearch Assistant Toolkit helps you create AI-powered assistants for OpenSearch Dashboards. The toolkit includes the following elements:
|
||||
|
||||
- [**Agents and tools**]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/index/): _Agents_ interface with a large language model (LLM) and execute high-level tasks, such as summarization or generating Piped Processing Language (PPL) from natural language. The agent's high-level tasks consist of low-level tasks called _tools_, which can be reused by multiple agents.
|
||||
- [**Configuration automation**]({{site.url}}{{site.baseurl}}/automating-configurations/index/): Uses templates to set up infrastructure for artificial intelligence and machine learning (AI/ML) applications. For example, you can automate configuring agents to be used for chat or generating PPL queries from natural language.
|
||||
- [**OpenSearch Assistant for OpenSearch Dashboards**]({{site.url}}{{site.baseurl}}/dashboards/dashboards-assistant/index/): This is the OpenSearch Dashboards UI for the AI-powered assistant. The assistant's workflow is configured with various agents and tools.
|
||||
|
||||
## Enabling OpenSearch Assistant
|
||||
|
||||
To enable OpenSearch Assistant, perform the following steps:
|
||||
|
||||
- Enable the agent framework and retrieval-augmented generation (RAG) by configuring the following settings:
|
||||
```yaml
|
||||
plugins.ml_commons.agent_framework_enabled: true
|
||||
plugins.ml_commons.rag_pipeline_feature_enabled: true
|
||||
```
|
||||
{% include copy.html %}
|
||||
- Enable the assistant by configuring the following settings:
|
||||
```yaml
|
||||
assistant.chat.enabled: true
|
||||
observability.query_assist.enabled: true
|
||||
```
|
||||
{% include copy.html %}
|
||||
|
||||
For more information about ways to enable experimental features, see [Experimental feature flags]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/experimental/).
|
||||
|
||||
## Next steps
|
||||
|
||||
- For more information about the OpenSearch Assistant UI, see [OpenSearch Assistant for OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/dashboards/dashboards-assistant/index/).
|
|
@ -95,7 +95,7 @@ OpenSearch returns the task ID of the register operation:
|
|||
}
|
||||
```
|
||||
|
||||
To check the status of the operation, provide the task ID to the [Tasks API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/get-task/#get-a-task-by-id):
|
||||
To check the status of the operation, provide the task ID to the [Tasks API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/get-task/):
|
||||
|
||||
```bash
|
||||
GET /_plugins/_ml/tasks/cVeMb4kBJ1eYAeTMFFgj
|
||||
|
|
|
@ -6,7 +6,7 @@ nav_order: 65
|
|||
parent: Connecting to externally hosted models
|
||||
grand_parent: Integrating ML models
|
||||
redirect_from:
|
||||
- ml-commons-plugin/extensibility/blueprints/
|
||||
- /ml-commons-plugin/extensibility/blueprints/
|
||||
---
|
||||
|
||||
# Connector blueprints
|
||||
|
|
|
@ -7,7 +7,7 @@ nav_order: 61
|
|||
parent: Connecting to externally hosted models
|
||||
grand_parent: Integrating ML models
|
||||
redirect_from:
|
||||
- ml-commons-plugin/extensibility/connectors/
|
||||
- /ml-commons-plugin/extensibility/connectors/
|
||||
---
|
||||
|
||||
# Creating connectors for third-party ML platforms
|
||||
|
@ -285,6 +285,21 @@ POST /_plugins/_ml/connectors/_create
|
|||
```
|
||||
{% include copy-curl.html %}
|
||||
|
||||
## Updating connector credentials
|
||||
|
||||
In some cases, you may need to update credentials, like `access_key`, that you use to connect to externally hosted models. You can update credentials without undeploying the model by providing the new credentials in the following request:
|
||||
|
||||
```json
|
||||
PUT /_plugins/_ml/models/<model_id>
|
||||
{
|
||||
"connector": {
|
||||
"credential": {
|
||||
"openAI_key": "YOUR NEW OPENAI KEY"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Next steps
|
||||
|
||||
- To learn more about connecting to external models, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/).
|
||||
|
|
|
@ -6,7 +6,7 @@ has_children: true
|
|||
has_toc: false
|
||||
nav_order: 60
|
||||
redirect_from:
|
||||
- ml-commons-plugin/extensibility/index/
|
||||
- /ml-commons-plugin/extensibility/index/
|
||||
---
|
||||
|
||||
# Connecting to externally hosted models
|
||||
|
@ -177,7 +177,7 @@ OpenSearch returns the task ID of the register operation:
|
|||
}
|
||||
```
|
||||
|
||||
To check the status of the operation, provide the task ID to the [Tasks API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/get-task/#get-a-task-by-id):
|
||||
To check the status of the operation, provide the task ID to the [Tasks API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/get-task/):
|
||||
|
||||
```bash
|
||||
GET /_plugins/_ml/tasks/cVeMb4kBJ1eYAeTMFFgj
|
||||
|
@ -309,7 +309,7 @@ The response contains the inference results provided by the OpenAI model:
|
|||
|
||||
## Step 6: Use the model for search
|
||||
|
||||
To learn how to use the model for vector search, see [Set up neural search]({{site.url}}{{site.baseurl}}http://localhost:4000/docs/latest/search-plugins/neural-search/#set-up-neural-search).
|
||||
To learn how to use the model for vector search, see [Using an ML model for neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/#using-an-ml-model-for-neural-search).
|
||||
|
||||
## Next steps
|
||||
|
||||
|
|
|
@ -22,4 +22,3 @@ To integrate machine learning (ML) models into your OpenSearch cluster, you can
|
|||
## GPU acceleration
|
||||
|
||||
For better performance, you can take advantage of GPU acceleration on your ML node. For more information, see [GPU acceleration]({{site.url}}{{site.baseurl}}/ml-commons-plugin/gpu-acceleration/).
|
||||
|
||||
|
|
|
@ -33,6 +33,9 @@ For more information about building PPL queries, see [Piped Processing Language]
|
|||
This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [OpenSearch forum thread](https://forum.opensearch.org/t/feedback-opensearch-assistant/16741).
|
||||
{: .warning}
|
||||
|
||||
Note that machine learning models are probabilistic and that some may perform better than others, so the OpenSearch Assistant may occasionally produce inaccurate information. We recommend evaluating outputs for accuracy as appropriate to your use case, including reviewing the output or combining it with other verification factors.
|
||||
{: .important}
|
||||
|
||||
To simplify query building, the **OpenSearch Assistant** toolkit offers an assistant that converts natural language queries into PPL. A screenshot is shown in the following image.
|
||||
|
||||
![Sample OpenSearch Query Assist screen view]({{site.url}}{{site.baseurl}}/images/log-explorer-query-assist.png)
|
||||
|
|
|
@ -7,330 +7,179 @@ redirect_from:
|
|||
- /ml-commons-plugin/conversational-search/
|
||||
---
|
||||
|
||||
This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [OpenSearch forum thread](https://forum.opensearch.org/t/feedback-conversational-search-and-retrieval-augmented-generation-using-search-pipeline-experimental-release/16073).
|
||||
{: .warning}
|
||||
|
||||
# Conversational search
|
||||
|
||||
Conversational search is an experimental machine learning (ML) feature that enables a new search interface. Whereas traditional document search allows you to ask a question and receive a list of documents that might contain the answer to that question, conversational search uses large language models (LLMs) to read the top N documents and synthesizes those documents into a plaintext "answer" to your question.
|
||||
Conversational search allows you to ask questions in natural language and refine the answers by asking follow-up questions. Thus, the conversation becomes a dialog between you and a large language model (LLM). For this to happen, instead of answering each question individually, the model needs to remember the context of the entire conversation.
|
||||
|
||||
Currently, conversational search uses two systems to synthesize documents:
|
||||
Conversational search is implemented with the following components:
|
||||
|
||||
- [Conversation memory](#conversation-memory)
|
||||
- [Retrieval Augmented Generation (RAG) pipeline](#rag-pipeline)
|
||||
- [Conversation history](#conversation-history): Allows an LLM to remember the context of the current conversation and understand follow-up questions.
|
||||
- [Retrieval-Augmented Generation (RAG)](#rag): Allows an LLM to supplement its static knowledge base with proprietary or current information.
|
||||
|
||||
## Conversation memory
|
||||
## Conversation history
|
||||
|
||||
Conversation memory consists of a simple CRUD-like API comprising two resources: **Conversations** and **Interactions**. Conversations are made up of interactions. An interaction represents a pair of messages: a human input and an artificial intelligence (AI) response. You cannot create any interactions until you've created a conversation.
|
||||
Conversation history consists of a simple CRUD-like API comprising two resources: _memories_ and _messages_. All messages for the current conversation are stored within one conversation _memory_. A _message_ represents a question/answer pair: a human-input question and an AI answer. Messages do not exist by themselves; they must be added to a memory.
|
||||
|
||||
To make it easier to build and debug applications that use conversation memory, `conversation-meta` and `conversation-interactions` are stored in two system indexes.
|
||||
## RAG
|
||||
|
||||
### `conversation-meta` index
|
||||
RAG retrieves data from the index and from the conversation history and sends it to the LLM as context. The LLM then supplements its static knowledge base with the dynamically retrieved data. In OpenSearch, RAG is implemented through a search pipeline containing a [retrieval-augmented generation processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rag-processor/). The processor intercepts OpenSearch query results, retrieves previous messages in the conversation from the conversation memory, and sends a prompt to the LLM. After the processor receives a response from the LLM, it saves the response in conversation memory and returns both the original OpenSearch query results and the LLM response.
|
||||
|
||||
In the `conversation-meta` index, you can customize the `name` field to make it easier for end users to know how to continue a conversation with the AI, as shown in the following schema:
|
||||
As of OpenSearch 2.11, the RAG technique has only been tested with OpenAI models and the Anthropic Claude model on Amazon Bedrock.
|
||||
{: .warning}

When the Security plugin is enabled, all memories exist in a `private` security mode. Only the user who created a memory can interact with that memory. No user can see another user's memory.
{: .note}

## Prerequisites

To begin using conversational search, enable conversation memory and RAG pipeline features:

```json
PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.memory_feature_enabled": true,
    "plugins.ml_commons.rag_pipeline_feature_enabled": true
  }
}
```
{% include copy-curl.html %}

## Using conversational search

To use conversational search, follow these steps:

1. [Create a connector to a model](#step-1-create-a-connector-to-a-model).
1. [Register and deploy the model](#step-2-register-and-deploy-the-model).
1. [Create a search pipeline](#step-3-create-a-search-pipeline).
1. [Ingest RAG data into an index](#step-4-ingest-rag-data-into-an-index).
1. [Create a conversation memory](#step-5-create-a-conversation-memory).
1. [Use the pipeline for RAG](#step-6-use-the-pipeline-for-rag).

### Step 1: Create a connector to a model

RAG requires an LLM in order to function. To connect to an LLM, create a [connector]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/connectors/). The following request creates a connector for the OpenAI GPT 3.5 model:

```json
POST /_plugins/_ml/connectors/_create
{
  "name": "OpenAI Chat Connector",
  "description": "The connector to public OpenAI model service for GPT 3.5",
  "version": 2,
  "protocol": "http",
  "parameters": {
    "endpoint": "api.openai.com",
    "model": "gpt-3.5-turbo",
    "temperature": 0
  },
  "credential": {
    "openAI_key": "<YOUR_OPENAI_KEY>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://${parameters.endpoint}/v1/chat/completions",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": """{ "model": "${parameters.model}", "messages": ${parameters.messages}, "temperature": ${parameters.temperature} }"""
    }
  ]
}
```
{% include copy-curl.html %}

OpenSearch responds with a connector ID for the connector:

```json
{
  "connector_id": "u3DEbI0BfUsSoeNTti-1"
}
```

For example requests that connect to other services and models, see [Connector blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/blueprints/).
{: .tip}

### Step 2: Register and deploy the model

Register the LLM for which you created a connector in the previous step. To register the model with OpenSearch, provide the `connector_id` returned in the previous step:

```json
POST /_plugins/_ml/models/_register
{
  "name": "openAI-gpt-3.5-turbo",
  "function_name": "remote",
  "description": "test model",
  "connector_id": "u3DEbI0BfUsSoeNTti-1"
}
```
{% include copy-curl.html %}

OpenSearch returns a task ID for the register task and a model ID for the registered model:

```json
{
  "task_id": "gXDIbI0BfUsSoeNT_jAb",
  "status": "CREATED",
  "model_id": "gnDIbI0BfUsSoeNT_jAw"
}
```

To verify that the registration is complete, call the Tasks API:

```json
GET /_plugins/_ml/tasks/gXDIbI0BfUsSoeNT_jAb
```
{% include copy-curl.html %}

The `state` changes to `COMPLETED` in the response:

```json
{
  "model_id": "gnDIbI0BfUsSoeNT_jAw",
  "task_type": "REGISTER_MODEL",
  "function_name": "REMOTE",
  "state": "COMPLETED",
  "worker_node": [
    "kYv-Z5-mQ4uCUy_cRC6LXA"
  ],
  "create_time": 1706927128091,
  "last_update_time": 1706927128125,
  "is_async": false
}
```

To deploy the model, provide the `model_id` to the Deploy API:

```json
POST /_plugins/_ml/models/gnDIbI0BfUsSoeNT_jAw/_deploy
```
{% include copy-curl.html %}

OpenSearch acknowledges that the model is deployed:

```json
{
  "task_id": "cnDObI0BfUsSoeNTDzGd",
  "task_type": "DEPLOY_MODEL",
  "status": "COMPLETED"
}
```

### Step 3: Create a search pipeline

Next, create a search pipeline with a `retrieval_augmented_generation` processor:

```json
PUT /_search/pipeline/rag_pipeline
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "openai_pipeline_demo",
        "description": "Demo pipeline Using OpenAI Connector",
        "model_id": "gnDIbI0BfUsSoeNT_jAw",
        "context_field_list": ["text"],
        "system_prompt": "You are a helpful assistant",
        "user_instructions": "Generate a concise and informative answer in less than 100 words for the given question"
      }
    }
  ]
}
```
{% include copy-curl.html %}

For information about the processor fields, see [Retrieval-augmented generation processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rag-processor/).

### Step 4: Ingest RAG data into an index

RAG augments the LLM's knowledge with some supplementary data.

First, create an index in which to store this data and set the default search pipeline to the pipeline created in the previous step:

```json
PUT /my_rag_test_data
{
  "settings": {
    "index.search.default_pipeline" : "rag_pipeline"
  },
  "mappings": {
    "properties": {
      "text": {
        "type": "text"
      }
    }
  }
}
```
{% include copy-curl.html %}

Next, ingest the supplementary data into the index:

```json
POST _bulk
{"index": {"_index": "my_rag_test_data", "_id": "1"}}
{"text": "Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky.[2] He was a descendant of Samuel Lincoln, an Englishman who migrated from Hingham, Norfolk, to its namesake, Hingham, Massachusetts, in 1638. The family then migrated west, passing through New Jersey, Pennsylvania, and Virginia.[3] Lincoln was also a descendant of the Harrison family of Virginia; his paternal grandfather and namesake, Captain Abraham Lincoln and wife Bathsheba (née Herring) moved the family from Virginia to Jefferson County, Kentucky.[b] The captain was killed in an Indian raid in 1786.[5] His children, including eight-year-old Thomas, Abraham's father, witnessed the attack.[6][c] Thomas then worked at odd jobs in Kentucky and Tennessee before the family settled in Hardin County, Kentucky, in the early 1800s."}
{"index": {"_index": "my_rag_test_data", "_id": "2"}}
{"text": "Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."}
```
{% include copy-curl.html %}

### Step 5: Create a conversation memory

You'll need to create a conversation memory that will store all messages from a conversation. To make the memory easily identifiable, provide a name for the memory in the optional `name` field, as shown in the following example. Because the `name` parameter is not updatable, this is your only opportunity to name your conversation.

```json
POST /_plugins/_ml/memory/
{
  "name": "Conversation about NYC population"
}
```
{% include copy-curl.html %}

OpenSearch responds with a memory ID for the newly created memory:

```json
{
  "memory_id": "znCqcI0BfUsSoeNTntd7"
}
```

You'll use the `memory_id` to add messages to the memory.
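
To confirm that the memory was created, you can provide the `memory_id` to the Get Memory API (a quick check; the path mirrors the create call above):

```json
GET /_plugins/_ml/memory/znCqcI0BfUsSoeNTntd7
```
{% include copy-curl.html %}
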
### Step 6: Use the pipeline for RAG

To use the RAG pipeline, send a query to OpenSearch and provide additional parameters in the `ext.generative_qa_parameters` object.

The `generative_qa_parameters` object supports the following parameters.

Parameter | Required | Description
:--- | :--- | :---
`llm_question` | Yes | The question that the LLM must answer.
`llm_model` | No | Overrides the original model set in the connection in cases where you want to use a different model (for example, GPT 4 instead of GPT 3.5). This option is required if a default model is not set during pipeline creation.
`memory_id` | No | If you provide a `memory_id`, the pipeline retrieves the 10 most recent messages in the specified memory and adds them to the LLM prompt. If you don't specify a `memory_id`, the prior context is not added to the LLM prompt.
`context_size` | No | The number of search results sent to the LLM. This is typically needed in order to meet the token size limit, which can vary by model. Alternatively, you can use the `size` parameter in the Search API to control the number of search results sent to the LLM.
`message_size` | No | The number of messages sent to the LLM. Similarly to the number of search results, this affects the total number of tokens received by the LLM. When not set, the pipeline uses the default message size of `10`.
`timeout` | No | The number of seconds that the pipeline waits for the remote model using a connector to respond. Default is `30`.

If your LLM includes a set token limit, set the `size` field in your OpenSearch query to limit the number of documents used in the search response. Otherwise, the RAG pipeline will send every document in the search results to the LLM.
{: .note}
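
For example, the following query (a sketch reusing the index and parameters from this walkthrough) sets `size` so that at most two documents are sent to the LLM:

```json
GET /my_rag_test_data/_search
{
  "query": {
    "match": {
      "text": "What's the population of NYC metro area in 2023"
    }
  },
  "size": 2,
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "gpt-3.5-turbo",
      "llm_question": "What's the population of NYC metro area in 2023"
    }
  }
}
```
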
If you ask an LLM a question about the present, it cannot provide an answer because it was trained on data from a few years ago. However, if you add current information as context, the LLM is able to generate a response. For example, you can ask the LLM about the population of the New York City metro area in 2023. You'll construct a query that includes an OpenSearch match query and an LLM query. Provide the `memory_id` so that the message is stored in the appropriate memory object:

```json
GET /my_rag_test_data/_search
{
  "query": {
    "match": {
      "text": "What's the population of NYC metro area in 2023"
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "gpt-3.5-turbo",
      "llm_question": "What's the population of NYC metro area in 2023",
      "memory_id": "znCqcI0BfUsSoeNTntd7",
      "context_size": 5,
      "message_size": 5,
      "timeout": 15
    }
  }
}
```
{% include copy-curl.html %}

Because the context included a document containing information about the population of New York City, the LLM was able to correctly answer the question (though it included the word "projected" because it was trained on data from previous years). The response contains the matching documents from the supplementary RAG data and the LLM response:

<details open markdown="block">
  <summary>
    Response
  </summary>
  {: .text-delta}

```json
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 5.781642,
    "hits": [
      {
        "_index": "my_rag_test_data",
        "_id": "2",
        "_score": 5.781642,
        "_source": {
          "text": """Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."""
        }
      },
      {
        "_index": "my_rag_test_data",
        "_id": "1",
        "_score": 0.9782871,
        "_source": {
          "text": "Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky.[2] He was a descendant of Samuel Lincoln, an Englishman who migrated from Hingham, Norfolk, to its namesake, Hingham, Massachusetts, in 1638. The family then migrated west, passing through New Jersey, Pennsylvania, and Virginia.[3] Lincoln was also a descendant of the Harrison family of Virginia; his paternal grandfather and namesake, Captain Abraham Lincoln and wife Bathsheba (née Herring) moved the family from Virginia to Jefferson County, Kentucky.[b] The captain was killed in an Indian raid in 1786.[5] His children, including eight-year-old Thomas, Abraham's father, witnessed the attack.[6][c] Thomas then worked at odd jobs in Kentucky and Tennessee before the family settled in Hardin County, Kentucky, in the early 1800s."
        }
      }
    ]
  },
  "ext": {
    "retrieval_augmented_generation": {
      "answer": "The population of the New York City metro area in 2023 is projected to be 18,937,000.",
      "message_id": "x3CecI0BfUsSoeNT9tV9"
    }
  }
}
```
</details>

Now you'll ask an LLM a follow-up question as part of the same conversation. Again, provide the `memory_id` in the request:

```json
GET /my_rag_test_data/_search
{
  "query": {
    "match": {
      "text": "What was it in 2022"
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "gpt-3.5-turbo",
      "llm_question": "What was it in 2022",
      "memory_id": "znCqcI0BfUsSoeNTntd7",
      "context_size": 5,
      "message_size": 5,
      "timeout": 15
    }
  }
}
```
{% include copy-curl.html %}

The LLM correctly identifies the subject of the conversation and returns a relevant response:

```json
{
  ...
  "ext": {
    "retrieval_augmented_generation": {
      "answer": "The population of the New York City metro area in 2022 was 18,867,000.",
      "message_id": "p3CvcI0BfUsSoeNTj9iH"
    }
  }
}
```

To verify that both messages were added to the memory, provide the `memory_id` to the Get Messages API:

```json
GET /_plugins/_ml/memory/znCqcI0BfUsSoeNTntd7/messages
```
{% include copy-curl.html %}

The response contains both messages:

<details open markdown="block">
  <summary>
    Response
  </summary>
  {: .text-delta}

```json
{
  "messages": [
    {
      "memory_id": "znCqcI0BfUsSoeNTntd7",
      "message_id": "x3CecI0BfUsSoeNT9tV9",
      "create_time": "2024-02-03T20:33:50.754708446Z",
      "input": "What's the population of NYC metro area in 2023",
      "prompt_template": """[{"role":"system","content":"You are a helpful assistant"},{"role":"user","content":"Generate a concise and informative answer in less than 100 words for the given question"}]""",
      "response": "The population of the New York City metro area in 2023 is projected to be 18,937,000.",
      "origin": "retrieval_augmented_generation",
      "additional_info": {
        "metadata": """["Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019.","Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky.[2] He was a descendant of Samuel Lincoln, an Englishman who migrated from Hingham, Norfolk, to its namesake, Hingham, Massachusetts, in 1638. The family then migrated west, passing through New Jersey, Pennsylvania, and Virginia.[3] Lincoln was also a descendant of the Harrison family of Virginia; his paternal grandfather and namesake, Captain Abraham Lincoln and wife Bathsheba (née Herring) moved the family from Virginia to Jefferson County, Kentucky.[b] The captain was killed in an Indian raid in 1786.[5] His children, including eight-year-old Thomas, Abraham's father, witnessed the attack.[6][c] Thomas then worked at odd jobs in Kentucky and Tennessee before the family settled in Hardin County, Kentucky, in the early 1800s."]"""
      }
    },
    {
      "memory_id": "znCqcI0BfUsSoeNTntd7",
      "message_id": "p3CvcI0BfUsSoeNTj9iH",
      "create_time": "2024-02-03T20:36:10.24453505Z",
      "input": "What was it in 2022",
      "prompt_template": """[{"role":"system","content":"You are a helpful assistant"},{"role":"user","content":"Generate a concise and informative answer in less than 100 words for the given question"}]""",
      "response": "The population of the New York City metro area in 2022 was 18,867,000.",
      "origin": "retrieval_augmented_generation",
      "additional_info": {
        "metadata": """["Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019.","Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky.[2] He was a descendant of Samuel Lincoln, an Englishman who migrated from Hingham, Norfolk, to its namesake, Hingham, Massachusetts, in 1638. The family then migrated west, passing through New Jersey, Pennsylvania, and Virginia.[3] Lincoln was also a descendant of the Harrison family of Virginia; his paternal grandfather and namesake, Captain Abraham Lincoln and wife Bathsheba (née Herring) moved the family from Virginia to Jefferson County, Kentucky.[b] The captain was killed in an Indian raid in 1786.[5] His children, including eight-year-old Thomas, Abraham's father, witnessed the attack.[6][c] Thomas then worked at odd jobs in Kentucky and Tennessee before the family settled in Hardin County, Kentucky, in the early 1800s."]"""
      }
    }
  ]
}
```
</details>

## Next steps

- To learn more about connecting to models on external platforms, see [Connectors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/connectors/).
- For supported APIs, see [Memory APIs]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/memory-apis/index/).
- To learn more about search pipelines and processors, see [Search pipelines]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/).
- For available OpenSearch queries, see [Query DSL]({{site.url}}{{site.baseurl}}/query-dsl/).

# Neural query enricher processor

The `neural_query_enricher` search request processor is designed to set a default machine learning (ML) model ID at the index or field level for [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/) queries. To learn more about ML models, see [Using ML models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Connecting to remote models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/).

## Request fields

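A minimal configuration sketch is shown below; it assumes the `default_model_id` (index-level default) and `neural_field_default_id` (per-field default) request fields and uses placeholder model IDs:

```json
PUT /_search/pipeline/default_model_pipeline
{
  "request_processors": [
    {
      "neural_query_enricher": {
        "description": "Sets default model IDs for neural queries",
        "default_model_id": "<model_id>",
        "neural_field_default_id": {
          "my_text_field": "<field_specific_model_id>"
        }
      }
    }
  ]
}
```

With a pipeline like this set as an index's default search pipeline, a neural query against that index can omit `model_id` and fall back to these defaults.
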
---
layout: default
title: Retrieval-augmented generation
nav_order: 18
has_children: false
parent: Search processors
grand_parent: Search pipelines
---

# Retrieval-augmented generation processor

The `retrieval_augmented_generation` processor is a search results processor that you can use in [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/) for retrieval-augmented generation (RAG). The processor intercepts query results, retrieves previous messages in the conversation from the conversational memory, and sends a prompt to a large language model (LLM). After the processor receives a response from the LLM, it saves the response in conversational memory and returns both the original OpenSearch query results and the LLM response.

As of OpenSearch 2.12, the `retrieval_augmented_generation` processor supports only OpenAI and Amazon Bedrock models.
{: .note}

## Request fields

The following table lists all available request fields.

Field | Data type | Description
:--- | :--- | :---
`model_id` | String | The ID of the model used in the pipeline. Required.
`context_field_list` | Array | A list of fields contained in document sources that the pipeline uses as context for RAG. Required. For more information, see [Context field list](#context-field-list).
`system_prompt` | String | The system prompt that is sent to the LLM to adjust its behavior, such as its response tone. Can be a persona description or a set of instructions. Optional.
`user_instructions` | String | Human-generated instructions sent to the LLM to guide it in producing results. Optional.
`tag` | String | The processor's identifier. Optional.
`description` | String | A description of the processor. Optional.

### Context field list

The `context_field_list` is a list of fields contained in document sources that the pipeline uses as context for RAG. For example, suppose your OpenSearch index contains a collection of documents, each including a `title` and `text`:

```json
{
  "_index": "qa_demo",
  "_id": "SimKcIoBOVKVCYpk1IL-",
  "_source": {
    "title": "Abraham Lincoln 2",
    "text": "Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky.[2] He was a descendant of Samuel Lincoln, an Englishman who migrated from Hingham, Norfolk, to its namesake, Hingham, Massachusetts, in 1638. The family then migrated west, passing through New Jersey, Pennsylvania, and Virginia.[3] Lincoln was also a descendant of the Harrison family of Virginia; his paternal grandfather and namesake, Captain Abraham Lincoln and wife Bathsheba (née Herring) moved the family from Virginia to Jefferson County, Kentucky.[b] The captain was killed in an Indian raid in 1786.[5] His children, including eight-year-old Thomas, Abraham's father, witnessed the attack.[6][c] Thomas then worked at odd jobs in Kentucky and Tennessee before the family settled in Hardin County, Kentucky, in the early 1800s.[6]\n"
  }
}
```

You can specify that only the `text` contents should be sent to the LLM by setting `"context_field_list": ["text"]` in the processor.

## Example

The following example demonstrates using a search pipeline with a `retrieval_augmented_generation` processor.

### Creating a search pipeline

The following request creates a search pipeline containing a `retrieval_augmented_generation` processor for an OpenAI model:

```json
PUT /_search/pipeline/rag_pipeline
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "openai_pipeline_demo",
        "description": "Demo pipeline Using OpenAI Connector",
        "model_id": "gnDIbI0BfUsSoeNT_jAw",
        "context_field_list": ["text"],
        "system_prompt": "You are a helpful assistant",
        "user_instructions": "Generate a concise and informative answer in less than 100 words for the given question"
      }
    }
  ]
}
```
{% include copy-curl.html %}

### Using a search pipeline

Combine an OpenSearch query with an `ext` object that stores generative question answering parameters for the LLM:

```json
GET /my_rag_test_data/_search?search_pipeline=rag_pipeline
{
  "query": {
    "match": {
      "text": "Abraham Lincoln"
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "gpt-3.5-turbo",
      "llm_question": "Was Abraham Lincoln a good politician",
      "memory_id": "iXC4bI0BfUsSoeNTjS30",
      "context_size": 5,
      "message_size": 5,
      "timeout": 15
    }
  }
}
```
{% include copy-curl.html %}

For more information about setting up conversational search, see [Using conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/#using-conversational-search).

The following table lists all supported search response processors.

Processor | Description | Earliest available version
:--- | :--- | :---
[`personalize_search_ranking`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/personalize-search-ranking/) | Uses [Amazon Personalize](https://aws.amazon.com/personalize/) to rerank search results (requires setting up the Amazon Personalize service). | 2.9
[`retrieval_augmented_generation`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rag-processor/) | Used for retrieval-augmented generation (RAG) in [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/). | 2.10 (generally available in 2.12)
[`rename_field`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rename-field-processor/)| Renames an existing field. | 2.8
[`rerank`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/)| Reranks search results using a cross-encoder model. | 2.12
[`collapse`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/collapse-processor/)| Deduplicates search hits based on a field value, similarly to `collapse` in a search request. | 2.12