Merge pull request #327 from opensearch-project/shard_indexing_backpressure

Shard indexing backpressure
This commit is contained in:
Ashwin Kumar 2021-12-13 11:16:02 -08:00 committed by GitHub
commit ca7db133a7
3 changed files with 573 additions and 0 deletions


@@ -0,0 +1,30 @@
---
layout: default
title: Shard indexing backpressure
nav_order: 62
has_children: true
---
# Shard indexing backpressure
Shard indexing backpressure is a smart rejection mechanism at a per-shard level that dynamically rejects indexing requests when your cluster is under strain. It propagates backpressure, transferring requests from an overwhelmed node or shard to other nodes or shards that are still healthy.
With shard indexing backpressure, you can prevent nodes in your cluster from running into cascading failures due to performance degradation caused by slow nodes, stuck tasks, resource-intensive requests, traffic surges, skewed shard allocations, and so on.
Shard indexing backpressure takes effect only when both a primary parameter and a secondary parameter are breached.
## Primary parameters
Primary parameters are early indicators that a cluster is under strain:
- Shard memory limit breach: If the memory usage of a shard exceeds 95% of its allocated memory, this limit is breached.
- Node memory limit breach: If the memory usage of a node exceeds 70% of its allocated memory, this limit is breached.
A breach of the primary parameters doesn't cause any actual request rejections; it only triggers an evaluation of the secondary parameters.
## Secondary parameters
Secondary parameters check the performance at the shard level to confirm that the cluster is under strain:
- Throughput: If the throughput at the shard level decreases significantly compared to its historical performance, this limit is breached.
- Successful request: If the number of pending requests increases significantly compared to its historical count, this limit is breached.
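The interplay between the two stages can be sketched in Python. This is an illustrative model only, not OpenSearch's implementation; the function names are hypothetical, and the thresholds mirror the defaults described above:

```python
# Illustrative sketch of the two-stage check: requests are rejected only
# when a primary (memory) parameter AND a secondary (performance)
# parameter are breached, and only in enforced mode.

SHARD_MEMORY_SOFT_LIMIT = 0.95  # 95% of the shard's allocated quota
NODE_MEMORY_SOFT_LIMIT = 0.70   # 70% of the node-level limit

def primary_breached(shard_mem_used, shard_mem_quota,
                     node_mem_used, node_mem_limit):
    """True if either early-warning memory limit is exceeded."""
    return (shard_mem_used > SHARD_MEMORY_SOFT_LIMIT * shard_mem_quota
            or node_mem_used > NODE_MEMORY_SOFT_LIMIT * node_mem_limit)

def secondary_breached(current_throughput, historical_throughput,
                       outstanding_requests, max_outstanding=100,
                       degradation_factor=5.0):
    """True if throughput degraded ~5x versus history, or too many
    requests are still pending."""
    throughput_degraded = (current_throughput * degradation_factor
                           < historical_throughput)
    return throughput_degraded or outstanding_requests > max_outstanding

def should_reject(primary, secondary, enforced=True):
    """Reject only in enforced mode and only when both stages breach."""
    return enforced and primary and secondary
```

In shadow mode (`enforced=False` in this sketch), a real implementation would still record the would-be rejection in its stats rather than reject the request.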


@@ -0,0 +1,50 @@
---
layout: default
title: Settings
parent: Shard indexing backpressure
nav_order: 1
has_children: false
---
# Settings
Shard indexing backpressure adds several settings to the standard OpenSearch cluster settings. They are dynamic, so you can change the default behavior of this feature without restarting your cluster.
## High-level controls
The high-level controls allow you to turn the shard indexing backpressure feature on or off.
Setting | Default | Description
:--- | :--- | :---
`shard_indexing_pressure.enable` | `false` | Change to `true` to enable shard indexing backpressure.
`shard_indexing_pressure.enforced` | `false` | Run shard indexing backpressure in shadow mode or enforced mode. In shadow mode (value set as `false`), shard indexing backpressure tracks all granular-level metrics, but it doesn't actually reject any indexing requests. In enforced mode (value set as `true`), shard indexing backpressure rejects any requests to the cluster that might cause a dip in its performance.
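Because these are dynamic cluster settings, you can apply them with the cluster settings API without restarting any nodes. For example, the following request (a sketch using the standard `_cluster/settings` endpoint; choose `persistent` or `transient` to suit your cluster) enables shard indexing backpressure in enforced mode:

```json
PUT _cluster/settings
{
  "persistent": {
    "shard_indexing_pressure.enable": true,
    "shard_indexing_pressure.enforced": true
  }
}
```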
## Node-level limits
Node-level limits allow you to control memory usage on a node.
Setting | Default | Description
:--- | :--- | :---
`shard_indexing_pressure.primary_parameter.node.soft_limit` | 70% | Define the percentage of the node-level memory threshold that acts as a soft indicator for strain on a node.
## Shard-level limits
Shard-level limits allow you to control memory usage on a shard.
Setting | Default | Description
:--- | :--- | :---
`shard_indexing_pressure.primary_parameter.shard.min_limit` | 0.001d | Specify the minimum assigned quota for a new shard in any role (coordinator, primary, or replica). Shard indexing backpressure increases or decreases this allocated quota based on the inflow of traffic for the shard.
`shard_indexing_pressure.operating_factor.lower` | 75% | Specify the lower occupancy limit of the allocated quota of memory for the shard. If the total memory usage of a shard is below this limit, shard indexing backpressure decreases the current allocated memory for that shard.
`shard_indexing_pressure.operating_factor.optimal` | 85% | Specify the optimal occupancy of the allocated quota of memory for the shard. If the total memory usage of a shard is at this level, shard indexing backpressure doesn't change the current allocated memory for that shard.
`shard_indexing_pressure.operating_factor.upper` | 95% | Specify the upper occupancy limit of the allocated quota of memory for the shard. If the total memory usage of a shard is above this limit, shard indexing backpressure increases the current allocated memory for that shard.
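Like the high-level controls, the operating factors are dynamic. For example, to narrow the operating band (the values here are illustrative, not recommendations), you could update all three factors in one cluster settings call:

```json
PUT _cluster/settings
{
  "persistent": {
    "shard_indexing_pressure.operating_factor.lower": 0.70,
    "shard_indexing_pressure.operating_factor.optimal": 0.80,
    "shard_indexing_pressure.operating_factor.upper": 0.90
  }
}
```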
## Performance degradation factors
The performance degradation factors allow you to control the dynamic performance thresholds for a shard.
Setting | Default | Description
:--- | :--- | :---
`shard_indexing_pressure.secondary_parameter.throughput.request_size_window` | 2,000 | The number of requests in the sampling window size on a shard. Shard indexing backpressure compares the overall performance of requests with the requests in the sample window to detect any performance degradation.
`shard_indexing_pressure.secondary_parameter.throughput.degradation_factor` | 5x | The degradation factor per unit byte for a request. This parameter determines the threshold for latency spikes. The default value of 5x implies that if latency for a shard increases five-fold relative to its historical average, shard indexing backpressure marks it as a performance degradation.
`shard_indexing_pressure.secondary_parameter.successful_request.elapsed_timeout` | 300000 ms | The amount of time a request is pending in a cluster. This parameter helps identify any stuck-request scenarios.
`shard_indexing_pressure.secondary_parameter.successful_request.max_outstanding_requests` | 100 | The maximum number of pending requests in a cluster.

_opensearch/stats-api.md Normal file

@@ -0,0 +1,493 @@
---
layout: default
title: Stats API
parent: Shard indexing backpressure
nav_order: 2
has_children: false
---
# Stats API
Use the stats operation to monitor shard indexing backpressure.
## Stats
Introduced 1.2
{: .label .label-purple }
Returns node-level and shard-level stats for indexing request rejections.
#### Request
```json
GET _nodes/_local/stats/shard_indexing_pressure
```
If `enforced` is `true`:
#### Sample response
```json
{
"_nodes": {
"total": 1,
"successful": 1,
"failed": 0
},
"cluster_name": "runTask",
"nodes": {
"q3e1dQjFSqyPSLAgpyQlfw": {
"timestamp": 1613072111162,
"name": "runTask-0",
"transport_address": "127.0.0.1:9300",
"host": "127.0.0.1",
"ip": "127.0.0.1:9300",
"roles": [
"data",
"ingest",
"master",
"remote_cluster_client"
],
"attributes": {
"testattr": "test"
},
"shard_indexing_pressure": {
"stats": {
"[index_name][0]": {
"memory": {
"current": {
"coordinating_in_bytes": 0,
"primary_in_bytes": 0,
"replica_in_bytes": 0
},
"total": {
"coordinating_in_bytes": 299,
"primary_in_bytes": 299,
"replica_in_bytes": 0
}
},
"rejection": {
"coordinating": {
"coordinating_rejections": 0,
"breakup": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
}
},
"primary": {
"primary_rejections": 0,
"breakup": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
}
},
"replica": {
"replica_rejections": 0,
"breakup": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
}
}
},
"last_successful_timestamp": {
"coordinating_last_successful_request_timestamp_in_millis": 1613072107990,
"primary_last_successful_request_timestamp_in_millis": 0,
"replica_last_successful_request_timestamp_in_millis": 0
},
"indexing": {
"coordinating_time_in_millis": 96,
"coordinating_count": 1,
"primary_time_in_millis": 0,
"primary_count": 0,
"replica_time_in_millis": 0,
"replica_count": 0
},
"memory_allocation": {
"current": {
"current_coordinating_and_primary_bytes": 0,
"current_replica_bytes": 0
},
"limit": {
"current_coordinating_and_primary_limits_in_bytes": 51897,
"current_replica_limits_in_bytes": 77845
}
}
}
},
"total_rejections_breakup": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
},
"enabled": true,
"enforced" : true
}
}
}
}
```
If `enforced` is `false`:
#### Sample response
```json
{
"_nodes": {
"total": 1,
"successful": 1,
"failed": 0
},
"cluster_name": "runTask",
"nodes": {
"q3e1dQjFSqyPSLAgpyQlfw": {
"timestamp": 1613072111162,
"name": "runTask-0",
"transport_address": "127.0.0.1:9300",
"host": "127.0.0.1",
"ip": "127.0.0.1:9300",
"roles": [
"data",
"ingest",
"master",
"remote_cluster_client"
],
"attributes": {
"testattr": "test"
},
"shard_indexing_pressure": {
"stats": {
"[index_name][0]": {
"memory": {
"current": {
"coordinating_in_bytes": 0,
"primary_in_bytes": 0,
"replica_in_bytes": 0
},
"total": {
"coordinating_in_bytes": 299,
"primary_in_bytes": 299,
"replica_in_bytes": 0
}
},
"rejection": {
"coordinating": {
"coordinating_rejections": 0,
"breakup_shadow_mode": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
}
},
"primary": {
"primary_rejections": 0,
"breakup_shadow_mode": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
}
},
"replica": {
"replica_rejections": 0,
"breakup_shadow_mode": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
}
}
},
"last_successful_timestamp": {
"coordinating_last_successful_request_timestamp_in_millis": 1613072107990,
"primary_last_successful_request_timestamp_in_millis": 0,
"replica_last_successful_request_timestamp_in_millis": 0
},
"indexing": {
"coordinating_time_in_millis": 96,
"coordinating_count": 1,
"primary_time_in_millis": 0,
"primary_count": 0,
"replica_time_in_millis": 0,
"replica_count": 0
},
"memory_allocation": {
"current": {
"current_coordinating_and_primary_bytes": 0,
"current_replica_bytes": 0
},
"limit": {
"current_coordinating_and_primary_limits_in_bytes": 51897,
"current_replica_limits_in_bytes": 77845
}
}
}
},
"total_rejections_breakup_shadow_mode": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
},
"enabled": true,
"enforced" : false
}
}
}
}
```
To include all shards, both those with active write operations and those with previously completed ones, specify the `include_all` parameter:
#### Request
```json
GET _nodes/_local/stats/shard_indexing_pressure?include_all
```
#### Sample response
```json
{
"_nodes": {
"total": 1,
"successful": 1,
"failed": 0
},
"cluster_name": "runTask",
"nodes": {
"q3e1dQjFSqyPSLAgpyQlfw": {
"timestamp": 1613072198171,
"name": "runTask-0",
"transport_address": "127.0.0.1:9300",
"host": "127.0.0.1",
"ip": "127.0.0.1:9300",
"roles": [
"data",
"ingest",
"master",
"remote_cluster_client"
],
"attributes": {
"testattr": "test"
},
"shard_indexing_pressure": {
"stats": {
"[index_name][0]": {
"memory": {
"current": {
"coordinating_in_bytes": 0,
"primary_in_bytes": 0,
"replica_in_bytes": 0
},
"total": {
"coordinating_in_bytes": 604,
"primary_in_bytes": 604,
"replica_in_bytes": 0
}
},
"rejection": {
"coordinating": {
"coordinating_rejections": 0,
"breakup": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
}
},
"primary": {
"primary_rejections": 0,
"breakup": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
}
},
"replica": {
"replica_rejections": 0,
"breakup": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
}
}
},
"last_successful_timestamp": {
"coordinating_last_successful_request_timestamp_in_millis": 1613072194656,
"primary_last_successful_request_timestamp_in_millis": 0,
"replica_last_successful_request_timestamp_in_millis": 0
},
"indexing": {
"coordinating_time_in_millis": 145,
"coordinating_count": 2,
"primary_time_in_millis": 0,
"primary_count": 0,
"replica_time_in_millis": 0,
"replica_count": 0
},
"memory_allocation": {
"current": {
"current_coordinating_and_primary_bytes": 0,
"current_replica_bytes": 0
},
"limit": {
"current_coordinating_and_primary_limits_in_bytes": 51897,
"current_replica_limits_in_bytes": 77845
}
}
}
},
"total_rejections_breakup": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
},
"enabled": true,
"enforced": true
}
}
}
}
```
To get only the top-level aggregated stats and skip the per-shard stats, specify the `top` parameter:
#### Request
```json
GET _nodes/_local/stats/shard_indexing_pressure?top
```
If `enforced` is `true`:
#### Sample response
```json
{
"_nodes": {
"total": 1,
"successful": 1,
"failed": 0
},
"cluster_name": "runTask",
"nodes": {
"q3e1dQjFSqyPSLAgpyQlfw": {
"timestamp": 1613072382719,
"name": "runTask-0",
"transport_address": "127.0.0.1:9300",
"host": "127.0.0.1",
"ip": "127.0.0.1:9300",
"roles": [
"data",
"ingest",
"master",
"remote_cluster_client"
],
"attributes": {
"testattr": "test"
},
"shard_indexing_pressure": {
"stats": {},
"total_rejections_breakup": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
},
"enabled": true,
"enforced": true
}
}
}
}
```
If `enforced` is `false`:
#### Sample response
```json
{
"_nodes": {
"total": 1,
"successful": 1,
"failed": 0
},
"cluster_name": "runTask",
"nodes": {
"q3e1dQjFSqyPSLAgpyQlfw": {
"timestamp": 1613072382719,
"name": "runTask-0",
"transport_address": "127.0.0.1:9300",
"host": "127.0.0.1",
"ip": "127.0.0.1:9300",
"roles": [
"data",
"ingest",
"master",
"remote_cluster_client"
],
"attributes": {
"testattr": "test"
},
"shard_indexing_pressure": {
"stats": {},
"total_rejections_breakup_shadow_mode": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
},
"enabled": true,
"enforced" : false
}
}
}
}
```
To get the shard-level breakup of rejections for every node (only includes shards with active write operations):
#### Request
```json
GET _nodes/stats/shard_indexing_pressure
```
#### Sample response
```json
{
"_nodes": {
"total": 1,
"successful": 1,
"failed": 0
},
"cluster_name": "runTask",
"nodes": {
"q3e1dQjFSqyPSLAgpyQlfw": {
"timestamp": 1613072382719,
"name": "runTask-0",
"transport_address": "127.0.0.1:9300",
"host": "127.0.0.1",
"ip": "127.0.0.1:9300",
"roles": [
"data",
"ingest",
"master",
"remote_cluster_client"
],
"attributes": {
"testattr": "test"
},
"shard_indexing_pressure": {
"stats": {},
"total_rejections_breakup": {
"node_limits": 0,
"no_successful_request_limits": 0,
"throughput_degradation_limits": 0
},
"enabled": true,
"enforced": true
}
}
}
}
```