Added replica count enforcement (#856)

* Added replica count enforcement

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Incorporated review comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Incorporated editorial comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
kolchfa-aws 2022-08-11 16:06:34 -04:00 committed by GitHub
parent e7ef8181f2
commit 41c4a8433a
1 changed file with 26 additions and 0 deletions


@@ -176,6 +176,8 @@ To better understand and monitor your cluster, use the [cat API]({{site.url}}{{s
## (Advanced) Step 6: Configure shard allocation awareness or forced awareness
### Shard allocation awareness
If your nodes are spread across several geographical zones, you can configure shard allocation awareness to allocate all replica shards to a zone that's different from the zone of their primary shard.
With shard allocation awareness, if the nodes in one of your zones fail, you can be assured that your replica shards are spread across your other zones. This adds a layer of fault tolerance beyond individual node failures, ensuring that your data survives a zone outage.
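As a minimal sketch of the configuration (assuming each node is tagged with a custom `zone` attribute, for example `node.attr.zone: zoneA` in its `opensearch.yml`), awareness can be enabled dynamically with the cluster settings API:
```json
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "zone"
  }
}
```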
@@ -205,6 +207,8 @@ You can either use `persistent` or `transient` settings. We recommend the `persi
Shard allocation awareness attempts to separate primary and replica shards across multiple zones. However, if only one zone is available (such as after a zone failure), OpenSearch allocates replica shards to the only remaining zone.
### Forced awareness
Another option is to require that primary and replica shards are never allocated to the same zone. This is called forced awareness.
To configure forced awareness, specify all the possible values for your zone attributes:
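For example, assuming two zones named `zoneA` and `zoneB` (placeholder names), a sketch using the cluster settings API looks like this:
```json
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "zone",
    "cluster.routing.allocation.awareness.force.zone.values": "zoneA,zoneB"
  }
}
```
With this configuration, if only the `zoneA` nodes are available, replicas that would have been allocated to `zoneB` remain unassigned instead of overloading the surviving zone.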
@@ -226,6 +230,28 @@ If that is not the case, and `opensearch-d1` and `opensearch-d2` do not have the
Choosing allocation awareness or forced awareness depends on how much space you might need in each zone to balance your primary and replica shards.
### Replica count enforcement
To enforce an even distribution of shards across all zones and avoid hotspots, you can set the `routing.allocation.awareness.balance` setting to `true`. This setting can be configured in the `opensearch.yml` file and dynamically updated using the cluster update settings API:
```json
PUT _cluster/settings
{
"persistent": {
"cluster": {
"routing.allocation.awareness.balance": "true"
}
}
}
```
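To confirm that the setting has been applied, you can retrieve the current cluster settings (the `flat_settings` query parameter is optional and simply flattens the nested keys in the response):
```json
GET _cluster/settings?flat_settings=true
```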
The `routing.allocation.awareness.balance` setting is `false` by default. When it is set to `true`, the total number of shards for the index must be a multiple of the highest count for any awareness attribute. For example, consider a configuration with two awareness attributes: zones and rack IDs. Let's say there are two zones and three rack IDs. The highest count of either the number of zones or the number of rack IDs is three. Therefore, the number of shards must be a multiple of three. If it is not, OpenSearch throws a validation exception.
`routing.allocation.awareness.balance` takes effect only if `cluster.routing.allocation.awareness.attributes` and `cluster.routing.allocation.awareness.force.zone.values` are set.
{: .note}
`routing.allocation.awareness.balance` applies to all operations that create or update indices. For example, let's say you're running a cluster with three nodes and three zones in a zone-aware setting. If you try to create an index with one replica or update an index's settings to one replica, the attempt fails with a validation exception because the total number of shards must be a multiple of three. Similarly, if you try to create an index template with one shard and no replicas, the attempt fails for the same reason. However, in any of these operations, if you set the number of shards to one and the number of replicas to two, the total number of shards is three, so the attempt succeeds.
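Continuing the three-zone example, a request like the following (the index name `test-index` is just a placeholder) passes validation because one primary and two replicas add up to three shards:
```json
PUT /test-index
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 2
    }
  }
}
```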
## (Advanced) Step 7: Set up a hot-warm architecture
You can design a hot-warm architecture where you first index your data to hot nodes---fast and expensive---and after a certain period of time move it to warm nodes---slow and cheap.
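As a sketch of one way to set this up (assuming each node is tagged with a custom `temp` attribute set to `hot` or `warm` in its `opensearch.yml`), you can later pin an aging index to the warm nodes with an index settings update; `test-index` is again a placeholder name:
```json
PUT /test-index/_settings
{
  "index.routing.allocation.require.temp": "warm"
}
```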