[[modules-cluster]]
== Cluster

[float]
[[shards-allocation]]
=== Shards Allocation

Shards allocation is the process of allocating shards to nodes. This can
happen during initial recovery, replica allocation, or rebalancing, or
when nodes are added to or removed from the cluster.

The following settings may be used:

`cluster.routing.allocation.allow_rebalance`::
Controls when rebalancing will happen, based on the total state of
all the shards in the cluster. Allowed values are `always`,
`indices_primaries_active`, and `indices_all_active`, defaulting to
`indices_all_active` to reduce chatter during initial recovery.

`cluster.routing.allocation.cluster_concurrent_rebalance`::
Controls how many concurrent shard rebalances are allowed cluster
wide. Defaults to `2`.

`cluster.routing.allocation.node_initial_primaries_recoveries`::
Controls the number of initial primary shard recoveries allowed per
node. Since these recoveries usually use the local gateway, they are
fast, so more of them can run on a single node without creating load.

`cluster.routing.allocation.node_concurrent_recoveries`::
How many concurrent recoveries are allowed to happen on a node.
Defaults to `2`.

`cluster.routing.allocation.enable`::
Controls shard allocation for all indices, by allowing specific
kinds of shards to be allocated. Can be set to:

* `all` (default) - Allows shard allocation for all kinds of shards.
* `primaries` - Allows shard allocation only for primary shards.
* `new_primaries` - Allows shard allocation only for primary shards for new indices.
* `none` - No shard allocations of any kind are allowed for any index.
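
For example, a common step before cluster maintenance is to disable all
shard allocation on a live cluster. The following is a minimal sketch
using the cluster update settings API; the `localhost:9200` endpoint is
an assumption:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/_cluster/settings -d '{
    "transient" : {
        "cluster.routing.allocation.enable" : "none"
    }
}'
--------------------------------------------------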

`cluster.routing.rebalance.enable`::
Controls shard rebalance for all indices, by allowing specific
kinds of shards to be rebalanced. Can be set to:

* `all` (default) - Allows shard balancing for all kinds of shards.
* `primaries` - Allows shard balancing only for primary shards.
* `replicas` - Allows shard balancing only for replica shards.
* `none` - No shard balancing of any kind is allowed for any index.
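
Similarly, rebalancing can be suspended at runtime, for example while
nodes are being restarted. A sketch, again assuming the default
`localhost:9200` endpoint:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/_cluster/settings -d '{
    "transient" : {
        "cluster.routing.rebalance.enable" : "none"
    }
}'
--------------------------------------------------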

`cluster.routing.allocation.same_shard.host`::
Performs a check to prevent allocation of multiple instances of the
same shard on a single host, based on host name and host address.
Defaults to `false`, meaning that no check is performed by default. This
setting only applies if multiple nodes are started on the same machine.

`indices.recovery.concurrent_streams`::
The number of streams to open (on a *node* level) to recover a
shard from a peer shard. Defaults to `3`.
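
These settings can also be set statically in `elasticsearch.yml`. A
minimal sketch with illustrative values (the numbers are assumptions,
not recommendations):

--------------------------------------------------------
cluster.routing.allocation.node_concurrent_recoveries: 4
indices.recovery.concurrent_streams: 5
--------------------------------------------------------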

[float]
[[allocation-awareness]]
=== Shard Allocation Awareness

Cluster allocation awareness allows you to configure shard and replica
allocation across generic attributes associated with nodes. Let's
explain it through an example:

Assume we have several racks. When we start a node, we can configure an
attribute called `rack_id` (any attribute name works). For example, here
is a sample config:

----------------------
node.rack_id: rack_one
----------------------

The above sets an attribute called `rack_id` for the relevant node with
a value of `rack_one`. Now, we need to configure the `rack_id` attribute
as one of the awareness allocation attributes (set it in the config of
*all* master-eligible nodes):

--------------------------------------------------------
cluster.routing.allocation.awareness.attributes: rack_id
--------------------------------------------------------

The above means that the `rack_id` attribute will be used for
awareness-based allocation of shards and their replicas. For example,
let's say we start 2 nodes with `node.rack_id` set to `rack_one`, and
deploy a single index with 5 shards and 1 replica. The index will be
fully deployed on the current nodes (5 primary shards and 5 replicas,
10 shards in total).

Now, if we start two more nodes with `node.rack_id` set to `rack_two`,
shards will relocate to even out the number of shards across the nodes,
but a shard and its replica will never be allocated to nodes with the
same `rack_id` value.

Multiple awareness attributes can be specified, for example:

-------------------------------------------------------------
cluster.routing.allocation.awareness.attributes: rack_id,zone
-------------------------------------------------------------
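
A node in such a cluster would then set a value for each attribute. A
minimal sketch of the node config (the `zone1` value is purely
illustrative):

--------------------------------------------------
node.rack_id: rack_one
node.zone: zone1
--------------------------------------------------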

*NOTE*: When using awareness attributes, shards will not be allocated to
nodes that don't have values set for those attributes.

[float]
[[forced-awareness]]
=== Forced Awareness

Sometimes we know in advance the set of values an awareness attribute
can have, and moreover, we never want more replicas than needed to be
allocated on a group of nodes sharing the same awareness attribute
value. For that, we can force awareness on specific attributes.

For example, let's say we have an awareness attribute called `zone`, and
we know we are going to have two zones, `zone1` and `zone2`. Here is how
we can force awareness on a node:

[source,js]
-------------------------------------------------------------------
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2
cluster.routing.allocation.awareness.attributes: zone
-------------------------------------------------------------------

Now, let's say we start 2 nodes with `node.zone` set to `zone1` and
create an index with 5 shards and 1 replica. The index will be created,
but only the 5 primary shards will be allocated (with no replicas). Only
when we start more nodes with `node.zone` set to `zone2` will the
replicas be allocated.
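
Until then, the cluster reports the replica shards as unassigned, which
can be observed, for instance, with the cluster health API. A sketch,
assuming the default endpoint:

[source,js]
--------------------------------------------------
curl -XGET localhost:9200/_cluster/health?pretty
--------------------------------------------------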

[float]
==== Automatic Preference When Searching / GETing

When executing a search, or doing a get, the node receiving the request
will prefer to execute the request on shards that exist on nodes that
have the same attribute values as the executing node.

[float]
==== Realtime Settings Update

The settings can be updated using the <<cluster-update-settings,cluster update settings API>> on a live cluster.
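
For example, a sketch of changing the awareness attributes on a running
cluster through that API (the endpoint is an assumption):

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/_cluster/settings -d '{
    "persistent" : {
        "cluster.routing.allocation.awareness.attributes" : "rack_id"
    }
}'
--------------------------------------------------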

[float]
[[allocation-filtering]]
=== Shard Allocation Filtering

Allocation of indices on nodes can be controlled with include/exclude
filters. The filters can be set both on the index level and on the
cluster level. Let's start with an example of setting it on the index
level:

Let's say we have 4 nodes, each with a specific attribute called `tag`
associated with it (the attribute can have any name). Each node has a
specific value for `tag`. Node 1 has the setting `node.tag: value1`,
Node 2 the setting `node.tag: value2`, and so on.

We can create an index that will only deploy on nodes that have `tag`
set to `value1` or `value2` by setting
`index.routing.allocation.include.tag` to `value1,value2`. For example:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/test/_settings -d '{
    "index.routing.allocation.include.tag" : "value1,value2"
}'
--------------------------------------------------
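
To verify where the shards of the index actually landed after applying
such a filter, the cat shards API can be used; a sketch, assuming the
default endpoint:

[source,js]
--------------------------------------------------
curl -XGET localhost:9200/_cat/shards/test?v
--------------------------------------------------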

On the other hand, we can create an index that will be deployed on all
nodes except for nodes with a `tag` of value `value3` by setting
`index.routing.allocation.exclude.tag` to `value3`. For example:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/test/_settings -d '{
    "index.routing.allocation.exclude.tag" : "value3"
}'
--------------------------------------------------

`index.routing.allocation.require.*` can be used to specify a number of
rules, all of which MUST match in order for a shard to be allocated to a
node. This is in contrast to `include`, which will include a node if ANY
rule matches.

The `include`, `exclude` and `require` values can contain simple
wildcard patterns, for example `value1*`. The special attribute name
`_ip` can be used to match on node IP addresses. In addition, the
`_host` attribute can be used to match on either the node's hostname or
its IP address. Similarly, the `_name` and `_id` attributes can be used
to match on node name and node id respectively.
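
For instance, a sketch of keeping an index off any node whose name
matches a pattern (the node name is purely illustrative):

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/test/_settings -d '{
    "index.routing.allocation.exclude._name" : "node_2*"
}'
--------------------------------------------------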

A node can have several attributes associated with it, and both the
attribute name and value are controlled in the setting. For example,
here is a sample of several node configurations:

[source,js]
--------------------------------------------------
node.group1: group1_value1
node.group2: group2_value4
--------------------------------------------------

In the same manner, `include`, `exclude` and `require` can work against
several attributes, for example:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/test/_settings -d '{
    "index.routing.allocation.include.group1" : "xxx",
    "index.routing.allocation.include.group2" : "yyy",
    "index.routing.allocation.exclude.group3" : "zzz",
    "index.routing.allocation.require.group4" : "aaa"
}'
--------------------------------------------------

The provided settings can also be updated in real time using the update
settings API, allowing indices (shards) to be "moved" around in real
time.

Cluster wide filtering can also be defined, and updated in real time,
using the cluster update settings API. This setting can come in handy
for things like decommissioning nodes (even if the replica count is set
to 0). Here is a sample of how to decommission a node based on its `_ip`
address:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/_cluster/settings -d '{
    "transient" : {
        "cluster.routing.allocation.exclude._ip" : "10.0.0.1"
    }
}'
--------------------------------------------------
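
Once the node has been drained and removed, the filter can be cleared so
the address is no longer excluded. A sketch, assuming that setting an
empty value resets the filter:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/_cluster/settings -d '{
    "transient" : {
        "cluster.routing.allocation.exclude._ip" : ""
    }
}'
--------------------------------------------------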