OpenSearch/docs/reference/setup/add-nodes.asciidoc

[[add-elasticsearch-nodes]]
== Add and remove nodes in your cluster

When you start an instance of {es}, you are starting a _node_. An {es} _cluster_
is a group of nodes that have the same `cluster.name` attribute. As nodes join
or leave a cluster, the cluster automatically reorganizes itself to evenly
distribute the data across the available nodes.

If you are running a single instance of {es}, you have a cluster of one node.
All primary shards reside on the single node. No replica shards can be
allocated, therefore the cluster state remains yellow. The cluster is fully
functional but is at risk of data loss in the event of a failure.

image::setup/images/elas_0202.png["A cluster with one node and three primary shards"]

You add nodes to a cluster to increase its capacity and reliability. By default,
a node is both a data node and eligible to be elected as the master node that
controls the cluster. You can also configure a new node for a specific purpose,
such as handling ingest requests. For more information, see
<<modules-node,Nodes>>.

When you add more nodes to a cluster, it automatically allocates replica shards.
When all primary and replica shards are active, the cluster state changes to
green.

image::setup/images/elas_0204.png["A cluster with three nodes"]

You can run multiple nodes on your local machine in order to experiment with how
an {es} cluster of multiple nodes behaves. To add a node to a cluster running on
your local machine:

. Set up a new {es} instance.
. Specify the name of the cluster with the `cluster.name` setting in
`elasticsearch.yml`. For example, to add a node to the `logging-prod` cluster,
add the line `cluster.name: "logging-prod"` to `elasticsearch.yml`.
. Start {es}. The node automatically discovers and joins the specified cluster.

To add a node to a cluster running on multiple machines, you must also
<<unicast.hosts,set `discovery.seed_hosts`>> so that the new node can discover
the rest of its cluster.

For more information about discovery and shard allocation, see
<<modules-discovery>> and <<modules-cluster>>.

[discrete]
[[add-elasticsearch-nodes-master-eligible]]
=== Master-eligible nodes

As nodes are added or removed Elasticsearch maintains an optimal level of fault
tolerance by automatically updating the cluster's _voting configuration_, which
is the set of <<master-node,master-eligible nodes>> whose responses are counted
when making decisions such as electing a new master or committing a new cluster
state.

It is recommended to have a small and fixed number of master-eligible nodes in a
cluster, and to scale the cluster up and down by adding and removing
master-ineligible nodes only. However there are situations in which it may be
desirable to add or remove some master-eligible nodes to or from a cluster.

[discrete]
[[modules-discovery-adding-nodes]]
==== Adding master-eligible nodes

If you wish to add some nodes to your cluster, simply configure the new nodes
to find the existing cluster and start them up. Elasticsearch adds the new nodes
to the voting configuration if it is appropriate to do so.

During master election or when joining an existing formed cluster, a node
sends a join request to the master in order to be officially added to the
cluster.

[discrete]
[[modules-discovery-removing-nodes]]
==== Removing master-eligible nodes

When removing master-eligible nodes, it is important not to remove too many all
at the same time. For instance, if there are currently seven master-eligible
nodes and you wish to reduce this to three, it is not possible simply to stop
four of the nodes at once: to do so would leave only three nodes remaining,
which is less than half of the voting configuration, which means the cluster
cannot take any further actions.

More precisely, if you shut down half or more of the master-eligible nodes all
at the same time then the cluster will normally become unavailable. If this
happens then you can bring the cluster back online by starting the removed
nodes again.

As long as there are at least three master-eligible nodes in the cluster, as a
general rule it is best to remove nodes one-at-a-time, allowing enough time for
the cluster to <<modules-discovery-quorums,automatically adjust>> the voting
configuration and adapt the fault tolerance level to the new set of nodes.

If there are only two master-eligible nodes remaining then neither node can be
safely removed since both are required to reliably make progress. To remove one
of these nodes you must first inform {es} that it should not be part of the
voting configuration, and that the voting power should instead be given to the
other node. You can then take the excluded node offline without preventing the
other node from making progress. A node which is added to a voting
configuration exclusion list still works normally, but {es} tries to remove it
from the voting configuration so its vote is no longer required. Importantly,
{es} will never automatically move a node on the voting exclusions list back
into the voting configuration. Once an excluded node has been successfully
auto-reconfigured out of the voting configuration, it is safe to shut it down
without affecting the cluster's master-level availability. A node can be added
to the voting configuration exclusion list using the
<<voting-config-exclusions>> API. For example:

[source,console]
--------------------------------------------------
# Add node to voting configuration exclusions list and wait for the system
# to auto-reconfigure the node out of the voting configuration up to the
# default timeout of 30 seconds
POST /_cluster/voting_config_exclusions?node_names=node_name

# Add node to voting configuration exclusions list and wait for
# auto-reconfiguration up to one minute
POST /_cluster/voting_config_exclusions?node_names=node_name&timeout=1m
--------------------------------------------------
// TEST[skip:this would break the test cluster if executed]

The nodes that should be added to the exclusions list are specified by name
using the `?node_names` query parameter, or by their persistent node IDs using
the `?node_ids` query parameter. If a call to the voting configuration
exclusions API fails, you can safely retry it. Only a successful response
guarantees that the node has actually been removed from the voting configuration
and will not be reinstated. If the elected master node is excluded from the
voting configuration then it will abdicate to another master-eligible node that
is still in the voting configuration if such a node is available.

Although the voting configuration exclusions API is most useful for down-scaling
a two-node to a one-node cluster, it is also possible to use it to remove
multiple master-eligible nodes all at the same time. Adding multiple nodes to
the exclusions list has the system try to auto-reconfigure all of these nodes
out of the voting configuration, allowing them to be safely shut down while
keeping the cluster available. In the example described above, shrinking a
seven-master-node cluster down to only have three master nodes, you could add
four nodes to the exclusions list, wait for confirmation, and then shut them
down simultaneously.

NOTE: Voting exclusions are only required when removing at least half of the
master-eligible nodes from a cluster in a short time period. They are not
required when removing master-ineligible nodes, nor are they required when
removing fewer than half of the master-eligible nodes.

Adding an exclusion for a node creates an entry for that node in the voting
configuration exclusions list, which has the system automatically try to
reconfigure the voting configuration to remove that node and prevents it from
returning to the voting configuration once it has removed. The current list of
exclusions is stored in the cluster state and can be inspected as follows:

[source,console]
--------------------------------------------------
GET /_cluster/state?filter_path=metadata.cluster_coordination.voting_config_exclusions
--------------------------------------------------

This list is limited in size by the `cluster.max_voting_config_exclusions` 
setting, which defaults to `10`. See <<modules-discovery-settings>>. Since
voting configuration exclusions are persistent and limited in number, they must
be cleaned up. Normally an exclusion is added when performing some maintenance
on the cluster, and the exclusions should be cleaned up when the maintenance is
complete. Clusters should have no voting configuration exclusions in normal
operation.

If a node is excluded from the voting configuration because it is to be shut
down permanently, its exclusion can be removed after it is shut down and removed
from the cluster. Exclusions can also be cleared if they were created in error
or were only required temporarily by specifying `?wait_for_removal=false`.

[source,console]
--------------------------------------------------
# Wait for all the nodes with voting configuration exclusions to be removed from
# the cluster and then remove all the exclusions, allowing any node to return to
# the voting configuration in the future.
DELETE /_cluster/voting_config_exclusions

# Immediately remove all the voting configuration exclusions, allowing any node
# to return to the voting configuration in the future.
DELETE /_cluster/voting_config_exclusions?wait_for_removal=false
--------------------------------------------------
[DOCS] Intro for adding nodes (#37202) 2019-01-15 14:11:01 -05:00			`[[add-elasticsearch-nodes]]`
[DOCS] Relocate discovery module content (#56611) (#57454) * Moves `Discovery and cluster formation` content from `Modules` to `Set up Elasticsearch`. * Combines `Adding and removing nodes` with `Adding nodes to your cluster`. Adds related redirect. * Removes and redirects the `Modules` page. * Rewrites parts of `Discovery and cluster formation` to remove `module` references and meta references to the section. 2020-06-01 14:13:13 -04:00			`== Add and remove nodes in your cluster`
[DOCS] Intro for adding nodes (#37202) 2019-01-15 14:11:01 -05:00
			`When you start an instance of {es}, you are starting a _node_. An {es} _cluster_`
			is a group of nodes that have the same `cluster.name` attribute. As nodes join
			`or leave a cluster, the cluster automatically reorganizes itself to evenly`
			`distribute the data across the available nodes.`

			`If you are running a single instance of {es}, you have a cluster of one node.`
			`All primary shards reside on the single node. No replica shards can be`
			`allocated, therefore the cluster state remains yellow. The cluster is fully`
			`functional but is at risk of data loss in the event of a failure.`

			`image::setup/images/elas_0202.png["A cluster with one node and three primary shards"]`

			`You add nodes to a cluster to increase its capacity and reliability. By default,`
			`a node is both a data node and eligible to be elected as the master node that`
			`controls the cluster. You can also configure a new node for a specific purpose,`
			`such as handling ingest requests. For more information, see`
			`<<modules-node,Nodes>>.`

			`When you add more nodes to a cluster, it automatically allocates replica shards.`
			`When all primary and replica shards are active, the cluster state changes to`
			`green.`

			`image::setup/images/elas_0204.png["A cluster with three nodes"]`

"Adding nodes" instructions only work on localhost (#52677) The introductory sections of the reference manual contains some simplified instructions for adding a node to the cluster. Unfortunately they are a little too simplified and only really work for clusters running on `localhost`. If you try and follow these instructions for a distributed cluster then the new node will, confusingly, auto-bootstrap itself into a distinct one-node cluster. Multiple nodes running on localhost is a valid config, of course, but we should spell out that these instructions are really only for experimentation and that it takes a bit more work to add nodes to a distributed cluster. This commit does so. Also, the "important config" instructions for discovery say that you MUST set `discovery.seed_hosts` whereas in fact it is fine to ignore this setting and use a dynamic discovery mechanism instead. This commit weakens this statement and links to the docs for dynamic discovery mechanisms. Finally, this section is also overloaded with some technical details that are not important for this context and are adequately covered elsewhere, and completely fails to note that the default discovery port is 9300. This commit addresses this. 2020-02-27 03:51:17 -05:00			`You can run multiple nodes on your local machine in order to experiment with how`
			`an {es} cluster of multiple nodes behaves. To add a node to a cluster running on`
			`your local machine:`
[DOCS] Intro for adding nodes (#37202) 2019-01-15 14:11:01 -05:00
			`. Set up a new {es} instance.`
"Adding nodes" instructions only work on localhost (#52677) The introductory sections of the reference manual contains some simplified instructions for adding a node to the cluster. Unfortunately they are a little too simplified and only really work for clusters running on `localhost`. If you try and follow these instructions for a distributed cluster then the new node will, confusingly, auto-bootstrap itself into a distinct one-node cluster. Multiple nodes running on localhost is a valid config, of course, but we should spell out that these instructions are really only for experimentation and that it takes a bit more work to add nodes to a distributed cluster. This commit does so. Also, the "important config" instructions for discovery say that you MUST set `discovery.seed_hosts` whereas in fact it is fine to ignore this setting and use a dynamic discovery mechanism instead. This commit weakens this statement and links to the docs for dynamic discovery mechanisms. Finally, this section is also overloaded with some technical details that are not important for this context and are adequately covered elsewhere, and completely fails to note that the default discovery port is 9300. This commit addresses this. 2020-02-27 03:51:17 -05:00			. Specify the name of the cluster with the `cluster.name` setting in
			`elasticsearch.yml`. For example, to add a node to the `logging-prod` cluster,
			add the line `cluster.name: "logging-prod"` to `elasticsearch.yml`.
[DOCS] Intro for adding nodes (#37202) 2019-01-15 14:11:01 -05:00			`. Start {es}. The node automatically discovers and joins the specified cluster.`

"Adding nodes" instructions only work on localhost (#52677) The introductory sections of the reference manual contains some simplified instructions for adding a node to the cluster. Unfortunately they are a little too simplified and only really work for clusters running on `localhost`. If you try and follow these instructions for a distributed cluster then the new node will, confusingly, auto-bootstrap itself into a distinct one-node cluster. Multiple nodes running on localhost is a valid config, of course, but we should spell out that these instructions are really only for experimentation and that it takes a bit more work to add nodes to a distributed cluster. This commit does so. Also, the "important config" instructions for discovery say that you MUST set `discovery.seed_hosts` whereas in fact it is fine to ignore this setting and use a dynamic discovery mechanism instead. This commit weakens this statement and links to the docs for dynamic discovery mechanisms. Finally, this section is also overloaded with some technical details that are not important for this context and are adequately covered elsewhere, and completely fails to note that the default discovery port is 9300. This commit addresses this. 2020-02-27 03:51:17 -05:00			`To add a node to a cluster running on multiple machines, you must also`
			<<unicast.hosts,set `discovery.seed_hosts`>> so that the new node can discover
			`the rest of its cluster.`

[DOCS] Intro for adding nodes (#37202) 2019-01-15 14:11:01 -05:00			`For more information about discovery and shard allocation, see`
			`<<modules-discovery>> and <<modules-cluster>>.`
[DOCS] Relocate discovery module content (#56611) (#57454) * Moves `Discovery and cluster formation` content from `Modules` to `Set up Elasticsearch`. * Combines `Adding and removing nodes` with `Adding nodes to your cluster`. Adds related redirect. * Removes and redirects the `Modules` page. * Rewrites parts of `Discovery and cluster formation` to remove `module` references and meta references to the section. 2020-06-01 14:13:13 -04:00
			`[discrete]`
			`[[add-elasticsearch-nodes-master-eligible]]`
			`=== Master-eligible nodes`

			`As nodes are added or removed Elasticsearch maintains an optimal level of fault`
			`tolerance by automatically updating the cluster's _voting configuration_, which`
			`is the set of <<master-node,master-eligible nodes>> whose responses are counted`
			`when making decisions such as electing a new master or committing a new cluster`
			`state.`

			`It is recommended to have a small and fixed number of master-eligible nodes in a`
			`cluster, and to scale the cluster up and down by adding and removing`
			`master-ineligible nodes only. However there are situations in which it may be`
			`desirable to add or remove some master-eligible nodes to or from a cluster.`

			`[discrete]`
			`[[modules-discovery-adding-nodes]]`
			`==== Adding master-eligible nodes`

			`If you wish to add some nodes to your cluster, simply configure the new nodes`
			`to find the existing cluster and start them up. Elasticsearch adds the new nodes`
			`to the voting configuration if it is appropriate to do so.`

			`During master election or when joining an existing formed cluster, a node`
			`sends a join request to the master in order to be officially added to the`
Deprecate and ignore join timeout (#60872) There is no point in timing out a join attempt any more once a cluster is entirely in 7.x. Timing out and retrying with the same master is pointless, and an in-flight join attempt to one master no longer blocks attempts to join other masters. This commit deprecates this unnecessary setting and removes its effect from the joining process. Relates #60873 which removes this setting in master. 2020-08-10 08:57:41 -04:00			`cluster.`
[DOCS] Relocate discovery module content (#56611) (#57454) * Moves `Discovery and cluster formation` content from `Modules` to `Set up Elasticsearch`. * Combines `Adding and removing nodes` with `Adding nodes to your cluster`. Adds related redirect. * Removes and redirects the `Modules` page. * Rewrites parts of `Discovery and cluster formation` to remove `module` references and meta references to the section. 2020-06-01 14:13:13 -04:00
			`[discrete]`
			`[[modules-discovery-removing-nodes]]`
			`==== Removing master-eligible nodes`

			`When removing master-eligible nodes, it is important not to remove too many all`
			`at the same time. For instance, if there are currently seven master-eligible`
			`nodes and you wish to reduce this to three, it is not possible simply to stop`
			`four of the nodes at once: to do so would leave only three nodes remaining,`
			`which is less than half of the voting configuration, which means the cluster`
			`cannot take any further actions.`

			`More precisely, if you shut down half or more of the master-eligible nodes all`
			`at the same time then the cluster will normally become unavailable. If this`
			`happens then you can bring the cluster back online by starting the removed`
			`nodes again.`

			`As long as there are at least three master-eligible nodes in the cluster, as a`
			`general rule it is best to remove nodes one-at-a-time, allowing enough time for`
			`the cluster to <<modules-discovery-quorums,automatically adjust>> the voting`
			`configuration and adapt the fault tolerance level to the new set of nodes.`

			`If there are only two master-eligible nodes remaining then neither node can be`
			`safely removed since both are required to reliably make progress. To remove one`
			`of these nodes you must first inform {es} that it should not be part of the`
			`voting configuration, and that the voting power should instead be given to the`
			`other node. You can then take the excluded node offline without preventing the`
			`other node from making progress. A node which is added to a voting`
			`configuration exclusion list still works normally, but {es} tries to remove it`
			`from the voting configuration so its vote is no longer required. Importantly,`
			`{es} will never automatically move a node on the voting exclusions list back`
			`into the voting configuration. Once an excluded node has been successfully`
			`auto-reconfigured out of the voting configuration, it is safe to shut it down`
			`without affecting the cluster's master-level availability. A node can be added`
			`to the voting configuration exclusion list using the`
			`<<voting-config-exclusions>> API. For example:`

			`[source,console]`
			`--------------------------------------------------`
			`# Add node to voting configuration exclusions list and wait for the system`
			`# to auto-reconfigure the node out of the voting configuration up to the`
			`# default timeout of 30 seconds`
			`POST /_cluster/voting_config_exclusions?node_names=node_name`

			`# Add node to voting configuration exclusions list and wait for`
			`# auto-reconfiguration up to one minute`
			`POST /_cluster/voting_config_exclusions?node_names=node_name&timeout=1m`
			`--------------------------------------------------`
			`// TEST[skip:this would break the test cluster if executed]`

			`The nodes that should be added to the exclusions list are specified by name`
			using the `?node_names` query parameter, or by their persistent node IDs using
			the `?node_ids` query parameter. If a call to the voting configuration
			`exclusions API fails, you can safely retry it. Only a successful response`
			`guarantees that the node has actually been removed from the voting configuration`
			`and will not be reinstated. If the elected master node is excluded from the`
			`voting configuration then it will abdicate to another master-eligible node that`
			`is still in the voting configuration if such a node is available.`

			`Although the voting configuration exclusions API is most useful for down-scaling`
			`a two-node to a one-node cluster, it is also possible to use it to remove`
			`multiple master-eligible nodes all at the same time. Adding multiple nodes to`
			`the exclusions list has the system try to auto-reconfigure all of these nodes`
			`out of the voting configuration, allowing them to be safely shut down while`
			`keeping the cluster available. In the example described above, shrinking a`
			`seven-master-node cluster down to only have three master nodes, you could add`
			`four nodes to the exclusions list, wait for confirmation, and then shut them`
			`down simultaneously.`

			`NOTE: Voting exclusions are only required when removing at least half of the`
			`master-eligible nodes from a cluster in a short time period. They are not`
			`required when removing master-ineligible nodes, nor are they required when`
			`removing fewer than half of the master-eligible nodes.`

			`Adding an exclusion for a node creates an entry for that node in the voting`
			`configuration exclusions list, which has the system automatically try to`
			`reconfigure the voting configuration to remove that node and prevents it from`
			`returning to the voting configuration once it has removed. The current list of`
			`exclusions is stored in the cluster state and can be inspected as follows:`

			`[source,console]`
			`--------------------------------------------------`
			`GET /_cluster/state?filter_path=metadata.cluster_coordination.voting_config_exclusions`
			`--------------------------------------------------`

			This list is limited in size by the `cluster.max_voting_config_exclusions`
			setting, which defaults to `10`. See <<modules-discovery-settings>>. Since
			`voting configuration exclusions are persistent and limited in number, they must`
			`be cleaned up. Normally an exclusion is added when performing some maintenance`
			`on the cluster, and the exclusions should be cleaned up when the maintenance is`
			`complete. Clusters should have no voting configuration exclusions in normal`
			`operation.`

			`If a node is excluded from the voting configuration because it is to be shut`
			`down permanently, its exclusion can be removed after it is shut down and removed`
			`from the cluster. Exclusions can also be cleared if they were created in error`
			or were only required temporarily by specifying `?wait_for_removal=false`.

			`[source,console]`
			`--------------------------------------------------`
			`# Wait for all the nodes with voting configuration exclusions to be removed from`
			`# the cluster and then remove all the exclusions, allowing any node to return to`
			`# the voting configuration in the future.`
			`DELETE /_cluster/voting_config_exclusions`

			`# Immediately remove all the voting configuration exclusions, allowing any node`
			`# to return to the voting configuration in the future.`
			`DELETE /_cluster/voting_config_exclusions?wait_for_removal=false`
			`--------------------------------------------------`