180 lines
9.2 KiB
Plaintext
180 lines
9.2 KiB
Plaintext
[[add-elasticsearch-nodes]]
|
|
== Add and remove nodes in your cluster
|
|
|
|
When you start an instance of {es}, you are starting a _node_. An {es} _cluster_
|
|
is a group of nodes that have the same `cluster.name` attribute. As nodes join
|
|
or leave a cluster, the cluster automatically reorganizes itself to evenly
|
|
distribute the data across the available nodes.
|
|
|
|
If you are running a single instance of {es}, you have a cluster of one node.
|
|
All primary shards reside on the single node. No replica shards can be
|
|
allocated, therefore the cluster state remains yellow. The cluster is fully
|
|
functional but is at risk of data loss in the event of a failure.
|
|
|
|
image::setup/images/elas_0202.png["A cluster with one node and three primary shards"]
|
|
|
|
You add nodes to a cluster to increase its capacity and reliability. By default,
|
|
a node is both a data node and eligible to be elected as the master node that
|
|
controls the cluster. You can also configure a new node for a specific purpose,
|
|
such as handling ingest requests. For more information, see
|
|
<<modules-node,Nodes>>.
|
|
|
|
When you add more nodes to a cluster, it automatically allocates replica shards.
|
|
When all primary and replica shards are active, the cluster state changes to
|
|
green.
|
|
|
|
image::setup/images/elas_0204.png["A cluster with three nodes"]
|
|
|
|
You can run multiple nodes on your local machine in order to experiment with how
|
|
an {es} cluster of multiple nodes behaves. To add a node to a cluster running on
|
|
your local machine:
|
|
|
|
. Set up a new {es} instance.
|
|
. Specify the name of the cluster with the `cluster.name` setting in
|
|
`elasticsearch.yml`. For example, to add a node to the `logging-prod` cluster,
|
|
add the line `cluster.name: "logging-prod"` to `elasticsearch.yml`.
|
|
. Start {es}. The node automatically discovers and joins the specified cluster.
|
|
|
|
To add a node to a cluster running on multiple machines, you must also
|
|
<<unicast.hosts,set `discovery.seed_hosts`>> so that the new node can discover
|
|
the rest of its cluster.
|
|
|
|
For more information about discovery and shard allocation, see
|
|
<<modules-discovery>> and <<modules-cluster>>.
|
|
|
|
[discrete]
|
|
[[add-elasticsearch-nodes-master-eligible]]
|
|
=== Master-eligible nodes
|
|
|
|
As nodes are added or removed Elasticsearch maintains an optimal level of fault
|
|
tolerance by automatically updating the cluster's _voting configuration_, which
|
|
is the set of <<master-node,master-eligible nodes>> whose responses are counted
|
|
when making decisions such as electing a new master or committing a new cluster
|
|
state.
|
|
|
|
It is recommended to have a small and fixed number of master-eligible nodes in a
|
|
cluster, and to scale the cluster up and down by adding and removing
|
|
master-ineligible nodes only. However there are situations in which it may be
|
|
desirable to add or remove some master-eligible nodes to or from a cluster.
|
|
|
|
[discrete]
|
|
[[modules-discovery-adding-nodes]]
|
|
==== Adding master-eligible nodes
|
|
|
|
If you wish to add some nodes to your cluster, simply configure the new nodes
|
|
to find the existing cluster and start them up. Elasticsearch adds the new nodes
|
|
to the voting configuration if it is appropriate to do so.
|
|
|
|
During master election or when joining an existing formed cluster, a node
|
|
sends a join request to the master in order to be officially added to the
|
|
cluster.
|
|
|
|
[discrete]
|
|
[[modules-discovery-removing-nodes]]
|
|
==== Removing master-eligible nodes
|
|
|
|
When removing master-eligible nodes, it is important not to remove too many all
|
|
at the same time. For instance, if there are currently seven master-eligible
|
|
nodes and you wish to reduce this to three, it is not possible simply to stop
|
|
four of the nodes at once: to do so would leave only three nodes remaining,
|
|
which is less than half of the voting configuration, which means the cluster
|
|
cannot take any further actions.
|
|
|
|
More precisely, if you shut down half or more of the master-eligible nodes all
|
|
at the same time then the cluster will normally become unavailable. If this
|
|
happens then you can bring the cluster back online by starting the removed
|
|
nodes again.
|
|
|
|
As long as there are at least three master-eligible nodes in the cluster, as a
|
|
general rule it is best to remove nodes one-at-a-time, allowing enough time for
|
|
the cluster to <<modules-discovery-quorums,automatically adjust>> the voting
|
|
configuration and adapt the fault tolerance level to the new set of nodes.
|
|
|
|
If there are only two master-eligible nodes remaining then neither node can be
|
|
safely removed since both are required to reliably make progress. To remove one
|
|
of these nodes you must first inform {es} that it should not be part of the
|
|
voting configuration, and that the voting power should instead be given to the
|
|
other node. You can then take the excluded node offline without preventing the
|
|
other node from making progress. A node which is added to a voting
|
|
configuration exclusion list still works normally, but {es} tries to remove it
|
|
from the voting configuration so its vote is no longer required. Importantly,
|
|
{es} will never automatically move a node on the voting exclusions list back
|
|
into the voting configuration. Once an excluded node has been successfully
|
|
auto-reconfigured out of the voting configuration, it is safe to shut it down
|
|
without affecting the cluster's master-level availability. A node can be added
|
|
to the voting configuration exclusion list using the
|
|
<<voting-config-exclusions>> API. For example:
|
|
|
|
[source,console]
|
|
--------------------------------------------------
|
|
# Add node to voting configuration exclusions list and wait for the system
|
|
# to auto-reconfigure the node out of the voting configuration up to the
|
|
# default timeout of 30 seconds
|
|
POST /_cluster/voting_config_exclusions?node_names=node_name
|
|
|
|
# Add node to voting configuration exclusions list and wait for
|
|
# auto-reconfiguration up to one minute
|
|
POST /_cluster/voting_config_exclusions?node_names=node_name&timeout=1m
|
|
--------------------------------------------------
|
|
// TEST[skip:this would break the test cluster if executed]
|
|
|
|
The nodes that should be added to the exclusions list are specified by name
|
|
using the `?node_names` query parameter, or by their persistent node IDs using
|
|
the `?node_ids` query parameter. If a call to the voting configuration
|
|
exclusions API fails, you can safely retry it. Only a successful response
|
|
guarantees that the node has actually been removed from the voting configuration
|
|
and will not be reinstated. If the elected master node is excluded from the
|
|
voting configuration then it will abdicate to another master-eligible node that
|
|
is still in the voting configuration if such a node is available.
|
|
|
|
Although the voting configuration exclusions API is most useful for down-scaling
|
|
a two-node to a one-node cluster, it is also possible to use it to remove
|
|
multiple master-eligible nodes all at the same time. Adding multiple nodes to
|
|
the exclusions list has the system try to auto-reconfigure all of these nodes
|
|
out of the voting configuration, allowing them to be safely shut down while
|
|
keeping the cluster available. In the example described above, shrinking a
|
|
seven-master-node cluster down to only have three master nodes, you could add
|
|
four nodes to the exclusions list, wait for confirmation, and then shut them
|
|
down simultaneously.
|
|
|
|
NOTE: Voting exclusions are only required when removing at least half of the
|
|
master-eligible nodes from a cluster in a short time period. They are not
|
|
required when removing master-ineligible nodes, nor are they required when
|
|
removing fewer than half of the master-eligible nodes.
|
|
|
|
Adding an exclusion for a node creates an entry for that node in the voting
|
|
configuration exclusions list, which has the system automatically try to
|
|
reconfigure the voting configuration to remove that node and prevents it from
|
|
returning to the voting configuration once it has removed. The current list of
|
|
exclusions is stored in the cluster state and can be inspected as follows:
|
|
|
|
[source,console]
|
|
--------------------------------------------------
|
|
GET /_cluster/state?filter_path=metadata.cluster_coordination.voting_config_exclusions
|
|
--------------------------------------------------
|
|
|
|
This list is limited in size by the `cluster.max_voting_config_exclusions`
|
|
setting, which defaults to `10`. See <<modules-discovery-settings>>. Since
|
|
voting configuration exclusions are persistent and limited in number, they must
|
|
be cleaned up. Normally an exclusion is added when performing some maintenance
|
|
on the cluster, and the exclusions should be cleaned up when the maintenance is
|
|
complete. Clusters should have no voting configuration exclusions in normal
|
|
operation.
|
|
|
|
If a node is excluded from the voting configuration because it is to be shut
|
|
down permanently, its exclusion can be removed after it is shut down and removed
|
|
from the cluster. Exclusions can also be cleared if they were created in error
|
|
or were only required temporarily by specifying `?wait_for_removal=false`.
|
|
|
|
[source,console]
|
|
--------------------------------------------------
|
|
# Wait for all the nodes with voting configuration exclusions to be removed from
|
|
# the cluster and then remove all the exclusions, allowing any node to return to
|
|
# the voting configuration in the future.
|
|
DELETE /_cluster/voting_config_exclusions
|
|
|
|
# Immediately remove all the voting configuration exclusions, allowing any node
|
|
# to return to the voting configuration in the future.
|
|
DELETE /_cluster/voting_config_exclusions?wait_for_removal=false
|
|
--------------------------------------------------
|