[[cluster-fault-detection]]
=== Cluster fault detection

The elected master periodically checks each of the nodes in the cluster to
ensure that they are still connected and healthy. Each node in the cluster also
periodically checks the health of the elected master. These checks are known
respectively as _follower checks_ and _leader checks_.

Elasticsearch allows these checks to occasionally fail or time out without
taking any action. It considers a node to be faulty only after a number of
consecutive checks have failed. You can control fault detection behavior with
<<modules-discovery-settings,`cluster.fault_detection.*` settings>>.

If the elected master detects that a node has disconnected, however, this
situation is treated as an immediate failure. The master bypasses the timeout
and retry setting values and attempts to remove the node from the cluster.
Similarly, if a node detects that the elected master has disconnected, this
situation is treated as an immediate failure. The node bypasses the timeout and
retry settings and restarts its discovery phase to try to find or elect a new
master.
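The two rules described in this section, tolerating isolated failures and
treating a disconnect as an immediate failure, can be illustrated with a small
model. This is a simplified sketch and not Elasticsearch's implementation:
`PeerHealth`, `record_check`, and `record_disconnect` are hypothetical names,
and `RETRY_COUNT` is assumed to correspond to a setting such as
`cluster.fault_detection.follower_check.retry_count`, whose default is believed
to be `3` (verify this against the settings reference).

```python
from dataclasses import dataclass

# Assumption: stands in for a cluster.fault_detection.*.retry_count value.
RETRY_COUNT = 3


@dataclass
class PeerHealth:
    """Hypothetical tracker for one peer being checked (a follower or the master)."""

    retry_count: int = RETRY_COUNT
    consecutive_failures: int = 0
    faulty: bool = False

    def record_check(self, succeeded: bool) -> bool:
        """Record one check result; a timeout counts as a failed check.

        Returns True once the peer is considered faulty.
        """
        if self.faulty:
            return True
        if succeeded:
            # A successful check resets the run of consecutive failures.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.retry_count:
                self.faulty = True
        return self.faulty

    def record_disconnect(self) -> bool:
        """A detected disconnect bypasses the retry logic entirely."""
        self.faulty = True
        return True


# Two failures, a success, then three consecutive failures.
peer = PeerHealth()
history = [peer.record_check(ok) for ok in (False, False, True, False, False, False)]
print(history)  # [False, False, False, False, False, True]
```

In this model the peer becomes faulty only on the third consecutive failure,
because the intervening success resets the counter. A call to
`record_disconnect()` marks the peer faulty regardless of the counter, which
mirrors the immediate-failure behavior described above.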