43 lines
2.6 KiB
Plaintext
43 lines
2.6 KiB
Plaintext
|
[[cluster-state-publishing]]
|
||
|
=== Publishing the cluster state
|
||
|
|
||
|
The master node is the only node in a cluster that can make changes to the
|
||
|
cluster state. The master node processes one batch of cluster state updates at
|
||
|
a time, computing the required changes and publishing the updated cluster state
|
||
|
to all the other nodes in the cluster. Each publication starts with the master
|
||
|
broadcasting the updated cluster state to all nodes in the cluster. Each node
|
||
|
responds with an acknowledgement but does not yet apply the newly-received
|
||
|
state. Once the master has collected acknowledgements from enough
|
||
|
master-eligible nodes, the new cluster state is said to be _committed_ and the
|
||
|
master broadcasts another message instructing nodes to apply the now-committed
|
||
|
state. Each node receives this message, applies the updated state, and then
|
||
|
sends a second acknowledgement back to the master.
|
||
|
|
||
|
The master allows a limited amount of time for each cluster state update to be
|
||
|
completely published to all nodes. It is defined by the
|
||
|
`cluster.publish.timeout` setting, which defaults to `30s`, measured from the
|
||
|
time the publication started. If this time is reached before the new cluster
|
||
|
state is committed then the cluster state change is rejected and the master
|
||
|
considers itself to have failed. It stands down and starts trying to elect a
|
||
|
new master.
|
||
|
|
||
|
If the new cluster state is committed before `cluster.publish.timeout` has
|
||
|
elapsed, the master node considers the change to have succeeded. It waits until
|
||
|
the timeout has elapsed or until it has received acknowledgements that each
|
||
|
node in the cluster has applied the updated state, and then starts processing
|
||
|
and publishing the next cluster state update. If some acknowledgements have not
|
||
|
been received (i.e. some nodes have not yet confirmed that they have applied
|
||
|
the current update), these nodes are said to be _lagging_ since their cluster
|
||
|
states have fallen behind the master's latest state. The master waits for the
|
||
|
lagging nodes to catch up for a further time, `cluster.follower_lag.timeout`,
|
||
|
which defaults to `90s`. If a node has still not successfully applied the
|
||
|
cluster state update within this time then it is considered to have failed and
|
||
|
is removed from the cluster.
|
||
|
|
||
|
NOTE: Elasticsearch is a peer to peer based system, in which nodes communicate
|
||
|
with one another directly. The high-throughput APIs (index, delete, search) do
|
||
|
not normally interact with the master node. The responsibility of the master
|
||
|
node is to maintain the global cluster state and reassign shards when nodes join or leave
|
||
|
the cluster. Each time the cluster state is changed, the
|
||
|
new state is published to all nodes in the cluster as described above.
|