[[cluster-state-publishing]]
=== Publishing the cluster state

The master node is the only node in a cluster that can make changes to the
cluster state. The master node processes one batch of cluster state updates at
a time, computing the required changes and publishing the updated cluster state
to all the other nodes in the cluster. Each publication starts with the master
broadcasting the updated cluster state to all nodes in the cluster. Each node
responds with an acknowledgement but does not yet apply the newly-received
state. Once the master has collected acknowledgements from enough
master-eligible nodes, the new cluster state is said to be _committed_ and the
master broadcasts another message instructing nodes to apply the now-committed
state. Each node receives this message, applies the updated state, and then
sends a second acknowledgement back to the master.
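
The two-phase exchange described above can be illustrated with a small,
self-contained simulation. This is only a sketch and not the actual
implementation: the `Node` class and `publish` function are hypothetical names,
and it assumes that "enough" master-eligible nodes means a strict majority of
them.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    master_eligible: bool
    reachable: bool = True
    accepted_version: int = 0  # received and acknowledged, not yet applied
    applied_version: int = 0   # actually in use on this node

    def accept(self, version: int) -> bool:
        """Phase 1: store the new state and acknowledge it without applying it."""
        if not self.reachable:
            return False
        self.accepted_version = version
        return True

    def apply_committed(self, version: int) -> bool:
        """Phase 2: apply a previously accepted state, then acknowledge again."""
        if not self.reachable or self.accepted_version != version:
            return False
        self.applied_version = version
        return True


def publish(nodes: list[Node], version: int) -> tuple[bool, list[str]]:
    """Return (committed, names of nodes that confirmed applying the state)."""
    # Phase 1: broadcast the state and collect the first acknowledgements.
    acked = [n for n in nodes if n.accept(version)]

    # Assumption: a strict majority of master-eligible nodes is "enough".
    eligible = [n for n in nodes if n.master_eligible]
    eligible_acks = sum(1 for n in acked if n.master_eligible)
    if eligible_acks <= len(eligible) // 2:
        return False, []

    # Phase 2: the state is committed, so instruct every node to apply it.
    applied = [n.name for n in nodes if n.apply_committed(version)]
    return True, applied
```

In this sketch, a cluster with three master-eligible nodes still commits when
one of them is unreachable, but not when two of them are. In the second case
the reachable nodes have accepted the state without ever applying it.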

The master allows a limited amount of time for each cluster state update to be
completely published to all nodes. This limit is set by the
`cluster.publish.timeout` setting, which defaults to `30s` and is measured from
the time the publication started. If the timeout is reached before the new
cluster state is committed, the cluster state change is rejected and the master
considers itself to have failed. It then stands down and starts trying to elect
a new master.

If the new cluster state is committed before `cluster.publish.timeout` has
elapsed, the master node considers the change to have succeeded. It waits until
the timeout has elapsed or until it has received acknowledgements that each
node in the cluster has applied the updated state, and then starts processing
and publishing the next cluster state update. If some acknowledgements have not
been received (i.e. some nodes have not yet confirmed that they have applied
the current update), these nodes are said to be _lagging_ since their cluster
states have fallen behind the master's latest state. The master waits a further
period for the lagging nodes to catch up. This period is set by
`cluster.follower_lag.timeout`, which defaults to `90s`. If a node has still
not successfully applied the cluster state update within this time, it is
considered to have failed and is removed from the cluster.
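
As a rough illustration of how the two timeouts interact, the following sketch
classifies a single publication from its timestamps. The function name and the
shape of its inputs are hypothetical. It also assumes that the follower lag
period starts when the publish timeout expires, which is one reading of "a
further period" and may not match the real scheduling.

```python
PUBLISH_TIMEOUT = 30.0       # default of cluster.publish.timeout, in seconds
FOLLOWER_LAG_TIMEOUT = 90.0  # default of cluster.follower_lag.timeout, in seconds


def publication_outcome(commit_time, apply_ack_times):
    """Classify a publication; all times are seconds since publication start.

    commit_time: when the state was committed, or None if it never was.
    apply_ack_times: node name -> time of its apply acknowledgement, or None
        if the node never acknowledged.
    """
    # Not committed before the publish timeout: the change is rejected.
    if commit_time is None or commit_time >= PUBLISH_TIMEOUT:
        return {"result": "rejected, master stands down",
                "lagging": [], "removed": []}

    # Nodes that have not confirmed by the publish timeout are lagging.
    lagging = [name for name, t in apply_ack_times.items()
               if t is None or t > PUBLISH_TIMEOUT]

    # Assumption: the lag period is counted from the end of the publish timeout.
    deadline = PUBLISH_TIMEOUT + FOLLOWER_LAG_TIMEOUT
    removed = [name for name in lagging
               if apply_ack_times[name] is None
               or apply_ack_times[name] > deadline]
    return {"result": "succeeded", "lagging": lagging, "removed": removed}
```

For example, with a commit after 2 seconds, a node that acknowledges after 45
seconds is lagging but stays in the cluster, whereas a node that never
acknowledges is both lagging and removed.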

Cluster state updates are typically published as diffs to the previous cluster
state, which reduces the time and network bandwidth needed to publish a cluster
state update. For example, when updating the mappings for only a subset of the
indices in the cluster state, only the updates for those indices need to be
published to the nodes in the cluster, as long as those nodes have the previous
cluster state. If a node is missing the previous cluster state, for example
when rejoining a cluster, the master will publish the full cluster state to
that node so that it can receive future updates as diffs.
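
The idea of publishing diffs can be sketched with plain dictionaries. This is a
simplified model under stated assumptions, not the real serialization format:
the cluster state is reduced to a mapping from index name to metadata, and
`make_diff`, `apply_diff` and `message_for` are hypothetical helper names.

```python
def make_diff(previous: dict, current: dict) -> dict:
    """Describe how to turn `previous` into `current`."""
    return {
        # Entries that are new or whose value changed.
        "upserts": {k: v for k, v in current.items()
                    if k not in previous or previous[k] != v},
        # Entries that no longer exist.
        "deletes": [k for k in previous if k not in current],
    }


def apply_diff(previous: dict, diff: dict) -> dict:
    """Rebuild the current state from the previous state and a diff."""
    state = {k: v for k, v in previous.items() if k not in diff["deletes"]}
    state.update(diff["upserts"])
    return state


def message_for(node_version, previous_version, previous, current):
    """Send a diff only if the node holds the state the diff is based on."""
    if node_version == previous_version:
        return {"type": "diff", "payload": make_diff(previous, current)}
    # The node is missing the previous state, so send the full state.
    return {"type": "full", "payload": dict(current)}
```

When only one index changes, the diff carries that single entry rather than the
whole mapping. A node that reports no previous state receives the full state
and can then accept later updates as diffs.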

NOTE: {es} is a peer-to-peer system, in which nodes communicate with one
another directly. The high-throughput APIs (index, delete, search) do not
normally interact with the master node. The responsibility of the master node
is to maintain the global cluster state and reassign shards when nodes join or
leave the cluster. Each time the cluster state is changed, the new state is
published to all nodes in the cluster as described above.