OpenSearch/docs/reference/modules/discovery
David Turner 532ade7816 More logging for slow cluster state application (#45007)
Today the lag detector may remove nodes from the cluster if they fail to apply
a cluster state within a reasonable timeframe, but it is rather unclear from
the default logging that this has occurred and there is very little extra
information beyond the fact that the removed node was lagging. Moreover, the
only forewarning that the lag detector might be invoked is a message indicating
that cluster state publication took unreasonably long, which does not contain
enough information to investigate the problem further.

This commit adds a good deal more detail to make the issues of slow nodes more
prominent:

- after 10 seconds (by default) we log an INFO message indicating that a
  publication is still waiting for responses from some nodes, including the
  identities of the problematic nodes.

- when the publication times out after 30 seconds (by default) we log a WARN
  message identifying the nodes that are still pending.

- the lag detector logs a more detailed warning when a fatally-lagging node is
  detected.

- if applying a cluster state takes too long then the cluster applier service
  logs a breakdown of all the tasks it ran as part of that process.
2019-08-01 13:20:46 +01:00
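
The two publication thresholds described in the commit message can be sketched as a small decision function. This is only an illustration, not the code added by the commit: the function name, the constants and the message wording are all hypothetical, and only the 10-second and 30-second defaults come from the text above. The real implementation logs each message once when its timer fires, whereas this sketch just classifies a given elapsed time.

```python
from typing import Optional, Sequence, Tuple

# Defaults as stated in the commit message; the constant names are hypothetical.
INFO_TIMEOUT_SECONDS = 10.0
PUBLISH_TIMEOUT_SECONDS = 30.0


def publication_log(
    elapsed_seconds: float,
    pending_nodes: Sequence[str],
    info_timeout: float = INFO_TIMEOUT_SECONDS,
    publish_timeout: float = PUBLISH_TIMEOUT_SECONDS,
) -> Optional[Tuple[str, str]]:
    """Return (level, message) for a publication in progress, or None.

    Hypothetical helper: it decides which log line, if any, the elapsed
    time and the set of still-pending nodes would warrant.
    """
    if not pending_nodes:
        # Every node has responded, so there is nothing to report.
        return None

    nodes = ", ".join(sorted(pending_nodes))

    if elapsed_seconds >= publish_timeout:
        # The publication has timed out: name the nodes still pending.
        return ("WARN", f"publication timed out after {elapsed_seconds:.0f}s, still pending: {nodes}")

    if elapsed_seconds >= info_timeout:
        # The publication is slow but has not timed out yet.
        return ("INFO", f"publication still waiting after {elapsed_seconds:.0f}s for: {nodes}")

    # Below both thresholds: stay quiet.
    return None
```

For example, `publication_log(12, ["node-2"])` yields an INFO entry naming `node-2`, while the same call with 30 or more elapsed seconds yields a WARN entry. Keeping the thresholds as parameters mirrors the "by default" wording above, which suggests they are configurable.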
adding-removing-nodes.asciidoc More improvements to cluster coordination docs (#42799) 2019-06-04 08:25:41 +01:00
bootstrapping.asciidoc More improvements to cluster coordination docs (#42799) 2019-06-04 08:25:41 +01:00
discovery-settings.asciidoc More logging for slow cluster state application (#45007) 2019-08-01 13:20:46 +01:00
discovery.asciidoc Align docs etc with new discovery setting names (#38492) 2019-02-06 11:34:38 +00:00
fault-detection.asciidoc [DOCS] Adds overview and API ref for cluster voting configurations (#36954) 2019-01-07 09:11:14 -08:00
publishing.asciidoc Add note about cluster state diffs (#39847) 2019-03-11 15:40:07 +01:00
quorums.asciidoc [DOCS] Adds overview and API ref for cluster voting configurations (#36954) 2019-01-07 09:11:14 -08:00
voting.asciidoc [DOCS] Adds overview and API ref for cluster voting configurations (#36954) 2019-01-07 09:11:14 -08:00