From 8f4f844e6ec9ffa996a337fba2880e26b6ef52bf Mon Sep 17 00:00:00 2001
From: David Turner <david.turner@elastic.co>
Date: Tue, 7 Jul 2020 14:14:35 +0100
Subject: [PATCH] Add docs for filesystem health checks (#59134)

Documents the feature and settings introduced in #52680.

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
---
 .../discovery/discovery-settings.asciidoc     | 19 +++++++++++++++++++
 .../discovery/fault-detection.asciidoc        |  7 +++++++
 2 files changed, 26 insertions(+)

diff --git a/docs/reference/modules/discovery/discovery-settings.asciidoc b/docs/reference/modules/discovery/discovery-settings.asciidoc
index f0ce103ebb3..f7f3b899293 100644
--- a/docs/reference/modules/discovery/discovery-settings.asciidoc
+++ b/docs/reference/modules/discovery/discovery-settings.asciidoc
@@ -245,3 +245,22 @@ WARNING: This setting replaces the `discovery.zen.no_master_block` setting in
 earlier versions. The `discovery.zen.no_master_block` setting is ignored.
 
 --
+
+`monitor.fs.health.enabled`::
+
+    (<<cluster-update-settings,Dynamic>>, boolean) If `true`, the node runs
+    periodic <<cluster-fault-detection-filesystem-health,filesystem health
+    checks>>. Defaults to `true`.
+
+`monitor.fs.health.refresh_interval`::
+
+    (<<time-units, Time value>>) Interval between successive
+    <<cluster-fault-detection-filesystem-health,filesystem health checks>>.
+    Defaults to `2m`.
+
+`monitor.fs.health.slow_path_logging_threshold`::
+
+    (<<time-units, Time value>>) If a
+    <<cluster-fault-detection-filesystem-health,filesystem health check>>
+    takes longer than this threshold then {es} logs a warning. Defaults to
+    `5s`.
diff --git a/docs/reference/modules/discovery/fault-detection.asciidoc b/docs/reference/modules/discovery/fault-detection.asciidoc
index 9062444b80d..56b5bc32a75 100644
--- a/docs/reference/modules/discovery/fault-detection.asciidoc
+++ b/docs/reference/modules/discovery/fault-detection.asciidoc
@@ -18,3 +18,10 @@ Similarly, if a node detects that the elected master has disconnected, this
 situation is treated as an immediate failure. The node bypasses the timeout and
 retry settings and restarts its discovery phase to try and find or elect a new
 master.
+
+[[cluster-fault-detection-filesystem-health]]
+Additionally, each node periodically verifies that its data path is healthy by
+writing a small file to disk and then deleting it again. If a node discovers
+its data path is unhealthy then it is removed from the cluster until the data
+path recovers. You can control this behavior with the
+<<modules-discovery-settings,`monitor.fs.health` settings>>.