From 4f3da55d257dd788bd8e326a375c311a796eacbe Mon Sep 17 00:00:00 2001
From: Doug Meil <dmeil@apache.org>
Date: Wed, 29 Feb 2012 22:21:20 +0000
Subject: [PATCH] hbase-5496.  ops_mgt.xml - fleshing out HBase Monitoring
 section.

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1295321 13f79535-47bb-0310-9956-ffa450edef68
---
 src/docbkx/ops_mgt.xml | 35 ++++++++++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)
diff --git a/src/docbkx/ops_mgt.xml b/src/docbkx/ops_mgt.xml
index 3dbd718a89c..d3035a46a4e 100644
--- a/src/docbkx/ops_mgt.xml
+++ b/src/docbkx/ops_mgt.xml
@@ -300,7 +300,7 @@ false
     </section>  <!--  node mgt -->
 
   <section xml:id="hbase_metrics">
-  <title>Metrics</title>
+  <title>HBase Metrics</title>
   <section xml:id="metric_setup">
   <title>Metric Setup</title>
   <para>See <link xlink:href="http://hbase.apache.org/metrics.html">Metrics</link> for
@@ -381,8 +381,37 @@ false
 
   <section xml:id="ops.monitoring">
     <title >HBase Monitoring</title>
-    <para>TODO
-    </para>
+    <section xml:id="ops.monitoring.overview">
+    <title>Overview</title>
+      <para>For "macro monitoring" of each RegionServer, the following metrics are arguably the most
+      important to track, preferably with a system like <link xlink:href="http://opentsdb.net/">OpenTSDB</link>.
+      If your cluster is having performance issues, you will likely see something unusual in
+      this group.
+      </para>
+      <para>HBase:
+      <itemizedlist>
+      <listitem><para>Requests</para></listitem>
+      <listitem><para>Compaction queue</para></listitem>
+      </itemizedlist>
+      </para>
+      <para>OS:
+      <itemizedlist>
+      <listitem><para>IO Wait</para></listitem>
+      <listitem><para>User CPU</para></listitem>
+      </itemizedlist>
+      </para>
+      <para>Java:
+      <itemizedlist>
+      <listitem><para>GC</para></listitem>
+      </itemizedlist>
+      </para>
+      <para>
+      </para>
+      <para>
+      For more information on HBase metrics, see <xref linkend="hbase_metrics"/>.
+      </para>
+    </section>
+    
     <section xml:id="ops.slow.query">
     <title>Slow Query Log</title>
 <para>The HBase slow query log consists of parseable JSON structures describing the properties of those client operations (Gets, Puts, Deletes, etc.) that either took too long to run, or produced too much output. The thresholds for "too long to run" and "too much output" are configurable, as described below. The output is produced inline in the main region server logs so that it is easy to discover further details from context with other logged events. It is also prepended with identifying tags <constant>(responseTooSlow)</constant>, <constant>(responseTooLarge)</constant>, <constant>(operationTooSlow)</constant>, and <constant>(operationTooLarge)</constant> in order to enable easy filtering with grep, in case the user desires to see only slow queries.