From 4f3da55d257dd788bd8e326a375c311a796eacbe Mon Sep 17 00:00:00 2001
From: Doug Meil
Date: Wed, 29 Feb 2012 22:21:20 +0000
Subject: [PATCH] hbase-5496. ops_mgt.xml - fleshing out HBase Monitoring section.

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1295321 13f79535-47bb-0310-9956-ffa450edef68
---
 src/docbkx/ops_mgt.xml | 35 ++++++++++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/src/docbkx/ops_mgt.xml b/src/docbkx/ops_mgt.xml
index 3dbd718a89c..d3035a46a4e 100644
--- a/src/docbkx/ops_mgt.xml
+++ b/src/docbkx/ops_mgt.xml
@@ -300,7 +300,7 @@ false
- Metrics
+ HBase Metrics
Metric Setup
See Metrics for
@@ -381,8 +381,37 @@ false
HBase Monitoring
- TODO
-
+
+ Overview
+ The following metrics are arguably the most important to monitor for each RegionServer for
+ "macro monitoring", preferably with a system like OpenTSDB.
+ If your cluster is having performance issues it's likely that you'll see something unusual with
+ this group.
+
+ HBase:
+
+   Requests
+   Compactions queue
+
+
+ OS:
+
+   IO Wait
+   User CPU
+
+
+ Java:
+
+   GC
+
+
+
+
+
+ For more information on HBase metrics, see .
+
+
Slow Query Log
The HBase slow query log consists of parseable JSON structures describing the properties of those client operations (Gets, Puts, Deletes, etc.) that either took too long to run, or produced too much output. The thresholds for "too long to run" and "too much output" are configurable, as described below. The output is produced inline in the main region server logs so that it is easy to discover further details from context with other logged events. It is also prepended with identifying tags (responseTooSlow), (responseTooLarge), (operationTooSlow), and (operationTooLarge) in order to enable easy filtering with grep, in case the user desires to see only slow queries.
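
The tag-based filtering described above can also be done programmatically when the JSON payload itself is needed. The following is a minimal Python sketch, not part of HBase: the function names `parse_slow_query_line` and `filter_slow_queries` are hypothetical, and it assumes that the JSON structure begins at the first `{` after the tag on the same log line. Only the four tag names come from the text above.

```python
import json

# Tags named in the slow query log description above.
SLOW_QUERY_TAGS = (
    "(responseTooSlow)",
    "(responseTooLarge)",
    "(operationTooSlow)",
    "(operationTooLarge)",
)


def parse_slow_query_line(line):
    """Return (tag_name, payload_dict) for a tagged log line, else None.

    Hypothetical helper: assumes the JSON object starts at the first "{"
    that follows the tag on the same line.
    """
    for tag in SLOW_QUERY_TAGS:
        pos = line.find(tag)
        if pos == -1:
            continue
        start = line.find("{", pos + len(tag))
        if start == -1:
            return None  # tagged line, but no JSON payload found
        try:
            # raw_decode tolerates trailing text after the JSON object.
            payload, _ = json.JSONDecoder().raw_decode(line[start:])
        except json.JSONDecodeError:
            return None  # truncated or malformed payload
        return tag.strip("()"), payload
    return None


def filter_slow_queries(lines):
    """Yield (tag_name, payload_dict) for every parseable slow query line."""
    for line in lines:
        parsed = parse_slow_query_line(line)
        if parsed is not None:
            yield parsed
```

For example, passing an open region server log file to `filter_slow_queries` yields one `(tag_name, payload)` pair per slow query entry and skips all other log lines; the exact fields in each payload depend on the HBase version and are not assumed here.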