diff --git a/CHANGES.txt b/CHANGES.txt
index d7a9ecfdc64..7dceeecb127 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -135,6 +135,8 @@ Release 0.91.0 - Unreleased
(Ted Yu via Stack)
HBASE-3694 high multiput latency due to checking global mem store size
in a synchronized function (Liyin Tang via Stack)
+ HBASE-3710 Book.xml - fill out descriptions of metrics
+ (Doug Meil via Stack)
TASK
HBASE-3559 Move report of split to master OFF the heartbeat channel
diff --git a/src/docbkx/book.xml b/src/docbkx/book.xml
index d8d85401466..0643176cda0 100644
--- a/src/docbkx/book.xml
+++ b/src/docbkx/book.xml
@@ -232,49 +232,55 @@ throws InterruptedException, IOException {
Region Server Metrics
hbase.regionserver.blockCacheCount
-
+ Block cache item count in memory. This is the number of blocks of storefiles (HFiles) in the cache.
hbase.regionserver.blockCacheFree
-
+ Block cache memory available (MB).
hbase.regionserver.blockCacheHitRatio
-
+ Block cache hit ratio (0 to 100). TODO: describe the impact on the ratio of read requests that have cacheBlocks=false
hbase.regionserver.blockCacheSize
-
+ Block cache size in memory (MB)
+
+ hbase.regionserver.compactionQueueSize
+ Size of the compaction queue.
hbase.regionserver.fsReadLatency_avg_time
-
+ Filesystem read latency (ms)
hbase.regionserver.fsReadLatency_num_ops
-
+ TODO
hbase.regionserver.fsSyncLatency_avg_time
-
+ Filesystem sync latency (ms)
hbase.regionserver.fsSyncLatency_num_ops
-
+ TODO
hbase.regionserver.fsWriteLatency_avg_time
-
+ Filesystem write latency (ms)
hbase.regionserver.fsWriteLatency_num_ops
-
+ TODO
hbase.regionserver.memstoreSizeMB
-
+ Sum of all the memstore sizes in this regionserver (MB)
hbase.regionserver.regions
-
+ Number of regions served by the regionserver
hbase.regionserver.requests
-
+ Total number of read and write requests. Requests correspond to regionserver RPC calls, thus a single Get will result in 1 request, but a Scan with caching set to 1000 will result in 1 request for each 'next' call (i.e., not each row). A bulk-load request will constitute 1 request per HFile.
hbase.regionserver.storeFileIndexSizeMB
-
+ Sum of all the storefile index sizes in this regionserver (MB)
hbase.regionserver.stores
-
+ Number of stores open on the regionserver. A store corresponds to a column family. For example, if a table (which contains the column family) has 3 regions on a regionserver, there will be 3 stores open for that column family.
+
+ hbase.regionserver.storeFiles
+ Number of store files open on the regionserver. A store may have more than one storefile (HFile).
@@ -1055,24 +1061,38 @@ throws InterruptedException, IOException {
Node Decommission
You can have a node gradually shed its load and then shutdown using the
- graceful_restart.sh script. Here is its usage:
- $ ./bin/graceful_stop.sh
-Usage: graceful_stop.sh [--config &conf-dir>] [--restart] [--reload] &hostname>
- restart If we should restart after graceful stop
- reload Move offloaded regions back on to the stopped server
- debug Move offloaded regions back on to the stopped server
- hostname Hostname of server we are to stop
+ graceful_stop.sh script. Here is its usage:
+ $ ./bin/graceful_stop.sh
+Usage: graceful_stop.sh [--config <conf-dir>] [--restart] [--reload] [--thrift] [--rest] <hostname>
+ thrift If we should stop/start thrift before/after the hbase stop/start
+ rest If we should stop/start rest before/after the hbase stop/start
+ restart If we should restart after graceful stop
+ reload Move offloaded regions back on to the stopped server
+ debug Move offloaded regions back on to the stopped server
+ hostname Hostname of server we are to stop
To decommission a loaded regionserver, run the following:
- $ ./bin/graceful_stop.sh HOSTNAME
+ $ ./bin/graceful_stop.sh HOSTNAME
where HOSTNAME is the host carrying the RegionServer
- you would decommission. The script will move the regions off the
+ you would decommission.
+ On HOSTNAME
+ The HOSTNAME passed to graceful_stop.sh
+ must match the hostname that hbase is using to identify regionservers.
+ Check the list of regionservers in the master UI for how HBase is
+ referring to servers. It's usually the hostname but can also be the FQDN.
+ Whatever HBase is using, this is what you should pass to the
+ graceful_stop.sh decommission
+ script. If you pass IPs, the script is not yet smart enough to make
+ a hostname (or FQDN) of them and so it will fail when it checks if the server is
+ currently running; the graceful unloading of regions will not run.
+
+ The graceful_stop.sh script will move the regions off the
decommissioned regionserver one at a time to minimize region churn.
It will verify the region deployed in the new location before it
will moves the next region and so on until the decommissioned server
- is carrying zero regions. At this point, the graceful_stop
- tells the RegionServer stop. The master will at this point notice the
+ is carrying zero regions. At this point, the graceful_stop.sh
+ tells the RegionServer to stop. The master will at this point notice the
RegionServer gone but all regions will have already been redeployed
and because the RegionServer went down cleanly, there will be no
WAL logs to split.