From e85e7ad354764e21804bea051f5fffd3943868f8 Mon Sep 17 00:00:00 2001
From: Michael Stack
Date: Sat, 30 Apr 2011 21:05:39 +0000
Subject: [PATCH] HBASE-3831 docbook xml files - standardized RegionServer, DataNode, and ZooKeeper in several xml docs

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1098158 13f79535-47bb-0310-9956-ffa450edef68
---
 src/docbkx/book.xml            | 52 +++++++++++++++++-----------------
 src/docbkx/configuration.xml   |  6 ++--
 src/docbkx/getting_started.xml | 16 +++++------
 src/docbkx/performance.xml     | 20 ++++++-------
 src/docbkx/troubleshooting.xml |  8 +++---
 5 files changed, 51 insertions(+), 51 deletions(-)

diff --git a/src/docbkx/book.xml b/src/docbkx/book.xml
index 0633163f068..094d5c62c87 100644
--- a/src/docbkx/book.xml
+++ b/src/docbkx/book.xml
@@ -231,7 +231,7 @@ throws InterruptedException, IOException {
- Region Server Metrics + RegionServer Metrics
<varname>hbase.regionserver.blockCacheCount</varname> Block cache item count in memory. This is the number of blocks of storefiles (HFiles) in the cache.
@@ -266,22 +266,22 @@ throws InterruptedException, IOException { TODO
<varname>hbase.regionserver.memstoreSizeMB</varname> - Sum of all the memstore sizes in this regionserver (MB) + Sum of all the memstore sizes in this RegionServer (MB)
<varname>hbase.regionserver.regions</varname> - Number of regions served by the regionserver + Number of regions served by the RegionServer
<varname>hbase.regionserver.requests</varname> - Total number of read and write requests. Requests correspond to regionserver RPC calls, thus a single Get will result in 1 request, but a Scan with caching set to 1000 will result in 1 request for each 'next' call (i.e., not each row). A bulk-load request will constitute 1 request per HFile. + Total number of read and write requests. Requests correspond to RegionServer RPC calls, thus a single Get will result in 1 request, but a Scan with caching set to 1000 will result in 1 request for each 'next' call (i.e., not each row). A bulk-load request will constitute 1 request per HFile.
<varname>hbase.regionserver.storeFileIndexSizeMB</varname> - Sum of all the storefile index sizes in this regionserver (MB) + Sum of all the storefile index sizes in this RegionServer (MB)
<varname>hbase.regionserver.stores</varname> - Number of stores open on the regionserver. A store corresponds to a column family. For example, if a table (which contains the column family) has 3 regions on a regionserver, there will be 3 stores open for that column family. + Number of stores open on the RegionServer. A store corresponds to a column family. For example, if a table (which contains the column family) has 3 regions on a RegionServer, there will be 3 stores open for that column family.
<varname>hbase.regionserver.storeFiles</varname> - Number of store filles open on the regionserver. A store may have more than one storefile (HFile). + Number of store files open on the RegionServer. A store may have more than one storefile (HFile).
@@ -712,7 +712,7 @@ throws InterruptedException, IOException { HTable instances are not thread-safe. When creating HTable instances, it is advisable to use the same HBaseConfiguration -instance. This will ensure sharing of zookeeper and socket instances to the region servers +instance. This will ensure sharing of ZooKeeper and socket instances to the RegionServers which is usually what you want. For example, this is preferred: HBaseConfiguration conf = HBaseConfiguration.create(); HTable table1 = new HTable(conf, "myTable"); @@ -729,7 +729,7 @@ HTable table2 = new HTable(conf2, "myTable");
WriteBuffer and Batch Methods If is turned off on HTable, - Puts are sent to region servers when the writebuffer + Puts are sent to RegionServers when the writebuffer is filled. The writebuffer is 2MB by default. Before an HTable instance is discarded, either close() or flushCommits() should be invoked so Puts @@ -742,7 +742,7 @@ HTable table2 = new HTable(conf2, "myTable");
Filters Get and Scan instances can be - optionally configured with filters which are applied on the region server. + optionally configured with filters which are applied on the RegionServer.
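The filter hunk above is terse, so here is a minimal client-side sketch of the same idea against the 0.90-era HBase client API. The table name "testtable", the column family "cf", the row key, and the comparison values are illustrative assumptions only; the point is that a Filter set on a Get or Scan is evaluated on the RegionServer, so only matching data travels back to the client.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.filter.ValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class FilterSketch {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "testtable");            // table name is an assumption

    // Get: only cells whose value equals "enabled" are returned;
    // the ValueFilter runs on the RegionServer, not in the client.
    Get get = new Get(Bytes.toBytes("row1"));
    get.setFilter(new ValueFilter(CompareFilter.CompareOp.EQUAL,
        new BinaryComparator(Bytes.toBytes("enabled"))));
    Result result = table.get(get);
    System.out.println("cells returned: " + result.size());

    // Scan: restricted server-side to rows whose key starts with "user-".
    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes("cf"));                     // column family is an assumption
    scan.setFilter(new PrefixFilter(Bytes.toBytes("user-")));
    // table.getScanner(scan) would now only ship matching rows to the client.

    table.close();
  }
}

Because the filtering happens where the data lives, the client never pays the network cost for rows or values the filter rejects.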
@@ -796,7 +796,7 @@ HTable table2 = new HTable(conf2, "myTable"); There is not much memory footprint difference between 1 region - and 10 in terms of indexes, etc, held by the regionserver. + and 10 in terms of indexes, etc, held by the RegionServer.
@@ -1118,27 +1118,27 @@ HTable table2 = new HTable(conf2, "myTable"); See .
Node Decommission - You can stop an individual regionserver by running the following + You can stop an individual RegionServer by running the following script in the HBase directory on the particular node: $ ./bin/hbase-daemon.sh stop regionserver - The regionserver will first close all regions and then shut itself down. - On shutdown, the regionserver's ephemeral node in ZooKeeper will expire. - The master will notice the regionserver gone and will treat it as - a 'crashed' server; it will reassign the nodes the regionserver was carrying. + The RegionServer will first close all regions and then shut itself down. + On shutdown, the RegionServer's ephemeral node in ZooKeeper will expire. + The master will notice the RegionServer gone and will treat it as + a 'crashed' server; it will reassign the regions the RegionServer was carrying. Disable the Load Balancer before Decommissioning a node If the load balancer runs while a node is shutting down, then there could be contention between the Load Balancer and the - Master's recovery of the just decommissioned regionserver. + Master's recovery of the just decommissioned RegionServer. Avoid any problems by disabling the balancer first. See below. - A downside to the above stop of a regionserver is that regions could be offline for + A downside to the above stop of a RegionServer is that regions could be offline for a good period of time. Regions are closed in order. If many regions on the server, the first region to close may not be back online until all regions close and after the master - notices the regionserver's znode gone. In HBase 0.90.2, we added facility for having + notices the RegionServer's znode gone. In HBase 0.90.2, we added facility for having a node gradually shed its load and then shutdown itself down. HBase 0.90.2 added the graceful_stop.sh script. Here is its usage: $ ./bin/graceful_stop.sh Usage: graceful_stop.sh [--config &conf-dir>] [--restart] [--reload] [--thri hostname Hostname of server we are to stop - To decommission a loaded regionserver, run the following: + To decommission a loaded RegionServer, run the following: $ ./bin/graceful_stop.sh HOSTNAME where HOSTNAME is the host carrying the RegionServer you would decommission. On <varname>HOSTNAME</varname> The HOSTNAME passed to graceful_stop.sh - must match the hostname that hbase is using to identify regionservers. - Check the list of regionservers in the master UI for how HBase is + must match the hostname that HBase is using to identify RegionServers. + Check the list of RegionServers in the master UI for how HBase is referring to servers. Its usually hostname but can also be FQDN. Whatever HBase is using, this is what you should pass the graceful_stop.sh decommission @@ -1167,7 +1167,7 @@ Usage: graceful_stop.sh [--config &conf-dir>] [--restart] [--reload] [--thri currently running; the graceful unloading of regions will not run. The graceful_stop.sh script will move the regions off the - decommissioned regionserver one at a time to minimize region churn. + decommissioned RegionServer one at a time to minimize region churn. It will verify the region deployed in the new location before it will moves the next region and so on until the decommissioned server is carrying zero regions. At this point, the graceful_stop.sh @@ -1201,7 +1201,7 @@ false $ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt & Tail the output of /tmp/log.txt to follow the scripts - progress.
The above does regionservers only. Be sure to disable the + progress. The above does RegionServers only. Be sure to disable the load balancer before doing the above. You'd need to do the master update separately. Do it before you run the above script. Here is a pseudo-script for how you might craft a rolling restart script: @@ -1227,10 +1227,10 @@ false - Run the graceful_stop.sh script per regionserver. For example: + Run the graceful_stop.sh script per RegionServer. For example: $ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt & - If you are running thrift or rest servers on the regionserver, pass --thrift or --rest options (See usage + If you are running Thrift or REST servers on the RegionServer, pass --thrift or --rest options (See usage for graceful_stop.sh script). diff --git a/src/docbkx/configuration.xml b/src/docbkx/configuration.xml index 245266798d2..de338e39015 100644 --- a/src/docbkx/configuration.xml +++ b/src/docbkx/configuration.xml @@ -114,7 +114,7 @@ to ensure well-formedness of your document after an edit session. a minute or even less so the Master notices failures the sooner. Before changing this value, be sure you have your JVM garbage collection configuration under control otherwise, a long garbage collection that lasts - beyond the zookeeper session timeout will take out + beyond the ZooKeeper session timeout will take out your RegionServer (You might be fine with this -- you probably want recovery to start on the server if a RegionServer has been in GC for a long period of time). @@ -274,7 +274,7 @@ of all regions. Minimally, a client of HBase needs the hbase, hadoop, log4j, commons-logging, commons-lang, - and zookeeper jars in its CLASSPATH connecting to a cluster. + and ZooKeeper jars in its CLASSPATH when connecting to a cluster. An example basic hbase-site.xml for client only @@ -307,7 +307,7 @@ of all regions. ensemble for the cluster programmatically do as follows: Configuration config = HBaseConfiguration.create(); config.set("hbase.zookeeper.quorum", "localhost"); // Here we are running zookeeper locally - If multiple ZooKeeper instances make up your zookeeper ensemble, + If multiple ZooKeeper instances make up your ZooKeeper ensemble, they may be specified in a comma-separated list (just as in the hbase-site.xml file). This populated Configuration instance can then be passed to an HTable, diff --git a/src/docbkx/getting_started.xml b/src/docbkx/getting_started.xml index 9fc6ed47853..abfeb50f10b 100644 --- a/src/docbkx/getting_started.xml +++ b/src/docbkx/getting_started.xml @@ -453,7 +453,7 @@ stopping hbase............... in the section. In standalone mode, HBase does not use HDFS -- it uses the local filesystem instead -- and it runs all HBase daemons and a local - zookeeper all up in the same JVM. Zookeeper binds to a well known port + ZooKeeper all up in the same JVM. ZooKeeper binds to a well known port so clients may talk to HBase.
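The configuration.xml hunk above notes that a multi-member ZooKeeper ensemble can be supplied programmatically as a comma-separated list. A hedged sketch of that usage follows; the host names zk1/zk2/zk3.example.org, the client port, and the table name "myTable" are placeholders invented for the example, not values from the patch.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class QuorumConfigSketch {
  public static void main(String[] args) throws IOException {
    Configuration config = HBaseConfiguration.create();
    // Same comma-separated form as hbase.zookeeper.quorum in hbase-site.xml;
    // the hosts below stand in for your real ZooKeeper ensemble members.
    config.set("hbase.zookeeper.quorum", "zk1.example.org,zk2.example.org,zk3.example.org");
    // Only needed if the ensemble does not listen on the default client port.
    config.set("hbase.zookeeper.property.clientPort", "2181");

    // The populated Configuration is then handed to client classes such as HTable.
    HTable table = new HTable(config, "myTable");            // table name is an assumption
    table.close();
  }
}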
@@ -508,7 +508,7 @@ stopping hbase............... <property> <name>hbase.rootdir</name> <value>hdfs://localhost:9000/hbase</value> - <description>The directory shared by region servers. + <description>The directory shared by RegionServers. </description> </property> <property> @@ -539,7 +539,7 @@ stopping hbase............... See Pseudo-distributed mode extras for notes on how to start extra Masters and - regionservers when running pseudo-distributed. + RegionServers when running pseudo-distributed. @@ -564,7 +564,7 @@ stopping hbase............... <property> <name>hbase.rootdir</name> <value>hdfs://namenode.example.org:9000/hbase</value> - <description>The directory shared by region servers. + <description>The directory shared by RegionServers. </description> </property> <property> @@ -873,7 +873,7 @@ stopping hbase............... Shutdown can take a moment to <property> <name>hbase.zookeeper.quorum</name> <value>example1,example2,example3</value> - <description>The directory shared by region servers. + <description>The directory shared by RegionServers. </description> </property> <property> @@ -886,7 +886,7 @@ stopping hbase............... Shutdown can take a moment to <property> <name>hbase.rootdir</name> <value>hdfs://example0:9000/hbase</value> - <description>The directory shared by region servers. + <description>The directory shared by RegionServers. </description> </property> <property> @@ -905,8 +905,8 @@ stopping hbase............... Shutdown can take a moment to
<filename>regionservers</filename> - In this file you list the nodes that will run regionservers. - In our case we run regionservers on all but the head node + In this file you list the nodes that will run RegionServers. + In our case we run RegionServers on all but the head node example1 which is carrying the HBase Master and the HDFS namenode diff --git a/src/docbkx/performance.xml b/src/docbkx/performance.xml index 37a563affd6..c7ac17d0b2c 100644 --- a/src/docbkx/performance.xml +++ b/src/docbkx/performance.xml @@ -16,14 +16,14 @@ here for more pointers. Enabling RPC-level logging - Enabling the RPC-level logging on a regionserver can often given + Enabling the RPC-level logging on a RegionServer can often give insight on timings at the server. Once enabled, the amount of log spewed is voluminous. It is not recommended that you leave this logging on for more than short bursts of time. To enable RPC-level - logging, browse to the regionserver UI and click on + logging, browse to the RegionServer UI and click on Log Level. Set the log level to DEBUG for the package org.apache.hadoop.ipc (Thats right, for - hadoop.ipc, NOT, hbase.ipc). Then tail the regionservers log. + hadoop.ipc, NOT, hbase.ipc). Then tail the RegionServer's log. Analyze. To disable, set the logging level back to INFO level. @@ -87,13 +87,13 @@
<varname>hbase.regionserver.handler.count</varname> This setting is in essence sets how many requests are - concurrently being processed inside the regionserver at any + concurrently being processed inside the RegionServer at any one time. If set too high, then throughput may suffer as the concurrent requests contend; if set too low, requests will be stuck waiting to get into the machine. You can get a sense of whether you have too little or too many handlers by - on an individual regionserver then tailing its logs. + on an individual RegionServer then tailing its logs.
@@ -167,7 +167,7 @@ public static byte[][] getHexSplits(String startKey, String endKey, int numRegio to false on your HTable instance. Otherwise, the Puts will be sent one at a time to the - regionserver. Puts added via htable.add(Put) and htable.add( <List> Put) + RegionServer. Puts added via htable.put(Put) and htable.put(List<Put>) wind up in the same write buffer. If autoFlush = false, these messages are not sent until the write-buffer is filled. To explicitly flush the messages, call flushCommits. @@ -187,7 +187,7 @@ public static byte[][] getHexSplits(String startKey, String endKey, int numRegio processed. Setting this value to 500, for example, will transfer 500 rows at a time to the client to be processed. There is a cost/benefit to have the cache value be large because it costs more in memory for both - client and regionserver, so bigger isn't always better. + client and RegionServer, so bigger isn't always better.
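As a minimal sketch of the write-buffer behaviour described in the hunk above (the table name "testtable", the family "cf", and the 4 MB buffer size are arbitrary choices for illustration, not values taken from the patch): with autoFlush off, Puts accumulate in the client-side buffer and only reach the RegionServers when the buffer fills or when flushCommits() or close() is called.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class WriteBufferSketch {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "testtable");        // table name is an assumption
    table.setAutoFlush(false);                           // buffer Puts on the client side
    table.setWriteBufferSize(4 * 1024 * 1024);           // 4 MB instead of the 2 MB default

    List<Put> puts = new ArrayList<Put>();
    for (int i = 0; i < 10000; i++) {
      Put put = new Put(Bytes.toBytes("row-" + i));
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value-" + i));
      puts.add(put);
    }
    // Single Puts and list Puts share the same client-side write buffer.
    table.put(puts);

    // Nothing may have reached the RegionServers yet; flush explicitly
    // (close() also flushes) before discarding the HTable instance.
    table.flushCommits();
    table.close();
  }
}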
@@ -197,7 +197,7 @@ public static byte[][] getHexSplits(String startKey, String endKey, int numRegio avoiding performance problems. If you forget to close ResultScanners - you can cause problems on the regionservers. Always have ResultScanner + you can cause problems on the RegionServers. Always have ResultScanner processing enclosed in try/catch blocks... Scan scan = new Scan(); // set attrs... @@ -216,7 +216,7 @@ htable.close(); Scan - instances can be set to use the block cache in the region server via the + instances can be set to use the block cache in the RegionServer via the setCacheBlocks method. For input Scans to MapReduce jobs, this should be false. For frequently accessed rows, it is advisable to use the block cache. @@ -228,7 +228,7 @@ htable.close(); MUST_PASS_ALL operator to the scanner using setFilter. The filter list should include both a FirstKeyOnlyFilter and a KeyOnlyFilter. - Using this filter combination will result in a worst case scenario of a region server reading a single value from disk + Using this filter combination will result in a worst case scenario of a RegionServer reading a single value from disk and minimal network traffic to the client for a single row.
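The row-counting advice in the last performance.xml hunk can be sketched as follows; the table name "testtable" and the caching value of 500 are illustrative assumptions. FirstKeyOnlyFilter and KeyOnlyFilter are combined under MUST_PASS_ALL so the RegionServer reads as little as possible per row, block caching is disabled for the one-off full pass, and the ResultScanner is released in a finally block as the scanner hunk above recommends.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;

public class RowCountSketch {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "testtable");    // table name is an assumption

    Scan scan = new Scan();
    scan.setCaching(500);                            // fetch 500 rows per 'next' RPC instead of 1
    scan.setCacheBlocks(false);                      // don't churn the RegionServer block cache
    List<Filter> filters = new ArrayList<Filter>();
    filters.add(new FirstKeyOnlyFilter());           // only the first KeyValue of each row
    filters.add(new KeyOnlyFilter());                // drop values, keep keys
    scan.setFilter(new FilterList(FilterList.Operator.MUST_PASS_ALL, filters));

    long rows = 0;
    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result r : scanner) {
        rows++;                                      // each Result is one row
      }
    } finally {
      scanner.close();                               // release the server-side scanner lease
      table.close();
    }
    System.out.println("row count: " + rows);
  }
}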
diff --git a/src/docbkx/troubleshooting.xml b/src/docbkx/troubleshooting.xml index 67b67419faa..89adfee9ce5 100644 --- a/src/docbkx/troubleshooting.xml +++ b/src/docbkx/troubleshooting.xml @@ -28,7 +28,7 @@ RegionServer suicides are “normal”, as this is what they do when something goes wrong. For example, if ulimit and xcievers (the two most important initial settings, see ) - aren’t changed, it will make it impossible at some point for datanodes to create new threads + aren’t changed, it will make it impossible at some point for DataNodes to create new threads that from the HBase point of view is seen as if HDFS was gone. Think about what would happen if your MySQL database was suddenly unable to access files on your local file system, well it’s the same with HBase and HDFS. Another very common reason to see RegionServers committing seppuku is when they enter @@ -145,7 +145,7 @@ hadoop@sv4borg12:~$ jps Child, its MapReduce task, cannot tell which type exactly Hadoop TaskTracker, manages the local Childs Hadoop DataNode, serves blocks - HQuorumPeer, a zookeeper ensemble member + HQuorumPeer, a ZooKeeper ensemble member Jps, well… it’s the current process ThriftServer, it’s a special one will be running only if thrift was started jmx, this is a local process that’s part of our monitoring platform ( poorly named maybe). You probably don’t have that. @@ -275,7 +275,7 @@ hadoop 17789 155 35.2 9067824 8604364 ? S<l Mar04 9855:48 /usr/java/j - And here is a master trying to recover a lease after a region server died: + And here is a master trying to recover a lease after a RegionServer died: "LeaseChecker" daemon prio=10 tid=0x00000000407ef800 nid=0x76cd waiting on condition [0x00007f6d0eae2000..0x00007f6d0eae2a70] -- @@ -370,7 +370,7 @@ java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
- System instability, and the presence of "java.lang.OutOfMemoryError: unable to create new native thread in exceptions" HDFS datanode logs or that of any system daemon + System instability, and the presence of "java.lang.OutOfMemoryError: unable to create new native thread" exceptions in HDFS DataNode logs or that of any system daemon See the Getting Started section on ulimit and nproc configuration.