diff --git a/src/main/docbkx/ops_mgt.xml b/src/main/docbkx/ops_mgt.xml index 841ad89342f..8c04c9a576d 100644 --- a/src/main/docbkx/ops_mgt.xml +++ b/src/main/docbkx/ops_mgt.xml @@ -793,48 +793,138 @@ false
Rolling Restart - You can also ask this script to restart a RegionServer after the shutdown AND move its - old regions back into place. The latter you might do to retain data locality. A primitive - rolling restart might be effected by running something like the following: - $ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt & - Tail the output of /tmp/log.txt to follow the scripts progress. - The above does RegionServers only. The script will also disable the load balancer before - moving the regions. You'd need to do the master update separately. Do it before you run the - above script. Here is a pseudo-script for how you might craft a rolling restart script: - - - Untar your release, make sure of its configuration and then rsync it across the - cluster. If this is 0.90.2, patch it with HBASE-3744 and HBASE-3756. - - - Run hbck to ensure the cluster consistent - $ ./bin/hbase hbck Effect repairs if inconsistent. - - - - Restart the Master: - $ ./bin/hbase-daemon.sh stop master; ./bin/hbase-daemon.sh start master - - - - Run the graceful_stop.sh script per RegionServer. For - example: - $ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt & - - If you are running thrift or rest servers on the RegionServer, pass --thrift or - --rest options (See usage for graceful_stop.sh script). - - - Restart the Master again. This will clear out dead servers list and reenable the - balancer. - - - Run hbck to ensure the cluster is consistent. - - - It is important to drain HBase regions slowly when restarting regionservers. Otherwise, - multiple regions go offline simultaneously as they are re-assigned to other nodes. Depending - on your usage patterns, this might not be desirable. + + Some cluster configuration changes require either the entire cluster, or the + RegionServers, to be restarted in order to pick up the changes. 
In addition, rolling + restarts are supported for upgrading to a minor or maintenance release, and to a major + release if at all possible. See the release notes for the release you want to upgrade to, to + find out about limitations to the ability to perform a rolling upgrade. + There are multiple ways to restart your cluster nodes, depending on your situation. + These methods are detailed below. +
+ Using the <command>rolling-restart.sh</command> Script + + HBase ships with a script, bin/rolling-restart.sh, that allows + you to perform rolling restarts on the entire cluster, the master only, or the + RegionServers only. The script is provided as a template for your own script, and is not + explicitly tested. It requires password-less SSH login to be configured and assumes that + you have deployed using a tarball. The script requires you to set some environment + variables before running it. Examine the script and modify it to suit your needs. + + <filename>rolling-restart.sh</filename> General Usage + +$ ./bin/rolling-restart.sh --help + Usage: rolling-restart.sh [--config <hbase-confdir>] [--rs-only] [--master-only] [--graceful] [--maxthreads xx] + + + + Rolling Restart on RegionServers Only + + To perform a rolling restart on the RegionServers only, use the + --rs-only option. This might be necessary if you need to reboot the + individual RegionServer or if you make a configuration change that only affects + RegionServers and not the other HBase processes. + If you need to restart only a single RegionServer, or if you need to do extra + actions during the restart, use the bin/graceful_stop.sh + command instead. See . + + + + Rolling Restart on Masters Only + + To perform a rolling restart on the active and backup Masters, use the + --master-only option. You might use this if you know that your + configuration change only affects the Master and not the RegionServers, or if you + need to restart the server where the active Master is running. + If you are not running backup Masters, the Master is simply restarted. If you + are running backup Masters, they are all stopped before any are restarted, to avoid + a race condition in ZooKeeper to determine which is the new Master. First the main + Master is restarted, then the backup Masters are restarted. Directly after restart, + it checks for and cleans out any regions in transition before taking on its normal + workload. 
+ + + + Graceful Restart + + If you specify the --graceful option, RegionServers are restarted + using the bin/graceful_stop.sh script, which moves regions off + a RegionServer before restarting it. This is safer, but can delay the + restart. + + + + Limiting the Number of Threads + + To limit the rolling restart to using only a specific number of threads, use the + --maxthreads option. + + + +
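The options described above can be combined on one command line. The following is a minimal sketch, not part of HBase: the `rolling_restart` wrapper and the `DRY_RUN` variable are hypothetical names introduced here for illustration, and the thread count of 4 is an arbitrary example value. With `DRY_RUN=1` (the default in this sketch) the wrapper only prints the command it would run, which makes it easy to review before touching a live cluster.

```shell
#!/bin/sh
# Hypothetical wrapper (not shipped with HBase): prints or runs
# bin/rolling-restart.sh with the options passed to it.
# Run it from the HBase home directory. Set DRY_RUN=0 to execute for real.
DRY_RUN="${DRY_RUN:-1}"

rolling_restart() {
  if [ "$DRY_RUN" = "1" ]; then
    # Dry run: show the command line instead of executing it.
    echo "./bin/rolling-restart.sh $*"
  else
    ./bin/rolling-restart.sh "$@"
  fi
}

# RegionServers only, moving regions off each server before it restarts.
rolling_restart --rs-only --graceful

# Active and backup Masters only.
rolling_restart --master-only

# RegionServers only, limited to an example value of 4 threads.
rolling_restart --rs-only --maxthreads 4
```

Because the script is described as an untested template, reviewing the printed commands first is a cautious way to check the option combination you intend to use.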
+
+ Manual Rolling Restart + To retain more control over the process, you may wish to manually do a rolling restart + across your cluster. This uses the graceful_stop.sh command. In this method, you can restart each RegionServer + individually and then move its old regions back into place, retaining locality. If you + also need to restart the Master, you need to do it separately, and restart the Master + before restarting the RegionServers using this method. The following is an example of such + a command. You may need to tailor it to your environment. This script does a rolling + restart of RegionServers only. It disables the load balancer before moving the + regions. + $ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt & + Monitor the output of the /tmp/log.txt file to follow the + progress of the script. 
+ +
+ Logic for Crafting Your Own Rolling Restart Script + Use the following guidelines if you want to create your own rolling restart script. + + + Extract the new release, verify its configuration, and synchronize it to all nodes + of your cluster using rsync, scp, or another + secure synchronization mechanism. + Use the hbck utility to ensure that the cluster is consistent. + +$ ./bin/hbase hbck + + Perform repairs if required. See for details. + + Restart the master first. You may need to modify these commands if your + new HBase directory is different from the old one, such as for an upgrade. + +$ ./bin/hbase-daemon.sh stop master; ./bin/hbase-daemon.sh start master + + + Gracefully restart each RegionServer, using a script such as the + following, from the Master. + $ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt & + If you are running Thrift or REST servers, pass the --thrift or --rest options. + For other available options, run the bin/graceful_stop.sh --help + command. + It is important to drain HBase regions slowly when restarting multiple + RegionServers. Otherwise, multiple regions go offline simultaneously and must be + reassigned to other nodes, which may also go offline soon. This can negatively affect + performance. You can inject delays into the script above, for instance, by adding a + shell command such as sleep. To wait for 5 minutes between each + RegionServer restart, modify the above script to the following: + $ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; sleep 5m; done &> /tmp/log.txt & + + Restart the Master again, to clear out the dead servers list and re-enable + the load balancer. + Run the hbck utility again, to be sure the cluster is + consistent. + + 
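The RegionServer step of the guidelines above could be written as a small script instead of a one-liner. This is a sketch under stated assumptions, not a tested tool: `restart_regionservers`, `run`, `REGIONSERVERS_FILE`, `PAUSE_SECONDS`, and `DRY_RUN` are hypothetical names introduced here, and only the `graceful_stop.sh` options already shown in this section are used. It pauses between RegionServers rather than after each one, and it stops at the first failed restart so that a problem on one node does not cascade across the cluster.

```shell
#!/bin/sh
# Illustrative sketch; none of these variable or function names come from HBase.
# Run it from the HBase home directory on the Master.
REGIONSERVERS_FILE="${REGIONSERVERS_FILE:-conf/regionservers}"
PAUSE_SECONDS="${PAUSE_SECONDS:-300}"   # 5 minutes between RegionServers
DRY_RUN="${DRY_RUN:-1}"                 # 1 = only print the commands

run() {
  # Print the command in dry-run mode, otherwise execute it.
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

restart_regionservers() {
  # Usage: restart_regionservers host1 host2 ...
  first=1
  for host in "$@"; do
    # Pause before every host except the first, so regions drain slowly.
    if [ "$first" = "0" ]; then
      run sleep "$PAUSE_SECONDS"
    fi
    first=0
    # Stop at the first failure instead of continuing with the next host.
    run ./bin/graceful_stop.sh --restart --reload --debug "$host" || return 1
  done
}

# Only act if the host list exists, so the sketch is safe to source elsewhere.
if [ -r "$REGIONSERVERS_FILE" ]; then
  restart_regionservers $(sort "$REGIONSERVERS_FILE")
fi
```

The hbck checks and the two Master restarts from the guidelines are left out of this sketch on purpose; they would run before and after the call to `restart_regionservers`.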