diff --git a/src/main/docbkx/ops_mgt.xml b/src/main/docbkx/ops_mgt.xml
index 841ad89342f..8c04c9a576d 100644
--- a/src/main/docbkx/ops_mgt.xml
+++ b/src/main/docbkx/ops_mgt.xml
@@ -793,48 +793,138 @@ false
Rolling Restart
- You can also ask this script to restart a RegionServer after the shutdown AND move its
- old regions back into place. The latter you might do to retain data locality. A primitive
- rolling restart might be effected by running something like the following:
- $ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt &
- Tail the output of /tmp/log.txt to follow the scripts progress.
- The above does RegionServers only. The script will also disable the load balancer before
- moving the regions. You'd need to do the master update separately. Do it before you run the
- above script. Here is a pseudo-script for how you might craft a rolling restart script:
-
-
- Untar your release, make sure of its configuration and then rsync it across the
- cluster. If this is 0.90.2, patch it with HBASE-3744 and HBASE-3756.
-
-
- Run hbck to ensure the cluster consistent
- $ ./bin/hbase hbck Effect repairs if inconsistent.
-
-
-
- Restart the Master:
- $ ./bin/hbase-daemon.sh stop master; ./bin/hbase-daemon.sh start master
-
-
-
- Run the graceful_stop.sh script per RegionServer. For
- example:
- $ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt &
-
- If you are running thrift or rest servers on the RegionServer, pass --thrift or
- --rest options (See usage for graceful_stop.sh script).
-
-
- Restart the Master again. This will clear out dead servers list and reenable the
- balancer.
-
-
- Run hbck to ensure the cluster is consistent.
-
-
- It is important to drain HBase regions slowly when restarting regionservers. Otherwise,
- multiple regions go offline simultaneously as they are re-assigned to other nodes. Depending
- on your usage patterns, this might not be desirable.
+
+ Some cluster configuration changes require either the entire cluster, or the
+ RegionServers, to be restarted in order to pick up the changes. In addition, rolling
+ restarts are supported for upgrading to a minor or maintenance release, and to a major
+ release if at all possible. See the release notes for the release you want to upgrade to
+ for any limitations on performing a rolling upgrade.
+ There are multiple ways to restart your cluster nodes, depending on your situation.
+ These methods are detailed below.
+
+ Using the rolling-restart.sh Script
+
+ HBase ships with a script, bin/rolling-restart.sh, that allows
+ you to perform rolling restarts on the entire cluster, the master only, or the
+ RegionServers only. The script is provided as a template for your own script, and is not
+ explicitly tested. It requires password-less SSH login to be configured and assumes that
+ you have deployed using a tarball. The script requires you to set some environment
+ variables before running it. Examine the script and modify it to suit your needs.
+
+ rolling-restart.sh General Usage
+
+$ ./bin/rolling-restart.sh --help
+Usage: rolling-restart.sh [--config <hbase-confdir>] [--rs-only] [--master-only] [--graceful] [--maxthreads xx]
+
+
+
+ Rolling Restart on RegionServers Only
+
+ To perform a rolling restart on the RegionServers only, use the --rs-only option.
+ This might be necessary if you need to reboot an individual RegionServer or if you
+ make a configuration change that only affects RegionServers and not the other HBase
+ processes.
+ If you need to restart only a single RegionServer, or if you need to do extra
+ actions during the restart, use the bin/graceful_stop.sh
+ command instead. See Manual Rolling Restart.
+
+
+
+ Rolling Restart on Masters Only
+
+ To perform a rolling restart on the active and backup Masters, use the
+ --master-only option. You might use this if you know that your
+ configuration change only affects the Master and not the RegionServers, or if you
+ need to restart the server where the active Master is running.
+ If you are not running backup Masters, the Master is simply restarted. If you
+ are running backup Masters, they are all stopped before any are restarted, to avoid
+ a race condition in ZooKeeper to determine which is the new Master. First the main
+ Master is restarted, then the backup Masters are restarted. Directly after restart,
+ it checks for and cleans out any regions in transition before taking on its normal
+ workload.
+
+
+
+ Graceful Restart
+
+ If you specify the --graceful option, RegionServers are restarted
+ using the bin/graceful_stop.sh script, which moves regions off
+ a RegionServer before restarting it. This is safer, but can delay the
+ restart.
+
+
+
+ Limiting the Number of Threads
+
+ To limit the rolling restart to using only a specific number of threads, use the
+ --maxthreads option.
+
+
+
+
+
+ Manual Rolling Restart
+ To retain more control over the process, you may wish to manually do a rolling restart
+ across your cluster. This uses the graceful_stop.sh command. In this method, you can restart each RegionServer
+ individually and then move its old regions back into place, retaining locality. If you
+ also need to restart the Master, you need to do it separately, and restart the Master
+ before restarting the RegionServers using this method. The following is an example of such
+ a command. You may need to tailor it to your environment. This script does a rolling
+ restart of RegionServers only. It disables the load balancer before moving the
+ regions.
+$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt &
+ Monitor the output of the /tmp/log.txt file to follow the
+ progress of the script.
+
+
+
+ Logic for Crafting Your Own Rolling Restart Script
+ Use the following guidelines if you want to create your own rolling restart script.
+
+
+ Extract the new release, verify its configuration, and synchronize it to all nodes
+ of your cluster using rsync, scp, or another
+ secure synchronization mechanism.
+ Use the hbck utility to ensure that the cluster is consistent.
+
+$ ./bin/hbase hbck
+
+ Perform repairs if required. See the hbck documentation for details.
+
+ Restart the Master first. You may need to modify these commands if your
+ new HBase directory is different from the old one, such as for an upgrade.
+
+$ ./bin/hbase-daemon.sh stop master; ./bin/hbase-daemon.sh start master
+
+
+ Gracefully restart each RegionServer, using a script such as the
+ following, from the Master.
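+$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt &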
+ If you are running Thrift or REST servers, pass the --thrift or --rest options.
+ For other available options, run the bin/graceful_stop.sh --help
+ command.
+ It is important to drain HBase regions slowly when restarting multiple
+ RegionServers. Otherwise, multiple regions go offline simultaneously and must be
+ reassigned to other nodes, which may also go offline soon. This can negatively affect
+ performance. You can inject delays into the script above, for instance, by adding a
+ Shell command such as sleep. To wait for 5 minutes between each
+ RegionServer restart, modify the above script to the following:
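+$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; sleep 300; done &> /tmp/log.txt &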
+
+ Restart the Master again, to clear out the dead servers list and re-enable
+ the load balancer.
+ Run the hbck utility again, to be sure the cluster is
+ consistent.
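+ The RegionServer step in the guidelines above could be wrapped in a small function, as in
+ the following sketch. It is an illustration only, not a script shipped with HBase: the
+ names HBASE_HOME, DELAY_SECONDS, DRY_RUN, run, and rolling_restart are hypothetical. By
+ default it only prints the commands it would run, so you can review them first.

```shell
#!/usr/bin/env bash
# Sketch: restart RegionServers one at a time with a pause between hosts.
# HBASE_HOME, DELAY_SECONDS, DRY_RUN, run, and rolling_restart are
# hypothetical names chosen for this example, not part of HBase.

HBASE_HOME="${HBASE_HOME:-.}"           # HBase installation directory
DELAY_SECONDS="${DELAY_SECONDS:-300}"   # pause between RegionServers (5 minutes)
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands instead of running them

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "DRY-RUN: $*"
  else
    "$@"
  fi
}

# Gracefully restart every host listed in the given file, in sorted order.
# Stops at the first host whose restart fails.
rolling_restart() {
  local hosts_file="$1"
  local host
  for host in $(sort "$hosts_file"); do
    run "$HBASE_HOME/bin/graceful_stop.sh" --restart --reload --debug "$host" || return 1
    run sleep "$DELAY_SECONDS"
  done
}
```

+ To run it for real, set DRY_RUN=0 and call rolling_restart conf/regionservers from the
+ HBase installation directory, redirecting the output to a log file as in the earlier examples.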
+
+