diff --git a/src/main/docbkx/configuration.xml b/src/main/docbkx/configuration.xml
index 2fbc3e83b07..5e6bdbb88b3 100644
--- a/src/main/docbkx/configuration.xml
+++ b/src/main/docbkx/configuration.xml
@@ -1194,7 +1194,7 @@ index e70ebc6..96f8c27 100644
xml:id="recommended_configurations.zk">
ZooKeeper Configuration
+ xml:id="sect.zookeeper.session.timeout">
zookeeper.session.timeout: The default timeout is three minutes (specified in milliseconds). This means that if
a server crashes, it will be three minutes before the Master notices the crash and
@@ -1295,41 +1295,52 @@ index e70ebc6..96f8c27 100644
Managed Splitting
- Rather than let HBase auto-split your Regions, manage the splitting manually
- What follows is taken from the javadoc at the head of the
- org.apache.hadoop.hbase.util.RegionSplitter tool added to
- HBase post-0.90.0 release.
- . With growing amounts of data, splits will continually be needed. Since you
- always know exactly what regions you have, long-term debugging and profiling is much
- easier with manual splits. It is hard to trace the logs to understand region level
- problems if it keeps splitting and getting renamed. Data offlining bugs + unknown number
- of split regions == oh crap! If an HLog or
- StoreFile was mistakenly unprocessed by HBase due to a weird bug
- and you notice it a day or so later, you can be assured that the regions specified in
- these files are the same as the current regions and you have less headaches trying to
- restore/replay your data. You can finely tune your compaction algorithm. With roughly
- uniform data growth, it's easy to cause split / compaction storms as the regions all
- roughly hit the same data size at the same time. With manual splits, you can let
- staggered, time-based major compactions spread out your network IO load.
- How do I turn off automatic splitting? Automatic splitting is determined by the
- configuration value hbase.hregion.max.filesize. It is not recommended that
- you set this to Long.MAX_VALUE in case you forget about manual splits.
- A suggested setting is 100GB, which would result in > 1hr major compactions if reached.
- What's the optimal number of pre-split regions to create? Mileage will vary depending
- upon your application. You could start low with 10 pre-split regions / server and watch as
- data grows over time. It's better to err on the side of too little regions and rolling
- split later. A more complicated answer is that this depends upon the largest storefile in
- your region. With a growing data size, this will get larger over time. You want the
- largest region to be just big enough that the Store compact
- selection algorithm only compacts it due to a timed major. If you don't, your cluster can
- be prone to compaction storms as the algorithm decides to run major compactions on a large
- series of regions all at once. Note that compaction storms are due to the uniform data
- growth, not the manual split decision.
- If you pre-split your regions too thin, you can increase the major compaction
- interval by configuring HConstants.MAJOR_COMPACTION_PERIOD. If your
- data size grows too large, use the (post-0.90.0 HBase)
- org.apache.hadoop.hbase.util.RegionSplitter script to perform a
- network IO safe rolling split of all regions.
+ HBase generally handles splitting your regions, based upon the settings in your
+ hbase-default.xml and hbase-site.xml
+ configuration files. Important settings include
+ hbase.regionserver.region.split.policy,
+ hbase.hregion.max.filesize,
+ and hbase.regionserver.regionSplitLimit. A simplistic view of splitting
+ is that when a region grows to hbase.hregion.max.filesize, it is split.
+ For most usage patterns, you should use automatic splitting.
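As a sketch, the settings mentioned above live in hbase-site.xml; the values shown here are illustrative, not recommendations:

```xml
<!-- hbase-site.xml (sketch): split-related settings discussed above.
     Values are illustrative, not recommendations. -->
<property>
  <name>hbase.regionserver.region.split.policy</name>
  <value>org.apache.hadoop.hbase.regionserver.IncreasingToUpperBoundRegionSplitPolicy</value>
</property>
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>10737418240</value> <!-- 10 GB: split a region once it grows past this -->
</property>
```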
+ Instead of allowing HBase to split your regions automatically, you can choose to
+ manage the splitting yourself. This feature was added in HBase 0.90.0. Manually managing
+ splits works if you know your keyspace well; otherwise, let HBase figure out where to split for you.
+ Manual splitting can mitigate region creation and movement under load. It also ensures that
+ region boundaries are known and invariant (if you disable region splitting). If you use manual
+ splits, it is easier to do staggered, time-based major compactions that spread out your network IO
+ load.
+
+
+ Disable Automatic Splitting
+ To disable automatic splitting, set hbase.hregion.max.filesize to
+ a very large value, such as 100 GB. It is not recommended to set it to
+ its absolute maximum value of Long.MAX_VALUE.
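As an hbase-site.xml fragment, the suggestion above might look like this (100 GB expressed in bytes):

```xml
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>107374182400</value> <!-- 100 GB: effectively disables automatic splitting -->
</property>
```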
+
+
+ Automatic Splitting Is Recommended
+ If you disable automatic splits to diagnose a problem or during a period of fast
+ data growth, it is recommended to re-enable them when your situation becomes more
+ stable. The potential benefits of managing region splits yourself are not
+ undisputed.
+
+
+ Determine the Optimal Number of Pre-Split Regions
+ The optimal number of pre-split regions depends on your application and environment.
+ A good rule of thumb is to start with 10 pre-split regions per server and watch as data
+ grows over time. It is better to err on the side of too few regions and perform rolling
+ splits later. The optimal number of regions depends upon the largest StoreFile in your
+ region. The size of the largest StoreFile will increase with time if the amount of data
+ grows. The goal is for the largest region to be just large enough that the compaction
+ selection algorithm only compacts it during a timed major compaction. Otherwise, the
+ cluster can be prone to compaction storms, where a large number of regions are under
+ compaction at the same time. It is important to understand that the data growth causes
+ compaction storms, not the manual split decision.
+
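One way to pre-split is at table creation time in the HBase shell; the table name, column family, and boundary keys below are hypothetical:

```shell
# HBase shell: create a table pre-split at three hypothetical row-key
# boundaries, yielding four initial regions
hbase> create 'mytable', 'cf', SPLITS => ['g', 'm', 't']
```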
+ If you pre-split your regions too thin, you can increase the major
+ compaction interval by configuring HConstants.MAJOR_COMPACTION_PERIOD.
+ HBase 0.90 introduced org.apache.hadoop.hbase.util.RegionSplitter,
+ which provides a network-IO-safe rolling split of all regions.
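A sketch of invoking RegionSplitter for a rolling split; the table name is hypothetical, and the -r (rolling split) and -o (max outstanding splits) flags follow the tool's usage text, so verify them against your HBase version:

```shell
# Rolling split of every region of 'mytable' (hypothetical) using the
# HexStringSplit algorithm; -o 2 caps outstanding splits for network IO safety
hbase org.apache.hadoop.hbase.util.RegionSplitter mytable HexStringSplit -r -o 2
```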
@@ -1356,62 +1367,44 @@ index e70ebc6..96f8c27 100644
mapreduce.reduce.speculative to false.
-
-
- Other Configurations
-
- Balancer
- The balancer is a periodic operation which is run on the master to redistribute
- regions on the cluster. It is configured via hbase.balancer.period and
- defaults to 300000 (5 minutes).
- See for more information on the LoadBalancer.
-
-
-
- Disabling Blockcache
- Do not turn off block cache (You'd do it by setting
- hbase.block.cache.size to zero). Currently we do not do well if you
- do this because the regionserver will spend all its time loading hfile indices over and
- over again. If your working set it such that block cache does you no good, at least size
- the block cache such that hfile indices will stay up in the cache (you can get a rough
- idea on the size you need by surveying regionserver UIs; you'll see index block size
- accounted near the top of the webpage).
-
-
- Nagle's or the small
- package problem
- If a big 40ms or so occasional delay is seen in operations against HBase, try the
- Nagles' setting. For example, see the user mailing list thread, Inconsistent
- scan performance with caching set to 1 and the issue cited therein where setting
- notcpdelay improved scan speeds. You might also see the graphs on the tail of HBASE-7008 Set scanner
- caching to a better default where our Lars Hofhansl tries various data sizes w/
- Nagle's on and off measuring the effect.
-
-
- Better Mean Time to Recover (MTTR)
- This section is about configurations that will make servers come back faster after a
- fail. See the Deveraj Das an Nicolas Liochon blog post Introduction
- to HBase Mean Time to Recover (MTTR) for a brief introduction.
- The issue HBASE-8354 forces Namenode
- into loop with lease recovery requests is messy but has a bunch of good
- discussion toward the end on low timeouts and how to effect faster recovery including
- citation of fixes added to HDFS. Read the Varun Sharma comments. The below suggested
- configurations are Varun's suggestions distilled and tested. Make sure you are running on
- a late-version HDFS so you have the fixes he refers too and himself adds to HDFS that help
- HBase MTTR (e.g. HDFS-3703, HDFS-3712, and HDFS-4791 -- hadoop 2 for sure has them and
- late hadoop 1 has some). Set the following in the RegionServer.
- Other Configurations
+ Balancer
+ The balancer is a periodic operation which is run on the master to redistribute regions on the cluster. It is configured via
+ hbase.balancer.period and defaults to 300000 (5 minutes).
+ See for more information on the LoadBalancer.
+
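As a sketch, the balancer period is set in hbase-site.xml:

```xml
<property>
  <name>hbase.balancer.period</name>
  <value>300000</value> <!-- run the balancer every 300000 ms (5 minutes, the default) -->
</property>
```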
+
+ Disabling Blockcache
+ Do not turn off the block cache (you would do so by setting hbase.block.cache.size to zero).
+ Currently, this does not work well, because the RegionServer will spend all its time loading HFile
+ indices over and over again. In fact, in later versions of HBase, it is not possible to disable the
+ block cache completely.
+ HBase will cache meta blocks -- the INDEX and BLOOM blocks -- even if the block cache
+ is disabled.
+
+
+ Nagle's or the small package problem
+ If an occasional big delay of around 40 ms is seen in operations against HBase,
+ try the Nagle's setting. For example, see the user mailing list thread,
+ Inconsistent scan performance with caching set to 1
+ and the issue cited therein where setting notcpdelay improved scan speeds. You might also
+ see the graphs on the tail of HBASE-7008 Set scanner caching to a better default
+ where our Lars Hofhansl tries various data sizes with Nagle's on and off, measuring the effect.
+
+
+ Better Mean Time to Recover (MTTR)
+ This section is about configurations that will make servers come back faster after a failure.
+ See the Devaraj Das and Nicolas Liochon blog post
+ Introduction to HBase Mean Time to Recover (MTTR)
+ for a brief introduction.
+ The issue HBASE-8354 forces Namenode into loop with lease recovery requests
+ is messy but has a bunch of good discussion toward the end on low timeouts and how to effect faster recovery including citation of fixes
+ added to HDFS. Read Varun Sharma's comments. The suggested configurations below are Varun's suggestions, distilled and tested. Make sure you are
+ running on a late-version HDFS so that you have the fixes he refers to, as well as those he himself added to HDFS, to help HBase MTTR
+ (e.g. HDFS-3703, HDFS-3712, and HDFS-4791; Hadoop 2 has them for sure, and late Hadoop 1 has some).
+ Set the following in the RegionServer.
+
+
hbase.lease.recovery.dfs.timeout: 23000
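Expressed as an hbase-site.xml fragment (only the property name and value come from the text above; the comment is a paraphrase):

```xml
<property>
  <name>hbase.lease.recovery.dfs.timeout</name>
  <value>23000</value> <!-- ms to wait on a DFS lease recovery before retrying -->
</property>
```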