diff --git a/src/main/docbkx/book.xml b/src/main/docbkx/book.xml
index 4c06dc636c2..7f8f0a11f15 100644
--- a/src/main/docbkx/book.xml
+++ b/src/main/docbkx/book.xml
@@ -1948,42 +1948,43 @@ rs.close();
 LruBlockCache, BucketCache, and SlabCache, which are both (usually) offheap.
 This section discusses benefits and drawbacks of each implementation, how to
 choose the appropriate option, and configuration options for each.
+
+ Block Cache Reporting: UI
+ See the RegionServer UI for detail on the caching deploy. Since HBase 1.0, the
+ Block Cache detail has been significantly extended, showing configuration,
+ sizing, current usage, and even detail on block counts and types.
+
+
+
 Cache Choices
 LruBlockCache is the original implementation, and is
- entirely within the Java heap. SlabCache and
- BucketCache are mainly intended for keeping blockcache
- data offheap, although BucketCache can also keep data onheap and in files.
- SlabCache is deprecated and will be removed in 1.0!
- BucketCache has seen more production deploys and has more deploy options. Fetching
- will always be slower when fetching from BucketCache or SlabCache, as compared with the
- native onheap LruBlockCache. However, latencies tend to be less erratic over time,
- because there is less garbage collection.
- Anecdotal evidence indicates that BucketCache requires less garbage collection than
- SlabCache so should be even less erratic (than SlabCache or LruBlockCache).
- SlabCache tends to do more garbage collections, because blocks are always moved
- between L1 and L2, at least given the way DoubleBlockCache
- currently works. When you enable SlabCache, you are enabling a two tier caching
+ entirely within the Java heap. BucketCache is mainly
+ intended for keeping blockcache data offheap, although BucketCache can also
+ keep data onheap and serve from a file-backed cache. There is also an older
+ offheap BlockCache, called SlabCache, which has since been deprecated and
+ removed as of HBase 1.0.
+
+
+ Fetching from BucketCache will always be slower,
+ as compared with the native onheap LruBlockCache. However, latencies tend to be
+ less erratic over time, because there is less garbage collection. This is why
+ you'd use BucketCache: your latencies stay less erratic, and GC pauses and heap
+ fragmentation are mitigated. See Nick Dimiduk's BlockCache 101 for
+ comparisons of onheap versus offheap cache performance.
+
+
+ When you enable BucketCache, you are enabling a two-tier caching
 system, an L1 cache which is implemented by an instance of LruBlockCache and
- an offheap L2 cache which is implemented by SlabCache. Management of these
- two tiers and how blocks move between them is done by DoubleBlockCache
- when you are using SlabCache. DoubleBlockCache works by caching all blocks in L1
- AND L2. When blocks are evicted from L1, they are moved to L2. See
- for more detail on how DoubleBlockCache works.
-
- The hosting class for BucketCache is CombinedBlockCache.
- It keeps all DATA blocks in the BucketCache and meta blocks -- INDEX and BLOOM blocks --
+ an offheap L2 cache which is implemented by BucketCache. Management of these
+ two tiers and the policy that dictates how blocks move between them is done by
+ CombinedBlockCache. It keeps all DATA blocks in the L2
+ BucketCache and meta blocks -- INDEX and BLOOM blocks --
 onheap in the L1 LruBlockCache.
-
- Because the hosting class for each implementation
- (DoubleBlockCache vs CombinedBlockCache)
- works so differently, it is difficult to do a fair comparison between BucketCache and SlabCache.
- See Nick Dimiduk's BlockCache 101 for some
- numbers.
- For more information about the off heap cache options, see .
+ See for more detail on going offheap.
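+ As a quick orientation, here is a minimal sketch of what the default two-tier
+ deploy looks like in hbase-site.xml (the size value below is illustrative only;
+ the full walk-through is in the How to Enable BucketCache section below):
+ <property>
+   <name>hbase.bucketcache.ioengine</name>
+   <value>offheap</value>
+ </property>
+ <property>
+   <!-- Illustrative sizing; a value greater than 1.0 is read as megabytes -->
+   <name>hbase.bucketcache.size</name>
+   <value>4096</value>
+ </property>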
+
General Cache Configurations Apart from the cache implementaiton itself, you can set some general @@ -1993,6 +1994,7 @@ rs.close(); After setting any of these options, restart or rolling restart your cluster for the configuration to take effect. Check logs for errors or unexpected behavior.
+
LruBlockCache Design @@ -2136,7 +2138,7 @@ rs.close(); xml:id="offheap.blockcache"> Offheap Block Cache
- Enable SlabCache
+ How to Enable SlabCache
 SlabCache is deprecated and will be removed in 1.0!
 SlabCache was originally described in Caching
 @@ -2160,29 +2162,39 @@ rs.close();
 Check logs for errors or unexpected behavior.
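+ For the record only -- SlabCache is deprecated -- enabling it amounted to setting a
+ nonzero hbase.offheapcache.percentage plus a direct memory allowance in
+ conf/hbase-env.sh. A sketch; the 0.8 figure is illustrative, and the exact sizing
+ semantics should be checked against your version's hbase-default.xml:
+ <property>
+   <!-- Nonzero enables the deprecated SlabCache; 0, the default, disables it -->
+   <name>hbase.offheapcache.percentage</name>
+   <value>0.8</value>
+ </property>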
- Enable BucketCache
- The usual deploy of BucketCache is via a
- managing class that sets up two caching tiers: an L1 onheap cache
- implemented by LruBlockCache and a second L2 cache implemented
- with BucketCache. The managing class is CombinedBlockCache by default. The just-previous link describes the mechanism of CombinedBlockCache. In short, it works
+ How to Enable BucketCache
+ The usual deploy of BucketCache is via a managing class that sets up two caching tiers: an L1 onheap cache
+ implemented by LruBlockCache and a second L2 cache implemented with BucketCache. The managing class is CombinedBlockCache by default.
+ The just-previous link describes the caching 'policy' implemented by CombinedBlockCache. In short, it works
 by keeping meta blocks -- INDEX and BLOOM in the L1, onheap LruBlockCache tier -- and DATA
 blocks are kept in the L2, BucketCache tier. It is possible to amend this behavior in
 HBase since version 1.0 and ask that a column family have both its meta and DATA blocks hosted onheap in the L1 tier by
 setting cacheDataInL1 via HColumnDescriptor.setCacheDataInL1(true) or in the shell, creating or amending
 column families setting CACHE_DATA_IN_L1 to true: e.g. hbase(main):003:0> create 't', {NAME => 't', CONFIGURATION => {CACHE_DATA_IN_L1 => 'true'}}
- The BucketCache deploy can be onheap, offheap, or file based. You set which via the
- hbase.bucketcache.ioengine setting it to
- heap for BucketCache running as part of the java heap,
- offheap for BucketCache to make allocations offheap,
- and file:PATH_TO_FILE for BucketCache to use a file
- (Useful in particular if you have some fast i/o attached to the box such
+
+ The BucketCache Block Cache can be deployed onheap, offheap, or file-based.
+ You set which via the
+ hbase.bucketcache.ioengine setting. Setting it to
+ heap will have BucketCache deployed inside the
+ allocated java heap. Setting it to offheap will have
+ BucketCache make its allocations offheap,
+ and an ioengine setting of file:PATH_TO_FILE will direct
+ BucketCache to use file-backed caching (useful in particular if you have some fast i/o attached to the box such
 as SSDs).
- To disable CombinedBlockCache, and use the BucketCache as a strict L2 cache to the L1
- LruBlockCache, set CacheConfig.BUCKET_CACHE_COMBINED_KEY to
- false. In this mode, on eviction from L1, blocks go to L2.
+ It is possible to deploy an L1+L2 setup where we bypass the CombinedBlockCache
+ policy and have BucketCache working as a strict L2 cache to the L1
+ LruBlockCache. For such a setup, set CacheConfig.BUCKET_CACHE_COMBINED_KEY to
+ false. In this mode, on eviction from L1, blocks go to L2.
+ When a block is cached, it is cached first in L1. When we go to look for a cached block,
+ we look first in L1 and, if none is found, we then search L2. Let us call this deploy format
+ 'Raw L1+L2'.
+ Other BucketCache configs include specifying a location to persist the cache to across
+ restarts, how many threads to use when writing the cache, etc. See the
+ CacheConfig.html
+ class for configuration options and descriptions.
 BucketCache Example Configuration
 @@ -2230,6 +2242,27 @@ rs.close();
 In other words, you configure the L1 LruBlockCache as you normally would, just as when there is no L2 BucketCache present.
+
+ Direct Memory Usage In HBase
+ The default maximum direct memory varies by JVM.
Traditionally it is 64M,
+ or bears some relation to the allocated heap size (-Xmx), or has no limit at all (JDK7, apparently).
+ HBase servers use direct memory; in particular, with short-circuit reading enabled, the hosted DFSClient will
+ allocate direct memory buffers. If you do offheap block caching, you will
+ be making use of direct memory. When starting your JVM, make sure
+ the -XX:MaxDirectMemorySize setting in
+ conf/hbase-env.sh is set to some value that is
+ higher than what you have allocated to your offheap blockcache
+ (hbase.bucketcache.size). It should be larger than your offheap block
+ cache, with some more again for DFSClient usage (how much the DFSClient uses is not
+ easy to quantify; it is the number of open hfiles * hbase.dfs.client.read.shortcircuit.buffer.size,
+ where hbase.dfs.client.read.shortcircuit.buffer.size is set to 128k in HBase -- see the hbase-default.xml
+ default configurations).
+
+ You can see how much memory -- onheap and offheap/direct -- a RegionServer is configured to use,
+ and how much it is using at any one time, by looking at the
+ Server Metrics: Memory tab in the UI.
+
+
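+ To make the direct memory arithmetic concrete, a sketch (all figures illustrative,
+ not recommendations): a 4G offheap BucketCache (hbase.bucketcache.size of 4096)
+ plus, say, 8000 open hfiles * 128k of DFSClient buffer -- roughly another 1G --
+ fits comfortably under a 5G ceiling. In conf/hbase-env.sh that could read:
+ # Direct memory ceiling must exceed hbase.bucketcache.size plus DFSClient buffers:
+ # 4G offheap block cache + ~1G short-circuit read buffers => 5G
+ export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:MaxDirectMemorySize=5g"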