Add doc on direct memory, the block cache UI additions, list block cache options, downplay slab cache even more

stack 2014-07-18 16:47:31 -07:00
parent 5f4e85d3f9
commit 19979d770d
1 changed file with 80 additions and 47 deletions


@@ -1948,42 +1948,43 @@ rs.close();
LruBlockCache, BucketCache, and SlabCache; the latter two are (usually) offheap. This section
discusses benefits and drawbacks of each implementation, how to choose the appropriate
option, and configuration options for each.</para>
<note><title>Block Cache Reporting: UI</title>
<para>See the RegionServer UI for detail on caching deploy. Since HBase 1.0, the
Block Cache detail has been significantly extended showing configurations,
sizings, current usage, and even detail on block counts and types.</para>
</note>
<section>
<title>Cache Choices</title>
<para><classname>LruBlockCache</classname> is the original implementation, and is
entirely within the Java heap. <classname>BucketCache</classname> is mainly
intended for keeping blockcache data offheap, although BucketCache can also
keep data onheap and serve from a file-backed cache. There is also an older
offheap BlockCache, SlabCache, which has since been deprecated and is
removed as of HBase 1.0.
</para>
<para>Fetching from BucketCache will always be slower than fetching from
the native onheap LruBlockCache. However, latencies tend to be
less erratic over time, because there is less garbage collection. This is why
you would use BucketCache: to make latencies less erratic and to mitigate GC pauses
and heap fragmentation. See Nick Dimiduk's <link
xlink:href="http://www.n10k.com/blog/blockcache-101/">BlockCache 101</link> for
comparisons running onheap vs offheap tests.
</para>
<para>When you enable BucketCache, you are enabling a two tier caching
system, an L1 cache which is implemented by an instance of LruBlockCache and
an offheap L2 cache which is implemented by BucketCache. Management of these
two tiers and the policy that dictates how blocks move between them is done by
<classname>CombinedBlockCache</classname>. It keeps all DATA blocks in the L2
BucketCache and meta blocks -- INDEX and BLOOM blocks --
onheap in the L1 <classname>LruBlockCache</classname>.
See <xref linkend="offheap.blockcache" /> for more detail on going offheap.</para>
</section>
<section xml:id="cache.configurations">
<title>General Cache Configurations</title>
<para>Apart from the cache implementation itself, you can set some general
@@ -1993,6 +1994,7 @@ rs.close();
After setting any of these options, restart or rolling restart your cluster for the
configuration to take effect. Check logs for errors or unexpected behavior.</para>
</section>
<section
xml:id="block.cache.design">
<title>LruBlockCache Design</title>
@@ -2136,7 +2138,7 @@ rs.close();
xml:id="offheap.blockcache">
<title>Offheap Block Cache</title>
<section xml:id="offheap.blockcache.slabcache">
<title>How to Enable SlabCache</title>
<para><emphasis>SlabCache is deprecated and will be removed in 1.0!</emphasis></para>
<para> SlabCache is originally described in <link
xlink:href="http://blog.cloudera.com/blog/2012/01/caching-in-hbase-slabcache/">Caching
@@ -2160,29 +2162,39 @@ rs.close();
Check logs for errors or unexpected behavior.</para>
</section>
<section xml:id="enable.bucketcache">
<title>How to Enable BucketCache</title>
<para>The usual deploy of BucketCache is via a managing class that sets up two caching tiers: an L1 onheap cache
implemented by LruBlockCache and a second L2 cache implemented with BucketCache. The managing class is <link
xlink:href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/hfile/CombinedBlockCache.html">CombinedBlockCache</link> by default.
The just-previous link describes the caching 'policy' implemented by CombinedBlockCache. In short, it works
by keeping meta blocks -- INDEX and BLOOM -- in the L1, onheap LruBlockCache tier, while DATA
blocks are kept in the L2, BucketCache tier. It is possible to amend this behavior in
HBase since version 1.0 and ask that a column family have both its meta and DATA blocks hosted onheap in the L1 tier by
setting <varname>cacheDataInL1</varname> via <programlisting>HColumnDescriptor.setCacheDataInL1(true)</programlisting>
or in the shell, creating or amending column families setting <varname>CACHE_DATA_IN_L1</varname>
to true: e.g. <programlisting>hbase(main):003:0> create 't', {NAME => 't', CONFIGURATION => {CACHE_DATA_IN_L1 => 'true'}}</programlisting></para>
<para>The BucketCache Block Cache can be deployed onheap, offheap, or file based.
You set which via the
<varname>hbase.bucketcache.ioengine</varname> setting. Setting it to
<varname>heap</varname> will have BucketCache deployed inside the
allocated java heap. Setting it to <varname>offheap</varname> will have
BucketCache make its allocations offheap,
and an ioengine setting of <varname>file:PATH_TO_FILE</varname> will direct
BucketCache to use file caching (useful in particular if you have some fast I/O attached to the box, such
as SSDs).
</para>
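<para>For example, here is a minimal <filename>hbase-site.xml</filename> sketch that
deploys BucketCache offheap. The size shown is illustrative only;
<varname>hbase.bucketcache.size</varname> is read as megabytes when greater than 1.0
(check <filename>hbase-default.xml</filename> for the semantics in your version):
<programlisting><![CDATA[<!-- Illustrative only: a 4G offheap BucketCache -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>4096</value>
</property>]]></programlisting></para>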
<para>To disable CombinedBlockCache, and use the BucketCache as a strict L2 cache to the L1
LruBlockCache, set <varname>CacheConfig.BUCKET_CACHE_COMBINED_KEY</varname> to
<literal>false</literal>. In this mode, on eviction from L1, blocks go to L2.</para>
<para xml:id="raw.l1.l2">It is possible to deploy an L1+L2 setup where we bypass the CombinedBlockCache
policy and have BucketCache working as a strict L2 cache to the L1
LruBlockCache. For such a setup, set <varname>CacheConfig.BUCKET_CACHE_COMBINED_KEY</varname> to
<literal>false</literal>. In this mode, on eviction from L1, blocks go to L2.
When a block is cached, it is cached first in L1. When we go to look for a cached block,
we look first in L1 and, if none is found, then search L2. Let us call this deploy format
<emphasis><indexterm><primary>Raw L1+L2</primary></indexterm></emphasis>.</para>
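<para>A sketch of such a Raw L1+L2 deploy in <filename>hbase-site.xml</filename> might
look like the following. The property name given here is the value behind
<varname>CacheConfig.BUCKET_CACHE_COMBINED_KEY</varname> in current sources, so verify it
against the CacheConfig source for your version:
<programlisting><![CDATA[<!-- Bypass the CombinedBlockCache policy; run BucketCache as a strict L2 to the L1 LruBlockCache -->
<property>
  <name>hbase.bucketcache.combinedcache.enabled</name>
  <value>false</value>
</property>]]></programlisting></para>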
<para>Other BucketCache configs include: specifying a location to persist cache to across
restarts, how many threads to use writing the cache, etc. See the
<link xlink:href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html">CacheConfig.html</link>
class for configuration options and descriptions.</para>
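<para>For illustration only -- the property names below are as found in CacheConfig
sources of this era, so confirm them against the javadoc linked above -- such settings
look like:
<programlisting><![CDATA[<!-- Persist the cache across restarts (illustrative path) -->
<property>
  <name>hbase.bucketcache.persistent.path</name>
  <value>/disk1/hbase/bucketcache.persist</value>
</property>
<!-- Number of threads writing blocks into the BucketCache -->
<property>
  <name>hbase.bucketcache.writer.threads</name>
  <value>3</value>
</property>]]></programlisting></para>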
<procedure>
<title>BucketCache Example Configuration</title>
@@ -2230,6 +2242,27 @@ rs.close();
In other words, you configure the L1 LruBlockCache as you would normally,
just as when there is no L2 BucketCache present.
</para>
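<para>For example, the L1 LruBlockCache is sized with the usual
<varname>hfile.block.cache.size</varname> fraction-of-heap setting (the value below is
illustrative only):
<programlisting><![CDATA[<!-- Give the onheap L1 LruBlockCache 20% of the java heap -->
<property>
  <name>hfile.block.cache.size</name>
  <value>0.2</value>
</property>]]></programlisting></para>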
<note xml:id="direct.memory">
<title>Direct Memory Usage In HBase</title>
<para>The default maximum direct memory varies by JVM. Traditionally it is 64M,
some relation to the allocated heap size (-Xmx), or no limit at all (JDK7, apparently).
HBase servers use direct memory; in particular, with short-circuit reading enabled, the
hosted DFSClient will allocate direct memory buffers. If you do offheap block caching,
you will be making use of direct memory too. When starting your JVM, make sure
the <varname>-XX:MaxDirectMemorySize</varname> setting in
<filename>conf/hbase-env.sh</filename> is set to some value that is
higher than what you have allocated to your offheap blockcache
(<varname>hbase.bucketcache.size</varname>). It should be larger than your offheap block
cache and then some for DFSClient usage (how much the DFSClient uses is not
easy to quantify; roughly, it is the number of open hfiles * <varname>hbase.dfs.client.read.shortcircuit.buffer.size</varname>,
where <varname>hbase.dfs.client.read.shortcircuit.buffer.size</varname> is set to 128k in HBase -- see the <filename>hbase-default.xml</filename>
default configuration).
</para>
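<para>For example, a sketch for <filename>conf/hbase-env.sh</filename>, assuming a 4G
offheap BucketCache plus headroom for DFSClient buffers (the values are illustrative only):
<programlisting># Illustrative only: 4G for the offheap BucketCache plus ~1G of headroom
# for DFSClient short-circuit read buffers and other direct allocations.
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:MaxDirectMemorySize=5g"</programlisting></para>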
<para>You can see how much memory -- onheap and offheap/direct -- a RegionServer is configured to use
and how much it is using at any one time by looking at the
<emphasis>Server Metrics: Memory</emphasis> tab in the UI.
</para>
</note>
</section>
</section>
</section>