From a82042205a0604004fad7fa5cb9eecf9b43426af Mon Sep 17 00:00:00 2001
From: Michael Stack <stack@apache.org>
Date: Thu, 10 Mar 2011 20:04:27 +0000
Subject: [PATCH] Added section on keeping row and column names small to schema
 section

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1080332 13f79535-47bb-0310-9956-ffa450edef68
---
 src/docbkx/book.xml | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
diff --git a/src/docbkx/book.xml b/src/docbkx/book.xml
index 9c4a9038f14..ada9ae638b6 100644
--- a/src/docbkx/book.xml
+++ b/src/docbkx/book.xml
@@ -1384,6 +1384,25 @@ of all regions.
   successful example.  It has a page describing the schema it uses in
   HBase.  You might also consider just using OpenTSDB altogether.</para>
   </section>
+  <section xml:id="keysize">
+      <title>Try to minimize row and column sizes</title>
+      <para>In HBase, values are always freighted with their coordinates; as a
+          cell value passes through the system, it'll be accompanied by its
+          row, column name, and timestamp.  Always.  If your rows and column names
+          are large, especially compared to the size of the cell value, then
+          you may run up against some interesting scenarios.  One such is
+          the case described by Marc Limotte at the tail of
+          <link xlink:href="https://issues.apache.org/jira/browse/HBASE-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=13005272#comment-13005272">HBASE-3551</link>
+          (recommended!).
+          Therein, the indices that are kept on HBase storefiles (<link linkend="hfile">HFile</link>s)
+                  to facilitate random access may end up occupying large chunks of the RAM
+                  allotted to HBase because the cell value coordinates are large.
+                  Marc, in the above cited comment, suggests either upping the block size so
+                  entries in the store file index happen at a larger interval, or
+                  modifying the table schema so it makes for smaller rows and column
+                  names.
+      </para>
+  </section>
   </chapter>
 
   <chapter xml:id="hbase_metrics">