Added section on keeping row and column names small to schema section

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1080332 13f79535-47bb-0310-9956-ffa450edef68
Michael Stack 2011-03-10 20:04:27 +00:00
parent 3d4a190562
commit a82042205a
1 changed files with 19 additions and 0 deletions

@@ -1384,6 +1384,25 @@ of all regions.
successful example. It has a page describing the schema it uses in
HBase. You might also consider just using OpenTSDB altogether.</para>
</section>
<section xml:id="keysize">
<title>Try to minimize row and column sizes</title>
<para>In HBase, values are always freighted with their coordinates; as a
cell value passes through the system, it'll be accompanied by its
row, column name, and timestamp. Always. If your rows and column names
are large, especially compared to the size of the cell value, then
you may run up against some interesting scenarios. One such is
the case described by Marc Limotte at the tail of
<link xlink:url="https://issues.apache.org/jira/browse/HBASE-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=13005272#comment-13005272">HBASE-3551</link>
(recommended!).
Therein, the indices that are kept on HBase storefiles (<link linkend="hfile">HFile</link>s)
to facilitate random access may end up occupying large chunks of the HBase
allotted RAM because the cell value coordinates are large.
Marc, in the above-cited comment, suggests either upping the block size so
entries in the store file index happen at a larger interval, or
modifying the table schema so it makes for smaller rows and column
names.
</para>
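<para>To get a rough feel for how much of each cell is coordinates rather
than value, the following minimal sketch adds up the bytes of a single
serialized cell. It assumes the classic KeyValue layout (key length, value
length, row length, row, family length, family, qualifier, timestamp, key
type), which gives 20 fixed bytes per cell. The class
<code>CellOverhead</code>, its methods, and the sample row, family and
qualifier names are hypothetical and are not part of the HBase API.</para>
<programlisting>
```java
import java.nio.charset.StandardCharsets;

public class CellOverhead {
    // Fixed bytes per cell, assuming the classic KeyValue layout:
    // 4 (key length) + 4 (value length) + 2 (row length)
    // + 1 (family length) + 8 (timestamp) + 1 (key type) = 20
    static final int FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1;

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Bytes taken by the coordinates, i.e. everything except the value. */
    static int coordinateBytes(String row, String family, String qualifier) {
        return FIXED_OVERHEAD + utf8Length(row) + utf8Length(family)
                + utf8Length(qualifier);
    }

    /** Estimated serialized size of one cell, coordinates plus value. */
    static int cellBytes(String row, String family, String qualifier,
                         int valueLength) {
        return coordinateBytes(row, family, qualifier) + valueLength;
    }

    public static void main(String[] args) {
        int valueLength = 8; // for example, a single long counter
        String row = "com.example.www/index.html"; // hypothetical row key

        // Same row and value, long versus short family and qualifier names.
        int verbose = cellBytes(row, "statistics", "page_view_count", valueLength);
        int compact = cellBytes(row, "s", "pvc", valueLength);

        System.out.println("verbose names: " + verbose + " bytes per cell");
        System.out.println("compact names: " + compact + " bytes per cell");
    }
}
```
</programlisting>
<para>Under these assumptions the 8-byte value accounts for only a small
fraction of the cell in both cases, and shortening the family and qualifier
names alone trims the per-cell size noticeably. The same coordinate bytes
are what land in the store file index for each indexed entry.</para>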
</section>
</chapter>
<chapter xml:id="hbase_metrics">