HBASE-4731 book.xml, schema design - rowkey numeric example

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1196801 13f79535-47bb-0310-9956-ffa450edef68
Doug Meil 2011-11-02 21:09:47 +00:00
parent f0444014b8
commit b08f746cd2
1 changed files with 9 additions and 1 deletions


@@ -614,7 +614,8 @@ admin.enableTable(table);
</para>
<para>Most of the time small inefficiencies don't matter all that much. Unfortunately,
this is a case where they do. Whatever patterns are selected for ColumnFamilies, attributes, and rowkeys could be repeated
several billion times in your data. See <xref linkend="keyvalue"/> for more information on how HBase stores data internally.</para>
several billion times in your data. </para>
<para>See <xref linkend="keyvalue"/> for more information on how HBase stores data internally.</para>
<section xml:id="keysize.cf"><title>Column Families</title>
<para>Try to keep the ColumnFamily names as small as possible, preferably one character (e.g. "d" for data/default).
</para>
@@ -630,6 +631,13 @@ admin.enableTable(table);
when designing rowkeys.
</para>
</section>
<section xml:id="keysize.example"><title>Numeric Example</title>
<para>A long is 8 bytes. You can store an unsigned number up to 18,446,744,073,709,551,615 in those eight bytes.
If you stored this number as a String -- presuming a byte per character -- you would need 20 bytes, about 2.5x as many.
This is a good example of a small inefficiency that may not seem like much, but adds up in HBase when
the value is used as a rowkey.
</para>
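<para>The size difference can be checked with a minimal sketch like the one below. It uses only the JDK so that it runs on its own; the class and method names (RowkeySize, asBinary, asDecimalString) are hypothetical and chosen for illustration. In HBase code the binary form would more typically come from Bytes.toBytes(long). Note that a Java long is signed, so the sketch treats the all-ones bit pattern (-1L) as the unsigned maximum.</para>

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class RowkeySize {

    // Encode the long as its fixed-width 8-byte binary form.
    static byte[] asBinary(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Encode the long as unsigned decimal text, one byte per character.
    static byte[] asDecimalString(long value) {
        return Long.toUnsignedString(value).getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // All 64 bits set: 18,446,744,073,709,551,615 when read as unsigned.
        long maxUnsigned = -1L;

        int binaryLength = asBinary(maxUnsigned).length;
        int stringLength = asDecimalString(maxUnsigned).length;

        System.out.println("binary rowkey bytes: " + binaryLength); // prints 8
        System.out.println("string rowkey bytes: " + stringLength); // prints 20
        System.out.println("ratio: " + (double) stringLength / binaryLength); // prints 2.5
    }
}
```

<para>The binary form is always 8 bytes, while the string form grows with the number of digits, so the gap is widest for large values. Repeated across billions of rows, the extra 12 bytes per rowkey is the kind of overhead this section warns about.</para>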
</section>
</section>
<section xml:id="reverse.timestamp"><title>Reverse Timestamps</title>
<para>A common problem in database processing is quickly finding the most recent version of a value. A technique using reverse timestamps