diff --git a/src/docbkx/book.xml b/src/docbkx/book.xml index 371a88e3cf1..7bb504980cb 100644 --- a/src/docbkx/book.xml +++ b/src/docbkx/book.xml @@ -614,7 +614,8 @@ admin.enableTable(table); Most of the time small inefficiencies don't matter all that much. Unfortunately, this is a case where they do. Whatever patterns are selected for ColumnFamilies, attributes, and rowkeys they could be repeated - several billion times in your data. See for more information on HBase stores data internally. + several billion times in your data. + See for more information on how HBase stores data internally.
Column Families Try to keep the ColumnFamily names as small as possible, preferably one character (e.g. "d" for data/default). @@ -630,6 +631,13 @@ admin.enableTable(table); when designing rowkeys.
+
Numeric Example + A long is 8 bytes. You can store an unsigned number up to 18,446,744,073,709,551,615 in those eight bytes. + If you stored this number as a String, presuming one byte per character, you would need 20 bytes, 2.5 times as many. + This is a good example of a small inefficiency that may not seem like much for a single value, but adds up in HBase when + the value is used as a rowkey. + +
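The size difference described above can be checked with a short sketch. It uses only the JDK rather than HBase's own utilities, and the class and method names (RowkeySize, longKey, stringKey) are hypothetical names chosen for illustration. Because a Java long is signed, the sketch assumes the largest unsigned value is represented by the bit pattern -1L and printed with Long.toUnsignedString.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of HBase.
public class RowkeySize {

    // Fixed-width binary encoding of a 64-bit value: always 8 bytes.
    public static byte[] longKey(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // The same 64 bits written as unsigned decimal text:
    // one byte per digit when encoded as UTF-8.
    public static byte[] stringKey(long value) {
        return Long.toUnsignedString(value).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // All 64 bits set; read as unsigned this is 18446744073709551615.
        long max = -1L;
        System.out.println("binary: " + longKey(max).length + " bytes"); // 8
        System.out.println("string: " + stringKey(max).length + " bytes"); // 20
    }
}
```

Multiplied across several billion rowkeys, the 12 extra bytes per key in the String form become tens of gigabytes of additional stored data before compression.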
Reverse Timestamps A common problem in database processing is quickly finding the most recent version of a value. A technique using reverse timestamps