From 90e9e2f057d09e4862c13e4e7aa2c8ebfc7303f6 Mon Sep 17 00:00:00 2001
From: Doug Meil
Date: Tue, 11 Oct 2011 16:40:04 +0000
Subject: [PATCH] HBASE-4573 book.xml, Put to KeyValue examples.

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1181880 13f79535-47bb-0310-9956-ffa450edef68
---
 src/docbkx/book.xml | 36 ++++++++++++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/src/docbkx/book.xml b/src/docbkx/book.xml
index 9c54de0c5e7..0368718fb31 100644
--- a/src/docbkx/book.xml
+++ b/src/docbkx/book.xml
@@ -609,8 +609,8 @@ admin.enableTable(table);
 Another common question is whether one should prefer rows or columns. The context is typically in extreme cases of wide tables, such as having 1 row with 1 million attributes, or 1 million rows with 1 columns apiece.
- Winner: Rows (generally speaking). To be clear, this guideline is in the context is in extremely wide cases, not where
- one needs to store a few dozen or hundred columns.
+ Winner: Rows (generally speaking). To be clear, this guideline is in the context is in extremely wide cases, not in the
+ standard use-case where one needs to store a few dozen or hundred columns.
@@ -1687,6 +1687,38 @@ scan.setFilter(filter);
 For more information, see the KeyValue source code.
+
Example
+ To emphasize the points above, examine what happens with two Puts for two different columns for the same row:
+
+ Put #1: rowkey=row1, cf:attr1=value1
+ Put #2: rowkey=row1, cf:attr2=value2
+
+ Even though these are for the same row, a KeyValue is created for each column:
+ Key portion for Put #1:
+
+ rowlength (4)
+ row (row1)
+ columnfamilylength (2)
+ columnfamily (cf)
+ columnqualifier (attr1)
+ timestamp (server time of Put)
+ keytype (Put)
+
+
+ Key portion for Put #2:
+
+ rowlength (4)
+ row (row1)
+ columnfamilylength (2)
+ columnfamily (cf)
+ columnqualifier (attr2)
+ timestamp (server time of Put)
+ keytype (Put)
+
+
+
+ It is critical to understand that the rowkey, ColumnFamily, and column (aka columnqualifier) are embedded within
+ the KeyValue instance. The longer these identifiers are, the bigger the KeyValue is.
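The size effect described in the patch text can be made concrete with a small sketch. It is not HBase code: the class `KeyValueKeySize` and its methods are hypothetical. The fixed field widths (2-byte row length, 1-byte column family length, 8-byte timestamp, 1-byte key type) are assumptions based on the KeyValue key layout and should be checked against the KeyValue source. Only the key portion is counted; the value and the key/value length prefixes are left out.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase: estimates the size in bytes of the
// key portion of a KeyValue from the fields listed in the example.
class KeyValueKeySize {
    // Assumed fixed-width fields of the key portion (verify against KeyValue).
    static final int ROW_LENGTH_SIZE = 2;    // row length stored as a short
    static final int FAMILY_LENGTH_SIZE = 1; // column family length stored as a byte
    static final int TIMESTAMP_SIZE = 8;     // timestamp stored as a long
    static final int KEY_TYPE_SIZE = 1;      // key type (e.g. Put) stored as a byte

    // Number of bytes the identifier occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Key size = fixed-width fields + rowkey + column family + column qualifier.
    static int keyLength(String row, String family, String qualifier) {
        return ROW_LENGTH_SIZE + utf8Length(row)
                + FAMILY_LENGTH_SIZE + utf8Length(family)
                + utf8Length(qualifier)
                + TIMESTAMP_SIZE + KEY_TYPE_SIZE;
    }

    public static void main(String[] args) {
        // The two Puts from the example: each column gets its own key,
        // and each key repeats the rowkey and the column family.
        System.out.println(keyLength("row1", "cf", "attr1")); // 23
        System.out.println(keyLength("row1", "cf", "attr2")); // 23
        // Longer identifiers (made-up names) produce a larger key for the same value.
        System.out.println(keyLength("row1", "columnfamily1", "attribute_number_one")); // 49
    }
}
```

Under these assumptions, the two Puts in the example each carry 23 bytes of key before any value bytes are stored, and the rowkey and column family are repeated in both keys.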
Compaction