HBASE-3768 Add best practice to book for loading row key only

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1091644 13f79535-47bb-0310-9956-ffa450edef68
2011-04-13 04:33:58 +00:00
commit e8cdb7e27f
parent 40eb8bcc32
2 changed files with 13 additions and 0 deletions
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -153,6 +153,8 @@ Release 0.91.0 - Unreleased
               as a convenience (Erik Onnen via Stack)
   HBASE-3769  TableMapReduceUtil is inconsistent with other table-related
               classes that accept byte[] as a table name (Erik Onnen via Stack)
+   HBASE-3768  Add best practice to book for loading row key only
+               (Erik Onnen via Stack)

  TASKS
   HBASE-3559  Move report of split to master OFF the heartbeat channel
--- a/src/docbkx/performance.xml
+++ b/src/docbkx/performance.xml
@@ -199,5 +199,16 @@ htable.close();</programlisting></para>
      <varname>false</varname>. For frequently accessed rows, it is advisable to use the block
      cache.</para>
    </section>
+    <section xml:id="perf.hbase.client.rowkeyonly">
+      <title>Optimal Loading of Row Keys</title>
+      <para>When performing a table <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">scan</link>
+            where only the row keys are needed (no families, qualifiers, values or timestamps), add a FilterList with a
+            <varname>MUST_PASS_ALL</varname> operator to the scan using <methodname>setFilter</methodname>. The filter list
+            should include both a <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html">FirstKeyOnlyFilter</link>
+            and a <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/KeyOnlyFilter.html">KeyOnlyFilter</link>.
+            With this filter combination, the worst case is that the region server reads a single value from disk
+            for each row, and only minimal data for that row is sent over the network to the client.
+      </para>
+    </section>
  </section>
 </chapter>
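
The paragraph added to performance.xml describes the filter setup in prose only. A minimal sketch of what it might look like in client code follows. The class name RowKeyOnlyScan and both method names are hypothetical, it assumes the HBase client jar of this era is on the classpath (later releases obtain a Table from a Connection instead of using HTable directly), and it assumes row keys are printable strings.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical helper class, not part of HBase.
public class RowKeyOnlyScan {

  // Builds a Scan whose results carry only row keys.
  public static Scan newRowKeyOnlyScan() {
    // MUST_PASS_ALL: a KeyValue is returned only if every filter accepts it.
    FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ALL);
    // Keep only the first KeyValue of each row.
    filters.addFilter(new FirstKeyOnlyFilter());
    // Strip the value from that KeyValue so only the key is returned.
    filters.addFilter(new KeyOnlyFilter());

    Scan scan = new Scan();
    scan.setFilter(filters);
    return scan;
  }

  // Prints every row key in the table. Assumes keys are printable strings.
  public static void printRowKeys(HTable table) throws IOException {
    ResultScanner scanner = table.getScanner(newRowKeyOnlyScan());
    try {
      for (Result result : scanner) {
        System.out.println(Bytes.toString(result.getRow()));
      }
    } finally {
      // Release the server-side scanner resources.
      scanner.close();
    }
  }
}
```

Building the Scan needs no cluster connection, so newRowKeyOnlyScan() can be reused for any table; only printRowKeys talks to the region servers.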