HBASE-3768 Add best practice to book for loading row key only

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1091644 13f79535-47bb-0310-9956-ffa450edef68
Michael Stack 2011-04-13 04:33:58 +00:00
parent 40eb8bcc32
commit e8cdb7e27f
2 changed files with 13 additions and 0 deletions


@@ -153,6 +153,8 @@ Release 0.91.0 - Unreleased
as a convenience (Erik Onnen via Stack)
HBASE-3769 TableMapReduceUtil is inconsistent with other table-related
classes that accept byte[] as a table name (Erik Onnen via Stack)
HBASE-3768 Add best practice to book for loading row key only
(Erik Onnen via Stack)
TASKS
HBASE-3559 Move report of split to master OFF the heartbeat channel


@@ -199,5 +199,16 @@ htable.close();</programlisting></para>
<varname>false</varname>. For frequently accessed rows, it is advisable to use the block
cache.</para>
</section>
<section xml:id="perf.hbase.client.rowkeyonly">
<title>Optimal Loading of Row Keys</title>
<para>When performing a table <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">scan</link>
where only the row keys are needed (no families, qualifiers, values or timestamps), add a FilterList with a
<varname>MUST_PASS_ALL</varname> operator to the scanner using <methodname>setFilter</methodname>. The filter list
should include both a <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html">FirstKeyOnlyFilter</link>
and a <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/KeyOnlyFilter.html">KeyOnlyFilter</link>.
With this filter combination, the worst case for any single row is one value read from disk by the region server
and minimal network traffic to the client.
</para>
</section>
</section>
</chapter>
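
The effect of the filter pair described in the new section can be illustrated with a small plain-Java model. This is only a sketch of the semantics, not HBase client code: the class RowKeyOnlyModel, its Cell type, and the helper methods are hypothetical names invented for illustration, and the 100-byte values are made-up sample data. It assumes FirstKeyOnlyFilter keeps the first cell of each row and KeyOnlyFilter keeps the key while dropping the value, so combining both leaves one value-less cell per row. In real client code the corresponding step is the setFilter call with a MUST_PASS_ALL FilterList, as the paragraph says.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model only: none of these types are HBase classes.
class RowKeyOnlyModel {

    /** Hypothetical stand-in for one stored cell: row key, column, value bytes. */
    static final class Cell {
        final String row;
        final String column;
        final byte[] value;

        Cell(String row, String column, byte[] value) {
            this.row = row;
            this.column = column;
            this.value = value;
        }
    }

    /** Models FirstKeyOnlyFilter: keep only the first cell seen for each row. */
    static List<Cell> firstKeyOnly(List<Cell> cells) {
        Map<String, Cell> firstPerRow = new LinkedHashMap<>();
        for (Cell c : cells) {
            firstPerRow.putIfAbsent(c.row, c);
        }
        return new ArrayList<>(firstPerRow.values());
    }

    /** Models KeyOnlyFilter: keep the key parts, replace the value with empty bytes. */
    static List<Cell> keyOnly(List<Cell> cells) {
        List<Cell> out = new ArrayList<>();
        for (Cell c : cells) {
            out.add(new Cell(c.row, c.column, new byte[0]));
        }
        return out;
    }

    /** Models the MUST_PASS_ALL combination: both filters apply to every row. */
    static List<String> rowKeys(List<Cell> cells) {
        List<String> keys = new ArrayList<>();
        for (Cell c : keyOnly(firstKeyOnly(cells))) {
            keys.add(c.row);
        }
        return keys;
    }

    /** Total value bytes that would travel to the client for these cells. */
    static long payloadBytes(List<Cell> cells) {
        long total = 0;
        for (Cell c : cells) {
            total += c.value.length;
        }
        return total;
    }

    /** Made-up sample data: two rows, three cells, 100 value bytes each. */
    static List<Cell> sampleTable() {
        List<Cell> table = new ArrayList<>();
        table.add(new Cell("row1", "cf:a", new byte[100]));
        table.add(new Cell("row1", "cf:b", new byte[100]));
        table.add(new Cell("row2", "cf:a", new byte[100]));
        return table;
    }

    public static void main(String[] args) {
        List<Cell> table = sampleTable();
        System.out.println(rowKeys(table));                               // [row1, row2]
        System.out.println(payloadBytes(table));                          // 300
        System.out.println(payloadBytes(keyOnly(firstKeyOnly(table))));   // 0
    }
}
```

In this model the unfiltered scan would carry 300 value bytes for three cells, while the filtered one carries two keys and no value bytes, which is the saving the section is describing.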