HBASE-4504 book.xml - filters

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1176943 13f79535-47bb-0310-9956-ffa450edef68
Doug Meil 2011-09-28 16:19:39 +00:00
parent 2c988abf35
commit 1bc1b593f9
1 changed files with 128 additions and 5 deletions

@ -1234,11 +1234,6 @@ HTable table2 = new HTable(conf2, "myTable");</programlisting>
<para>For fine-grained control of batching of
<classname>Put</classname>s or <classname>Delete</classname>s,
see the <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#batch%28java.util.List%29">batch</link> methods on HTable.
</para>
</section>
<section xml:id="client.filter"><title>Filters</title>
<para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html">Get</link> and <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</link> instances can be
optionally configured with <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/Filter.html">filters</link> which are applied on the RegionServer.
</para>
</section>
<section xml:id="client.external"><title>External Clients</title>
@ -1247,6 +1242,134 @@ HTable table2 = new HTable(conf2, "myTable");</programlisting>
</section>
</section>
<section xml:id="client.filter"><title>Client Filters</title>
<para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html">Get</link> and <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</link> instances can be
optionally configured with <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/Filter.html">filters</link> which are applied on the RegionServer.
</para>
<para>Filters can be confusing because there are many different types. They are easiest to approach by understanding the groups
of Filter functionality, which the following sections cover in turn.
</para>
<section xml:id="client.filter.structural"><title>Structural</title>
<para>Structural Filters contain other Filters.</para>
<section xml:id="client.filter.structural.fl"><title>FilterList</title>
<para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FilterList.html">FilterList</link>
represents a list of Filters with a relationship of <code>FilterList.Operator.MUST_PASS_ALL</code> or
<code>FilterList.Operator.MUST_PASS_ONE</code> between the Filters. The following example shows an 'or' between two
Filters (checking for either 'my value' or 'my other value' on the same attribute).
<programlisting>
// Assumes cf and column are byte[] values naming the ColumnFamily and qualifier,
// and scan is an existing Scan instance.
FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ONE);
SingleColumnValueFilter filter1 = new SingleColumnValueFilter(
	cf,
	column,
	CompareOp.EQUAL,
	Bytes.toBytes("my value")
	);
list.addFilter(filter1);
SingleColumnValueFilter filter2 = new SingleColumnValueFilter(
	cf,
	column,
	CompareOp.EQUAL,
	Bytes.toBytes("my other value")
	);
list.addFilter(filter2);
scan.setFilter(list);
</programlisting>
</para>
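<para>As a rough illustration of the two operators, the following plain-Java sketch models each Filter's verdict as a boolean
and combines the verdicts the way the operator names suggest. It is a toy model and not HBase code: the class
<code>FilterListSemantics</code> and its methods are hypothetical names, and the real evaluation happens inside the RegionServer.
</para>

```java
// Toy model of the FilterList operators; hypothetical names, not HBase code.
final class FilterListSemantics {

    // MUST_PASS_ALL: every filter in the list has to accept (logical 'and').
    static boolean mustPassAll(boolean... verdicts) {
        for (boolean accepted : verdicts) {
            if (!accepted) {
                return false;
            }
        }
        return true;
    }

    // MUST_PASS_ONE: a single accepting filter is enough (logical 'or').
    static boolean mustPassOne(boolean... verdicts) {
        for (boolean accepted : verdicts) {
            if (accepted) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String stored = "my other value";
        boolean first = stored.equals("my value");         // false
        boolean second = stored.equals("my other value");  // true
        System.out.println(mustPassOne(first, second));    // true
        System.out.println(mustPassAll(first, second));    // false
    }
}
```

<para>In this model a stored value of 'my other value' is accepted under <code>MUST_PASS_ONE</code> because the second check passes,
while <code>MUST_PASS_ALL</code> would reject it.
</para>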
</section>
</section>
<section xml:id="client.filter.cv"><title>Column Value</title>
<section xml:id="client.filter.cv.scvf"><title>SingleColumnValueFilter</title>
<para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.html">SingleColumnValueFilter</link>
can be used to test column values for equivalence (<code><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/CompareFilter.CompareOp.html">CompareOp.EQUAL</link>
</code>), inequality (<code>CompareOp.NOT_EQUAL</code>), or ranges
(e.g., <code>CompareOp.GREATER</code>). The following is an example of testing a column for equivalence to the String value "my value":
<programlisting>
SingleColumnValueFilter filter = new SingleColumnValueFilter(
cf,
column,
CompareOp.EQUAL,
Bytes.toBytes("my value")
);
scan.setFilter(filter);
</programlisting>
</para>
</section>
</section>
<section xml:id="client.filter.cvp"><title>Column Value Comparators</title>
<para>There are several Comparator classes in the Filter package that deserve special mention.
These Comparators are used in concert with other Filters, such as <xref linkend="client.filter.cv.scvf" />.
</para>
<section xml:id="client.filter.cvp.rcs"><title>RegexStringComparator</title>
<para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/RegexStringComparator.html">RegexStringComparator</link>
supports regular expressions for value comparisons.
<programlisting>
RegexStringComparator comp = new RegexStringComparator("my."); // any value containing 'my' followed by at least one more character
SingleColumnValueFilter filter = new SingleColumnValueFilter(
cf,
column,
CompareOp.EQUAL,
comp
);
scan.setFilter(filter);
</programlisting>
See the Oracle JavaDoc for <link xlink:href="http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html">supported RegEx patterns in Java</link>.
</para>
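<para>Whether a pattern needs an anchor depends on how it is applied to the value. The sketch below uses only
<code>java.util.regex</code> and assumes the comparator accepts a value when the pattern is found anywhere in it
(the behaviour of <code>Matcher.find()</code>) rather than requiring the whole value to match. The class
<code>RegexFindDemo</code> is a hypothetical name, and the find-based behaviour is an assumption to verify against the
HBase version in use.
</para>

```java
import java.util.regex.Pattern;

// Plain-Java illustration of unanchored vs. anchored patterns; not HBase code.
final class RegexFindDemo {

    // Assumption: a value is accepted when the pattern occurs anywhere in it.
    static boolean found(String regex, String value) {
        return Pattern.compile(regex).matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(found("my.", "my value"));     // true
        System.out.println(found("my.", "not my value")); // true: pattern is unanchored
        System.out.println(found("^my", "not my value")); // false: must start with 'my'
        System.out.println(found("^my", "my value"));     // true
    }
}
```

<para>Under that assumption, a pattern such as <code>^my</code> is the safer choice when only values that start with 'my' should match.
</para>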
</section>
<section xml:id="client.filter.cvp.sc"><title>SubstringComparator</title>
<para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SubstringComparator.html">SubstringComparator</link>
can be used to determine if a given substring exists in a value. The comparison is case-insensitive.
</para>
<programlisting>
SubstringComparator comp = new SubstringComparator("y val"); // matches any value containing 'y val', e.g. 'my value'
SingleColumnValueFilter filter = new SingleColumnValueFilter(
cf,
column,
CompareOp.EQUAL,
comp
);
scan.setFilter(filter);
</programlisting>
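<para>The case-insensitive matching described above can be pictured with a few lines of plain Java. This is a sketch of
the idea only, not the HBase implementation: <code>SubstringMatchDemo</code> and <code>containsIgnoreCase</code> are
hypothetical names, and lower-casing both sides is one assumed way to ignore case.
</para>

```java
import java.util.Locale;

// Plain-Java illustration of a case-insensitive substring test; not HBase code.
final class SubstringMatchDemo {

    // Lower-case both the stored value and the substring, then test containment.
    static boolean containsIgnoreCase(String value, String substr) {
        return value.toLowerCase(Locale.ROOT).contains(substr.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(containsIgnoreCase("my value", "y val")); // true
        System.out.println(containsIgnoreCase("My Value", "y val")); // true: case is ignored
        System.out.println(containsIgnoreCase("my value", "other")); // false
    }
}
```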
</section>
<section xml:id="client.filter.cvp.bfp"><title>BinaryPrefixComparator</title>
<para>See <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/BinaryPrefixComparator.html">BinaryPrefixComparator</link>.</para>
</section>
<section xml:id="client.filter.cvp.bc"><title>BinaryComparator</title>
<para>See <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/BinaryComparator.html">BinaryComparator</link>.</para>
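<para>The difference between the two binary comparators is easiest to see for the equality case. The sketch below is a toy
model in plain Java, not HBase code: it assumes that BinaryComparator compares the whole byte array while
BinaryPrefixComparator compares only as many leading bytes as the supplied array contains. The class and method names
are hypothetical, and ordering comparisons such as <code>CompareOp.GREATER</code> are not modelled.
</para>

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Toy model of whole-array vs. prefix equality; hypothetical names, not HBase code.
final class BinaryCompareDemo {

    // Models BinaryComparator with CompareOp.EQUAL: every byte must match.
    static boolean equalsWhole(byte[] stored, byte[] expected) {
        return Arrays.equals(stored, expected);
    }

    // Models BinaryPrefixComparator with CompareOp.EQUAL: only the first
    // prefix.length bytes of the stored value are compared.
    static boolean equalsPrefix(byte[] stored, byte[] prefix) {
        if (prefix.length > stored.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(stored, prefix.length), prefix);
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] stored = utf8("my value");
        System.out.println(equalsWhole(stored, utf8("my")));    // false
        System.out.println(equalsPrefix(stored, utf8("my")));   // true
        System.out.println(equalsPrefix(stored, utf8("your"))); // false
    }
}
```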
</section>
</section>
<section xml:id="client.filter.kvm"><title>KeyValue Metadata</title>
<para>Because HBase stores data internally as KeyValue pairs, KeyValue Metadata Filters evaluate the existence of keys (i.e., ColumnFamily:Column qualifiers)
for a row, as opposed to the values covered in the previous section.
</para>
<section xml:id="client.filter.kvm.ff"><title>FamilyFilter</title>
<para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FamilyFilter.html">FamilyFilter</link> can be used
to filter on the ColumnFamily. It is generally a better idea to select ColumnFamilies in the Scan than to do it with a Filter.</para>
</section>
<section xml:id="client.filter.kvm.qf"><title>QualifierFilter</title>
<para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/QualifierFilter.html">QualifierFilter</link> can be used
to filter based on Column (aka Qualifier) name.
</para>
</section>
<section xml:id="client.filter.kvm.cpf"><title>ColumnPrefixFilter</title>
<para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/ColumnPrefixFilter.html">ColumnPrefixFilter</link> can be used
to filter based on the lead portion of Column (aka Qualifier) names.
</para>
</section>
</section>
<section xml:id="client.filter.row"><title>RowKey</title>
<section xml:id="client.filter.row.rf"><title>RowFilter</title>
<para>It is generally a better idea to use the startRow/stopRow methods on Scan for row selection; however,
<link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/RowFilter.html">RowFilter</link> can also be used.</para>
</section>
</section>
<section xml:id="client.filter.utility"><title>Utility</title>
<section xml:id="client.filter.utility.fkof"><title>FirstKeyOnlyFilter</title>
<para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html">FirstKeyOnlyFilter</link>
returns only the first KeyValue of each row, so it is primarily used for row-count jobs.</para>
</section>
</section>
</section> <!-- client.filter -->
<section xml:id="master"><title>Master</title>
<para><code>HMaster</code> is the implementation of the Master Server. The Master server
is responsible for monitoring all RegionServer instances in the cluster, and is