HBASE-4593 Design and document the official procedure for posting patches, commits, commit messages, etc. to smooth process and make integration with tools easier (Misty Stanley-Jones)

stack 2014-08-25 14:29:27 -07:00
parent 33928f85f2
commit fefa4a3079
1 changed files with 330 additions and 0 deletions


@@ -0,0 +1,330 @@
<?xml version="1.0" encoding="UTF-8"?>
<chapter version="5.0" xml:id="unit.tests" xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:svg="http://www.w3.org/2000/svg" xmlns:m="http://www.w3.org/1998/Math/MathML"
xmlns:html="http://www.w3.org/1999/xhtml" xmlns:db="http://docbook.org/ns/docbook">
<!--
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
-->
<title>Unit Testing HBase Applications</title>
<para>This chapter discusses unit testing your HBase application using JUnit, Mockito, MRUnit,
and HBaseTestingUtility. Much of the information comes from <link
xlink:href="http://blog.cloudera.com/blog/2013/09/how-to-test-hbase-applications-using-popular-tools/"
>a community blog post about testing HBase applications</link>. For information on unit
tests for HBase itself, see <xref linkend="hbase.tests"/>.</para>
<section>
<title>JUnit</title>
<para>HBase uses <link xlink:href="http://junit.org">JUnit</link> 4 for unit tests.</para>
<para>This example will add unit tests to the following example class:</para>
<programlisting language="java">
public class MyHBaseDAO {
public static void insertRecord(HTableInterface table, HBaseTestObj obj)
throws Exception {
Put put = createPut(obj);
table.put(put);
}
// Package-private rather than private so that tests in the same package can call it.
static Put createPut(HBaseTestObj obj) {
Put put = new Put(Bytes.toBytes(obj.getRowKey()));
put.add(Bytes.toBytes("CF"), Bytes.toBytes("CQ-1"),
Bytes.toBytes(obj.getData1()));
put.add(Bytes.toBytes("CF"), Bytes.toBytes("CQ-2"),
Bytes.toBytes(obj.getData2()));
return put;
}
}
</programlisting>
<para>The first step is to add JUnit dependencies to your Maven POM file:</para>
<programlisting language="xml"><![CDATA[
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.11</version>
<scope>test</scope>
</dependency>
]]></programlisting>
<para>Next, add some unit tests to your code. Tests are annotated with
<literal>@Test</literal>. Here, the assertions are in bold.</para>
<programlisting language="java">
public class TestMyHbaseDAOData {
@Test
public void testCreatePut() throws Exception {
HBaseTestObj obj = new HBaseTestObj();
obj.setRowKey("ROWKEY-1");
obj.setData1("DATA-1");
obj.setData2("DATA-2");
Put put = MyHBaseDAO.createPut(obj);
<userinput>assertEquals(obj.getRowKey(), Bytes.toString(put.getRow()));
assertEquals(obj.getData1(), Bytes.toString(put.get(Bytes.toBytes("CF"), Bytes.toBytes("CQ-1")).get(0).getValue()));
assertEquals(obj.getData2(), Bytes.toString(put.get(Bytes.toBytes("CF"), Bytes.toBytes("CQ-2")).get(0).getValue()));</userinput>
}
}
</programlisting>
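<para>The assertions above compare strings because each value was encoded to bytes when the
<code>Put</code> was built and is decoded again in the test. As a rough illustration that
needs no HBase classes, the sketch below performs the same round trip with the JDK alone.
It assumes UTF-8, which is the encoding <code>Bytes.toBytes(String)</code> uses. The class
and method names are hypothetical and exist only for this example.</para>
<programlisting language="java"><![CDATA[
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of HBase.
class ByteRoundTripSketch {
    // Assumption: mirrors the UTF-8 encoding performed by Bytes.toBytes(String).
    static byte[] toBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Assumption: mirrors the UTF-8 decoding performed by Bytes.toString(byte[]).
    static String fromBytes(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Encode a value as it would be stored, then decode it as a test would.
        byte[] stored = toBytes("DATA-1");
        System.out.println(fromBytes(stored)); // prints DATA-1
    }
}
]]></programlisting>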
<para>These tests ensure that your <code>createPut</code> method creates, populates, and
returns a <code>Put</code> object with expected values. Of course, JUnit can do much
more than this. For an introduction to JUnit, see <link
xlink:href="https://github.com/junit-team/junit/wiki/Getting-started"
>https://github.com/junit-team/junit/wiki/Getting-started</link>. </para>
</section>
<section xml:id="mockito">
<title>Mockito</title>
<para>Mockito is a mocking framework. It goes further than JUnit by allowing you to test the
interactions between objects without having to replicate the entire environment. You can
read more about Mockito at its project site, <link
xlink:href="https://code.google.com/p/mockito/"
>https://code.google.com/p/mockito/</link>.</para>
<para>You can use Mockito to do unit testing on smaller units. For instance, you can mock a
<classname>org.apache.hadoop.hbase.Server</classname> instance or a
<classname>org.apache.hadoop.hbase.master.MasterServices</classname> interface
reference rather than a full-blown
<classname>org.apache.hadoop.hbase.master.HMaster</classname>.</para>
<para>This example builds upon the example code in <xref linkend="unit.tests"/>, to test the
<code>insertRecord</code> method.</para>
<para>First, add a dependency for Mockito to your Maven POM file.</para>
<programlisting language="xml"><![CDATA[
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-all</artifactId>
<version>1.9.5</version>
<scope>test</scope>
</dependency>
]]></programlisting>
<para>Next, add a <code>@RunWith</code> annotation to your test class so that it runs with
Mockito's JUnit runner, which initializes the <code>@Mock</code> and
<code>@Captor</code> fields.</para>
<programlisting language="java">
<userinput>@RunWith(MockitoJUnitRunner.class)</userinput>
public class TestMyHBaseDAO{
@Mock
private HTableInterface table;
@Mock
private HTablePool hTablePool;
@Captor
private ArgumentCaptor&lt;Put&gt; putCaptor;
@Test
public void testInsertRecord() throws Exception {
//return mock table when getTable is called
when(hTablePool.getTable("tablename")).thenReturn(table);
//create test object and make a call to the DAO that needs testing
HBaseTestObj obj = new HBaseTestObj();
obj.setRowKey("ROWKEY-1");
obj.setData1("DATA-1");
obj.setData2("DATA-2");
MyHBaseDAO.insertRecord(table, obj);
verify(table).put(putCaptor.capture());
Put put = putCaptor.getValue();
assertEquals(obj.getRowKey(), Bytes.toString(put.getRow()));
// Use assertTrue rather than the assert keyword, which is skipped unless the JVM runs with -ea.
assertTrue(put.has(Bytes.toBytes("CF"), Bytes.toBytes("CQ-1")));
assertTrue(put.has(Bytes.toBytes("CF"), Bytes.toBytes("CQ-2")));
assertEquals("DATA-1", Bytes.toString(put.get(Bytes.toBytes("CF"),Bytes.toBytes("CQ-1")).get(0).getValue()));
assertEquals("DATA-2", Bytes.toString(put.get(Bytes.toBytes("CF"),Bytes.toBytes("CQ-2")).get(0).getValue()));
}
}
</programlisting>
<para>This code populates <code>HBaseTestObj</code> with “ROWKEY-1”, “DATA-1”, “DATA-2” as
values. It then inserts the record into the mocked table. The Put that the DAO would
have inserted is captured, and values are tested to verify that they are what you
expected them to be.</para>
<para>The key here is to manage <classname>HTablePool</classname> and
<classname>HTableInterface</classname> instance creation outside the DAO. This allows
you to mock them cleanly and test Puts as shown above. Similarly, you can now expand
into other operations such as Get, Scan, or Delete.</para>
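<para>Handing the table to the DAO, rather than letting the DAO create it, is a pattern that
does not depend on Mockito. The following plain-Java sketch shows the same idea with a
hand-written test double in place of a mock and an argument captor. All of the types in
it (<code>Record</code>, <code>RecordSink</code>, <code>RecordDao</code>, and
<code>CapturingSink</code>) are hypothetical stand-ins invented for illustration. They
are not HBase or Mockito classes.</para>
<programlisting language="java"><![CDATA[
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins only; none of these are HBase or Mockito types.
class InjectionSketch {
    /** Stand-in for a Put: a row key plus one value. */
    static final class Record {
        final String rowKey;
        final String data;

        Record(String rowKey, String data) {
            this.rowKey = rowKey;
            this.data = data;
        }
    }

    /** Stand-in for the table interface: the only dependency the DAO needs. */
    interface RecordSink {
        void put(Record record);
    }

    /** Stand-in for the DAO: it receives the sink instead of creating one. */
    static final class RecordDao {
        static void insertRecord(RecordSink sink, String rowKey, String data) {
            sink.put(new Record(rowKey, data));
        }
    }

    /** Hand-written test double that remembers every record it is given. */
    static final class CapturingSink implements RecordSink {
        final List<Record> captured = new ArrayList<>();

        @Override
        public void put(Record record) {
            captured.add(record);
        }
    }

    public static void main(String[] args) {
        CapturingSink sink = new CapturingSink();
        RecordDao.insertRecord(sink, "ROWKEY-1", "DATA-1");
        // Inspect what the DAO tried to write, as the captor does in the Mockito test.
        Record written = sink.captured.get(0);
        System.out.println(written.rowKey + " -> " + written.data); // prints ROWKEY-1 -> DATA-1
    }
}
]]></programlisting>
<para>Because the DAO sees only the interface, a test can pass in a double such as this one,
while production code passes in a real table.</para>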
</section>
<section>
<title>MRUnit</title>
<para><link xlink:href="http://mrunit.apache.org/">Apache MRUnit</link> is a library that
allows you to unit-test MapReduce jobs. You can use it to test HBase jobs in the same
way as other MapReduce jobs.</para>
<para>Given a MapReduce job that writes to an HBase table called <literal>MyTest</literal>,
which has one column family called <literal>CF</literal>, the reducer of such a job
could look like the following:</para>
<programlisting language="java"><![CDATA[
public class MyReducer extends TableReducer<Text, Text, ImmutableBytesWritable> {
public static final byte[] CF = "CF".getBytes();
public static final byte[] QUALIFIER = "CQ-1".getBytes();
public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
// Processing to extract the data to be inserted goes here. In this case, we
// simply append all the records received from the mapper for this key
// and insert one record into HBase.
StringBuilder data = new StringBuilder();
Put put = new Put(Bytes.toBytes(key.toString()));
for (Text val : values) {
data.append(val);
}
put.add(CF, QUALIFIER, Bytes.toBytes(data.toString()));
//write to HBase
context.write(new ImmutableBytesWritable(Bytes.toBytes(key.toString())), put);
}
} ]]>
</programlisting>
<para>To test this code, the first step is to add a dependency on MRUnit to your Maven POM
file.</para>
<programlisting language="xml"><![CDATA[
<dependency>
<groupId>org.apache.mrunit</groupId>
<artifactId>mrunit</artifactId>
<version>1.0.0</version>
<scope>test</scope>
</dependency>
]]></programlisting>
<para>Next, use the <classname>ReduceDriver</classname> provided by MRUnit in your reducer test.</para>
<programlisting language="java"><![CDATA[
public class MyReducerTest {
ReduceDriver<Text, Text, ImmutableBytesWritable, Writable> reduceDriver;
byte[] CF = "CF".getBytes();
byte[] QUALIFIER = "CQ-1".getBytes();
@Before
public void setUp() {
MyReducer reducer = new MyReducer();
reduceDriver = ReduceDriver.newReduceDriver(reducer);
}
@Test
public void testHBaseInsert() throws IOException {
String strKey = "RowKey-1", strValue = "DATA", strValue1 = "DATA1",
strValue2 = "DATA2";
List<Text> list = new ArrayList<Text>();
list.add(new Text(strValue));
list.add(new Text(strValue1));
list.add(new Text(strValue2));
//since in our case all that the reducer is doing is appending the records that the mapper
//sends it, we should get the following back
String expectedOutput = strValue + strValue1 + strValue2;
//Setup Input, mimic what mapper would have passed
//to the reducer and run test
reduceDriver.withInput(new Text(strKey), list);
//run the reducer and get its output
List<Pair<ImmutableBytesWritable, Writable>> result = reduceDriver.run();
//extract key from result and verify
assertEquals(strKey, Bytes.toString(result.get(0).getFirst().get()));
//extract value for CF/QUALIFIER and verify
Put a = (Put)result.get(0).getSecond();
String c = Bytes.toString(a.get(CF, QUALIFIER).get(0).getValue());
assertEquals(expectedOutput, c);
}
}
]]></programlisting>
<para>Your MRUnit test verifies that the output is as expected, the Put that is inserted
into HBase has the correct value, and the ColumnFamily and ColumnQualifier have the
correct values.</para>
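<para>The expected value in the test above is the mapper values concatenated in the order
they were supplied. As a sketch of that logic on its own, the following plain-Java
example uses <code>String</code> in place of <code>Text</code> so that it runs without
Hadoop or MRUnit. The <code>ConcatSketch</code> class and its
<code>concatValues</code> method are hypothetical and are not part of the reducer shown
earlier.</para>
<programlisting language="java"><![CDATA[
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; String stands in for Hadoop's Text.
class ConcatSketch {
    // Same per-key logic as the loop in the reducer: append every value in order.
    static String concatValues(Iterable<String> values) {
        StringBuilder data = new StringBuilder();
        for (String val : values) {
            data.append(val);
        }
        return data.toString();
    }

    public static void main(String[] args) {
        // The same three values that the MRUnit test feeds to the reducer.
        List<String> values = Arrays.asList("DATA", "DATA1", "DATA2");
        System.out.println(concatValues(values)); // prints DATADATA1DATA2
    }
}
]]></programlisting>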
<para>MRUnit also includes a <classname>MapDriver</classname> to test mappers, and you can use
MRUnit to test other operations, including reading from HBase, processing data, or
writing to HDFS.</para>
</section>
<section>
<title>Integration Testing with an HBase Mini-Cluster</title>
<para>HBase ships with HBaseTestingUtility, which makes it easy to write integration tests
using a <firstterm>mini-cluster</firstterm>. The first step is to add some dependencies
to your Maven POM file. Check the versions to be sure they are appropriate.</para>
<programlisting language="xml"><![CDATA[
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.0.0</version>
<type>test-jar</type>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase</artifactId>
<version>0.98.3</version>
<type>test-jar</type>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>2.0.0</version>
<type>test-jar</type>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>2.0.0</version>
<scope>test</scope>
</dependency>
]]></programlisting>
<para>This code represents an integration test for the <code>MyHBaseDAO</code> insert shown
in <xref linkend="unit.tests"/>.</para>
<programlisting language="java">
public class MyHBaseIntegrationTest {
private static HBaseTestingUtility utility;
byte[] CF = Bytes.toBytes("CF");
byte[] CQ1 = Bytes.toBytes("CQ-1");
byte[] CQ2 = Bytes.toBytes("CQ-2");
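@BeforeClass
public static void setup() throws Exception {
// Start the mini-cluster once for the whole class, because startup is slow.
utility = new HBaseTestingUtility();
utility.startMiniCluster();
}
@AfterClass
public static void tearDown() throws Exception {
// Stop the mini-cluster so that it does not outlive the tests.
utility.shutdownMiniCluster();
}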
@Test
public void testInsert() throws Exception {
HTableInterface table = utility.createTable(Bytes.toBytes("MyTest"),
Bytes.toBytes("CF"));
HBaseTestObj obj = new HBaseTestObj();
obj.setRowKey("ROWKEY-1");
obj.setData1("DATA-1");
obj.setData2("DATA-2");
MyHBaseDAO.insertRecord(table, obj);
Get get1 = new Get(Bytes.toBytes(obj.getRowKey()));
get1.addColumn(CF, CQ1);
Result result1 = table.get(get1);
assertEquals(obj.getRowKey(), Bytes.toString(result1.getRow()));
assertEquals(obj.getData1(), Bytes.toString(result1.value()));
Get get2 = new Get(Bytes.toBytes(obj.getRowKey()));
get2.addColumn(CF, CQ2);
Result result2 = table.get(get2);
assertEquals(obj.getRowKey(), Bytes.toString(result2.getRow()));
assertEquals(obj.getData2(), Bytes.toString(result2.value()));
}
}
</programlisting>
<para>This code creates an HBase mini-cluster and starts it. Next, it creates a table called
<literal>MyTest</literal> with one column family, <literal>CF</literal>. A record is
inserted, a Get is performed from the same table, and the insertion is verified.</para>
<note>
<para>Starting the mini-cluster takes about 20-30 seconds, which is usually
acceptable for integration tests.</para>
</note>
<para>To use an HBase mini-cluster on Microsoft Windows, you need to use a Cygwin
environment.</para>
<para>See the blog post <link
xlink:href="http://blog.sematext.com/2010/08/30/hbase-case-study-using-hbasetestingutility-for-local-testing-development/"
>HBase Case-Study: Using HBaseTestingUtility for Local Testing and
Development</link> (2010) for more information about HBaseTestingUtility.</para>
</section>
</chapter>