HBASE-11317 [docs] Expand unit testing to cover Mockito and MRUnit and give more examples (Misty Stanley-Jones)

This commit is contained in:
Jonathan M Hsieh 2014-07-10 01:43:42 -07:00
parent 779fcc51f4
commit 21d37b3a59
1 changed files with 314 additions and 25 deletions


@ -1136,33 +1136,322 @@ pecularity that is probably fixable but we've not spent the time trying to figur
<section xml:id="developing">
<title>Developing</title>
<section xml:id="codelines"><title>Codelines</title>
<para>Most development is done on the master branch (TRUNK).
However, there are branches for minor releases (e.g., 0.90.1, 0.90.2, and 0.90.3 are on the 0.90 branch).</para>
<para>Most development is done on the master branch, which is named
<literal>master</literal> in the Git repository. Previously, HBase used Subversion, in
which the master branch was called <literal>TRUNK</literal>. Branches exist for minor
releases, and important features and bug fixes are often back-ported.</para>
</section>
<section xml:id="unit.tests">
<title>Unit Tests</title>
<para>In HBase we use <link xlink:href="http://junit.org">JUnit</link> 4.
If you need to run miniclusters of HDFS, ZooKeeper, HBase, or MapReduce testing,
be sure to checkout the <classname>HBaseTestingUtility</classname>.
Alex Baranau of Sematext describes how it can be used in
<link xlink:href="http://blog.sematext.com/2010/08/30/hbase-case-study-using-hbasetestingutility-for-local-testing-development/">HBase Case-Study: Using HBaseTestingUtility for Local Testing and Development</link> (2010).
</para>
<section xml:id="mockito">
<title>Mockito</title>
<para>Sometimes you don't need a full running server
unit testing. For example, some methods can make do with a
a <classname>org.apache.hadoop.hbase.Server</classname> instance
or a <classname>org.apache.hadoop.hbase.master.MasterServices</classname>
Interface reference rather than a full-blown
<classname>org.apache.hadoop.hbase.master.HMaster</classname>.
In these cases, you maybe able to get away with a mocked
<classname>Server</classname> instance. For example:
<programlisting>
TODO...
</programlisting>
</para>
</section>
<section
xml:id="unit.tests">
<title>Unit Tests</title>
<para>The following information is from <link
xlink:href="http://blog.cloudera.com/blog/2013/09/how-to-test-hbase-applications-using-popular-tools/">http://blog.cloudera.com/blog/2013/09/how-to-test-hbase-applications-using-popular-tools/</link>.
The following sections discuss JUnit, Mockito, MRUnit, and HBaseTestingUtility. </para>
<section>
<title>JUnit</title>
<para>HBase uses <link
xlink:href="http://junit.org">JUnit</link> 4 for unit tests.</para>
<para>This example will add unit tests to the following example class:</para>
<programlisting>
public class MyHBaseDAO {
public static void insertRecord(HTableInterface table, HBaseTestObj obj)
throws Exception {
Put put = createPut(obj);
table.put(put);
}
// package-private (not private) so that tests in the same package can call it
static Put createPut(HBaseTestObj obj) {
Put put = new Put(Bytes.toBytes(obj.getRowKey()));
put.add(Bytes.toBytes("CF"), Bytes.toBytes("CQ-1"),
Bytes.toBytes(obj.getData1()));
put.add(Bytes.toBytes("CF"), Bytes.toBytes("CQ-2"),
Bytes.toBytes(obj.getData2()));
return put;
}
}
</programlisting>
<para>The first step is to add JUnit dependencies to your Maven POM file:</para>
<programlisting><![CDATA[
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.11</version>
<scope>test</scope>
</dependency>
]]></programlisting>
<para>Next, add some unit tests to your code. Tests are annotated with
<literal>@Test</literal>. Here, the assertions are in bold.</para>
<programlisting>
public class TestMyHbaseDAOData {
@Test
public void testCreatePut() throws Exception {
HBaseTestObj obj = new HBaseTestObj();
obj.setRowKey("ROWKEY-1");
obj.setData1("DATA-1");
obj.setData2("DATA-2");
Put put = MyHBaseDAO.createPut(obj);
<userinput>assertEquals(obj.getRowKey(), Bytes.toString(put.getRow()));
assertEquals(obj.getData1(), Bytes.toString(put.get(Bytes.toBytes("CF"), Bytes.toBytes("CQ-1")).get(0).getValue()));
assertEquals(obj.getData2(), Bytes.toString(put.get(Bytes.toBytes("CF"), Bytes.toBytes("CQ-2")).get(0).getValue()));</userinput>
}
}
</programlisting>
<para>These tests ensure that your <code>createPut</code> method creates, populates,
and returns a <code>Put</code> object with expected values. Of course, JUnit can
do much more than this. For an introduction to JUnit, see <link
xlink:href="https://github.com/junit-team/junit/wiki/Getting-started">https://github.com/junit-team/junit/wiki/Getting-started</link>.
</para>
</section>
<section
xml:id="mockito">
<title>Mockito</title>
<para>Mockito is a mocking framework. It goes further than JUnit by allowing you to
test the interactions between objects without having to replicate the entire
environment. You can read more about Mockito at its project site, <link
xlink:href="https://code.google.com/p/mockito/">https://code.google.com/p/mockito/</link>.</para>
<para>You can use Mockito to do unit testing on smaller units. For instance, you can
mock a <classname>org.apache.hadoop.hbase.Server</classname> instance or a
<classname>org.apache.hadoop.hbase.master.MasterServices</classname>
interface reference rather than a full-blown
<classname>org.apache.hadoop.hbase.master.HMaster</classname>.</para>
<para>This example builds upon the example code in <xref
linkend="unit.tests" />, to test the <code>insertRecord</code>
method.</para>
<para>First, add a dependency for Mockito to your Maven POM file.</para>
<programlisting><![CDATA[
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-all</artifactId>
<version>1.9.5</version>
<scope>test</scope>
</dependency>
]]></programlisting>
<para>Next, add a <code>@RunWith</code> annotation to your test class, so that it runs
with Mockito's JUnit runner, which initializes the <code>@Mock</code> and <code>@Captor</code> fields.</para>
<programlisting>
<userinput>@RunWith(MockitoJUnitRunner.class)</userinput>
public class TestMyHBaseDAO{
@Mock
private HTableInterface table;
@Mock
private HTablePool hTablePool;
@Captor
private ArgumentCaptor&lt;Put&gt; putCaptor;
@Test
public void testInsertRecord() throws Exception {
//return mock table when getTable is called
when(hTablePool.getTable("tablename")).thenReturn(table);
//create test object and make a call to the DAO that needs testing
HBaseTestObj obj = new HBaseTestObj();
obj.setRowKey("ROWKEY-1");
obj.setData1("DATA-1");
obj.setData2("DATA-2");
MyHBaseDAO.insertRecord(table, obj);
verify(table).put(putCaptor.capture());
Put put = putCaptor.getValue();
//JUnit assertions take the expected value first; assertTrue does not depend on the JVM -ea flag
assertEquals(obj.getRowKey(), Bytes.toString(put.getRow()));
assertTrue(put.has(Bytes.toBytes("CF"), Bytes.toBytes("CQ-1")));
assertTrue(put.has(Bytes.toBytes("CF"), Bytes.toBytes("CQ-2")));
assertEquals("DATA-1", Bytes.toString(put.get(Bytes.toBytes("CF"),Bytes.toBytes("CQ-1")).get(0).getValue()));
assertEquals("DATA-2", Bytes.toString(put.get(Bytes.toBytes("CF"),Bytes.toBytes("CQ-2")).get(0).getValue()));
}
}
</programlisting>
<para>This code populates <code>HBaseTestObj</code> with “ROWKEY-1”, “DATA-1”,
“DATA-2” as values. It then inserts the record into the mocked table. The Put
that the DAO would have inserted is captured, and values are tested to verify
that they are what you expected them to be.</para>
<para>The key here is to create the <classname>HTablePool</classname> and table instances outside the
DAO and pass them in. This allows you to mock them cleanly and test Puts as shown above.
Similarly, you can now expand into other operations such as Get, Scan, or
Delete.</para>
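<para>The effect of passing the table into the DAO can also be illustrated without any
mocking library. The following is a minimal sketch in plain Java, with no HBase or Mockito
dependency. The names <code>RecordSink</code>, <code>CapturingSink</code>, and
<code>RecordDao</code> are hypothetical stand-ins for the table interface, the mock plus
captor, and the DAO; they are not part of HBase or of the example above.</para>

```java
import java.util.ArrayList;
import java.util.List;

public class InjectionSketch {
  /** Hypothetical stand-in for the table interface the DAO writes to. */
  interface RecordSink {
    void put(String rowKey, String value);
  }

  /** Hand-written test double: records every call instead of storing data. */
  static class CapturingSink implements RecordSink {
    final List<String[]> calls = new ArrayList<String[]>();

    @Override
    public void put(String rowKey, String value) {
      calls.add(new String[] { rowKey, value });
    }
  }

  /** Hypothetical DAO: it receives its sink as a parameter rather than creating it. */
  static class RecordDao {
    static void insertRecord(RecordSink sink, String rowKey, String value) {
      sink.put(rowKey, value);
    }
  }

  public static void main(String[] args) {
    CapturingSink sink = new CapturingSink();
    RecordDao.insertRecord(sink, "ROWKEY-1", "DATA-1");
    // Inspect what the DAO tried to write, much as an argument captor would.
    String[] call = sink.calls.get(0);
    System.out.println(sink.calls.size() + " call: " + call[0] + " -> " + call[1]);
  }
}
```

<para>Because the DAO never constructs its own sink, the test decides what it talks to.
Mockito generates the equivalent of <code>CapturingSink</code> for you.</para>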
</section>
<section>
<title>MRUnit</title>
<para><link
xlink:href="http://mrunit.apache.org/">Apache MRUnit</link> is a library
that allows you to unit-test MapReduce jobs. You can use it to test HBase jobs
in the same way as other MapReduce jobs.</para>
<para>Given a MapReduce job that writes to an HBase table called
<literal>MyTest</literal>, which has one column family called
<literal>CF</literal>, the reducer of such a job could look like the
following:</para>
<programlisting><![CDATA[
public class MyReducer extends TableReducer<Text, Text, ImmutableBytesWritable> {
public static final byte[] CF = "CF".getBytes();
public static final byte[] QUALIFIER = "CQ-1".getBytes();
public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
//processing to extract the data to be inserted; in our case, let's say we simply
//append all the records we receive from the mapper for this particular
//key and insert one record into HBase
StringBuffer data = new StringBuffer();
Put put = new Put(Bytes.toBytes(key.toString()));
for (Text val : values) {
data = data.append(val);
}
put.add(CF, QUALIFIER, Bytes.toBytes(data.toString()));
//write to HBase
context.write(new ImmutableBytesWritable(Bytes.toBytes(key.toString())), put);
}
} ]]>
</programlisting>
<para>To test this code, the first step is to add a dependency to MRUnit to your
Maven POM file. </para>
<programlisting><![CDATA[
<dependency>
<groupId>org.apache.mrunit</groupId>
<artifactId>mrunit</artifactId>
<version>1.0.0</version>
<scope>test</scope>
</dependency>
]]></programlisting>
<para>Next, use the <classname>ReduceDriver</classname> provided by MRUnit in the test for your reducer.</para>
<programlisting><![CDATA[
public class MyReducerTest {
ReduceDriver<Text, Text, ImmutableBytesWritable, Writable> reduceDriver;
byte[] CF = "CF".getBytes();
byte[] QUALIFIER = "CQ-1".getBytes();
@Before
public void setUp() {
MyReducer reducer = new MyReducer();
reduceDriver = ReduceDriver.newReduceDriver(reducer);
}
@Test
public void testHBaseInsert() throws IOException {
String strKey = "RowKey-1", strValue = "DATA", strValue1 = "DATA1",
strValue2 = "DATA2";
List<Text> list = new ArrayList<Text>();
list.add(new Text(strValue));
list.add(new Text(strValue1));
list.add(new Text(strValue2));
//since in our case all that the reducer is doing is appending the records that the mapper
//sends it, we should get the following back
String expectedOutput = strValue + strValue1 + strValue2;
//Setup Input, mimic what mapper would have passed
//to the reducer and run test
reduceDriver.withInput(new Text(strKey), list);
//run the reducer and get its output
List<Pair<ImmutableBytesWritable, Writable>> result = reduceDriver.run();
//extract key from result and verify
assertEquals(Bytes.toString(result.get(0).getFirst().get()), strKey);
//extract value for CF/QUALIFIER and verify
Put a = (Put)result.get(0).getSecond();
String c = Bytes.toString(a.get(CF, QUALIFIER).get(0).getValue());
assertEquals(expectedOutput,c );
}
}
]]></programlisting>
<para>Your MRUnit test verifies that the output is as expected, the Put that is
inserted into HBase has the correct value, and the ColumnFamily and
ColumnQualifier have the correct values.</para>
<para>MRUnit includes a <classname>MapDriver</classname> to test mappers, and you can use MRUnit to
test other operations, including reading from HBase, processing data, or writing
to HDFS.</para>
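<para>One way to keep such tests small is to move the reducer's processing into a plain
static method that needs neither MRUnit nor Hadoop on the classpath. The following is a
minimal sketch under that assumption. <code>ReducerLogic</code> and
<code>concatenate</code> are hypothetical names, not part of MRUnit or HBase; the method
only mirrors the appending that the example reducer performs before it builds its Put.</para>

```java
import java.util.Arrays;
import java.util.List;

public class ReducerLogic {
  /** Appends the values in iteration order, as the example reducer does. */
  static String concatenate(Iterable<String> values) {
    StringBuilder data = new StringBuilder();
    for (String val : values) {
      data.append(val);
    }
    return data.toString();
  }

  public static void main(String[] args) {
    // Same inputs as the MRUnit test above.
    List<String> values = Arrays.asList("DATA", "DATA1", "DATA2");
    System.out.println(concatenate(values));
  }
}
```

<para>A reducer written this way would call the helper and then wrap the result in a Put,
leaving the MRUnit test to cover only the key and Put that are written to the context.</para>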
</section>
<section>
<title>Integration Testing with an HBase Mini-Cluster</title>
<para>HBase ships with HBaseTestingUtility, which makes it easy to write integration
tests using a <firstterm>mini-cluster</firstterm>. The first step is to add some
dependencies to your Maven POM file. Check the versions to be sure they are
appropriate.</para>
<programlisting><![CDATA[
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.0.0</version>
<type>test-jar</type>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase</artifactId>
<version>0.98.3</version>
<type>test-jar</type>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>2.0.0</version>
<type>test-jar</type>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>2.0.0</version>
<scope>test</scope>
</dependency>
]]></programlisting>
<para>This code represents an integration test for the <code>MyHBaseDAO.insertRecord</code> method shown in <xref
linkend="unit.tests" />.</para>
<programlisting>
public class MyHBaseIntegrationTest {
private static HBaseTestingUtility utility;
byte[] CF = "CF".getBytes();
byte[] CQ1 = "CQ-1".getBytes();
byte[] CQ2 = "CQ-2".getBytes();
@Before
public void setup() throws Exception {
utility = new HBaseTestingUtility();
utility.startMiniCluster();
}
@Test
public void testInsert() throws Exception {
HTableInterface table = utility.createTable(Bytes.toBytes("MyTest"),
Bytes.toBytes("CF"));
HBaseTestObj obj = new HBaseTestObj();
obj.setRowKey("ROWKEY-1");
obj.setData1("DATA-1");
obj.setData2("DATA-2");
MyHBaseDAO.insertRecord(table, obj);
Get get1 = new Get(Bytes.toBytes(obj.getRowKey()));
get1.addColumn(CF, CQ1);
Result result1 = table.get(get1);
assertEquals(Bytes.toString(result1.getRow()), obj.getRowKey());
assertEquals(Bytes.toString(result1.value()), obj.getData1());
Get get2 = new Get(Bytes.toBytes(obj.getRowKey()));
get2.addColumn(CF, CQ2);
Result result2 = table.get(get2);
assertEquals(Bytes.toString(result2.getRow()), obj.getRowKey());
assertEquals(Bytes.toString(result2.value()), obj.getData2());
}
}
</programlisting>
<para>This code creates an HBase mini-cluster and starts it. Next, it creates a
table called <literal>MyTest</literal> with one column family,
<literal>CF</literal>. A record is inserted, a Get is performed from the
same table, and the insertion is verified.</para>
<note>
<para>Starting the mini-cluster takes about 20-30 seconds, but that should be
appropriate for integration testing. </para>
</note>
<para>To use an HBase mini-cluster on Microsoft Windows, you need to use a Cygwin
environment.</para>
<para>See the blog post <link
xlink:href="http://blog.sematext.com/2010/08/30/hbase-case-study-using-hbasetestingutility-for-local-testing-development/">HBase
Case-Study: Using HBaseTestingUtility for Local Testing and
Development</link> (2010) for more information about
HBaseTestingUtility.</para>
</section>
</section> <!-- unit tests -->
<section