HBASE-23552 Format Javadocs on ITBLL

We have this nice description in the Javadoc on ITBLL, but it's
unformatted and thus illegible. Add some formatting so that it can be
read by humans.

Signed-off-by: Jan Hentschel <janh@apache.org>
Signed-off-by: Josh Elser <elserj@apache.org>
Nick Dimiduk 2019-12-09 11:04:47 -08:00 committed by Nick Dimiduk
parent d561130e82
commit a580b1d2e9
1 changed file with 70 additions and 40 deletions


@@ -122,73 +122,103 @@ import org.apache.hbase.thirdparty.org.apache.commons.cli.Options;
 import org.apache.hbase.thirdparty.org.apache.commons.cli.ParseException;
 /**
+ * <p>
  * This is an integration test borrowed from goraci, written by Keith Turner,
  * which is in turn inspired by the Accumulo test called continous ingest (ci).
  * The original source code can be found here:
- * https://github.com/keith-turner/goraci
- * https://github.com/enis/goraci/
- *
+ * <ul>
+ * <li>https://github.com/keith-turner/goraci</li>
+ * <li>https://github.com/enis/goraci/</li>
+ * </ul>
+ * </p>
+ * <p>
  * Apache Accumulo [0] has a simple test suite that verifies that data is not
  * lost at scale. This test suite is called continuous ingest. This test runs
  * many ingest clients that continually create linked lists containing 25
  * million nodes. At some point the clients are stopped and a map reduce job is
- * run to ensure no linked list has a hole. A hole indicates data was lost.  
- *
+ * run to ensure no linked list has a hole. A hole indicates data was lost.
+ * </p>
+ * <p>
  * The nodes in the linked list are random. This causes each linked list to
  * spread across the table. Therefore if one part of a table loses data, then it
  * will be detected by references in another part of the table.
- *
- * THE ANATOMY OF THE TEST
+ * </p>
+ * <p>
+ * <h3>THE ANATOMY OF THE TEST</h3>
  *
  * Below is rough sketch of how data is written. For specific details look at
  * the Generator code.
- *
- * 1 Write out 1 million nodes  2 Flush the client  3 Write out 1 million that
- * reference previous million  4 If this is the 25th set of 1 million nodes,
- * then update 1st set of million to point to last  5 goto 1
- *
+ * </p>
+ * <p>
+ * <ol>
+ * <li>Write out 1 million nodes</li>
+ * <li>Flush the client</li>
+ * <li>Write out 1 million that reference previous million</li>
+ * <li>If this is the 25th set of 1 million nodes, then update 1st set of
+ * million to point to last</li>
+ * <li>goto 1</li>
+ * </ol>
+ * </p>
+ * <p>
  * The key is that nodes only reference flushed nodes. Therefore a node should
  * never reference a missing node, even if the ingest client is killed at any
  * point in time.
- *
+ * </p>
+ * <p>
  * When running this test suite w/ Accumulo there is a script running in
  * parallel called the Aggitator that randomly and continuously kills server
- * processes.  The outcome was that many data loss bugs were found in Accumulo
- * by doing this.  This test suite can also help find bugs that impact uptime
- * and stability when  run for days or weeks.  
- *
- * This test suite consists the following  - a few Java programs  - a little
- * helper script to run the java programs - a maven script to build it.  
- *
+ * processes. The outcome was that many data loss bugs were found in Accumulo
+ * by doing this. This test suite can also help find bugs that impact uptime
+ * and stability when run for days or weeks.
+ * </p>
+ * <p>
+ * This test suite consists the following
+ * <ul>
+ * <li>a few Java programs</li>
+ * <li>a little helper script to run the java programs</li>
+ * <li>a maven script to build it</li>
+ * </ul>
+ * </p>
+ * <p>
  * When generating data, its best to have each map task generate a multiple of
  * 25 million. The reason for this is that circular linked list are generated
  * every 25M. Not generating a multiple in 25M will result in some nodes in the
  * linked list not having references. The loss of an unreferenced node can not
  * be detected.
- *
- *
- * Below is a description of the Java programs
- *
- * Generator - A map only job that generates data. As stated previously, its best to generate data
- * in multiples of 25M. An option is also available to allow concurrent walkers to select and walk
- * random flushed loops during this phase.
- *
- * Verify - A map reduce job that looks for holes. Look at the counts after running. REFERENCED and
- * UNREFERENCED are  ok, any UNDEFINED counts are bad. Do not run at the  same
- * time as the Generator.
- *
- * Walker - A standalone program that start following a linked list  and emits timing info.  
- *
- * Print - A standalone program that prints nodes in the linked list
- *
- * Delete - A standalone program that deletes a single node
+ * </p>
+ * <p>
+ * <h3>Below is a description of the Java programs</h3>
+ * <ul>
+ * <li>
+ * {@code Generator} - A map only job that generates data. As stated previously, its best to
+ * generate data in multiples of 25M. An option is also available to allow concurrent walkers to
+ * select and walk random flushed loops during this phase.
+ * </li>
+ * <li>
+ * {@code Verify} - A map reduce job that looks for holes. Look at the counts after running.
+ * {@code REFERENCED} and {@code UNREFERENCED} are ok, any {@code UNDEFINED} counts are bad. Do not
+ * run at the same time as the Generator.
+ * </li>
+ * <li>
+ * {@code Walker} - A standalone program that start following a linked list and emits timing info.
+ * </li>
+ * <li>
+ * {@code Print} - A standalone program that prints nodes in the linked list
+ * </li>
+ * <li>
+ * {@code Delete} - A standalone program that deletes a single node
+ * </li>
+ * </ul>
  *
  * This class can be run as a unit test, as an integration test, or from the command line
- *
+ * </p>
+ * <p>
  * ex:
+ * <pre>
  * ./hbase org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList
  * loop 2 1 100000 /temp 1 1000 50 1 0
- *
+ * </pre>
+ * </p>
  */
 @Category(IntegrationTests.class)
 public class IntegrationTestBigLinkedList extends IntegrationTestBase {
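The write pattern the reformatted Javadoc describes is easy to see in miniature. The sketch below is not the actual Generator code: it shrinks the batches from 1 million nodes to 1,000, uses a plain HashMap in place of an HBase table, and the names LinkedListIngestSketch, BATCH_SIZE, and NUM_BATCHES are invented for illustration. It shows the two properties the test relies on: new nodes only ever reference the previous, already-flushed batch, and the 25th batch closes every chain into a loop.

import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class LinkedListIngestSketch {
  static final int BATCH_SIZE = 1_000; // stands in for 1 million
  static final int NUM_BATCHES = 25;   // loops are closed every 25 batches

  public static void main(String[] args) {
    Random rnd = new Random(42);
    Map<Long, Long> next = new HashMap<>(); // node key -> key of the node it references
    long[] first = null;
    long[] prev = null;

    for (int b = 0; b < NUM_BATCHES; b++) {
      long[] batch = new long[BATCH_SIZE];
      for (int i = 0; i < BATCH_SIZE; i++) {
        batch[i] = rnd.nextLong(); // random keys spread each list across the table
        // A node only ever references the previous, already-flushed batch, so a
        // crash mid-batch can never leave a reference to a missing node. The very
        // first batch gets a placeholder self-reference that is fixed up below.
        next.put(batch[i], prev == null ? batch[i] : prev[i]);
      }
      // In the real test, the client flushes here before the next batch starts.
      if (first == null) {
        first = batch;
      }
      prev = batch;
    }

    // 25th batch written: point the first batch back at the last, closing the loops.
    for (int i = 0; i < BATCH_SIZE; i++) {
      next.put(first[i], prev[i]);
    }

    // Walk one chain; a null lookup here would be the "hole" Verify searches for.
    long start = first[0];
    long cur = start;
    int steps = 0;
    do {
      Long referenced = next.get(cur);
      if (referenced == null) {
        throw new IllegalStateException("hole at node " + cur + ": data was lost");
      }
      cur = referenced;
      steps++;
    } while (cur != start);
    System.out.println("walked a closed loop of " + steps + " nodes");
  }
}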
@@ -1879,7 +1909,7 @@ public class IntegrationTestBigLinkedList extends IntegrationTestBase {
     System.err.println(" walker " +
       "Standalone program that starts following a linked list & emits timing info.");
     System.err.println(" print Standalone program that prints nodes in the linked list.");
-    System.err.println(" delete Standalone program that deletes a  single node.");
+    System.err.println(" delete Standalone program that deletes a single node.");
     System.err.println(" loop Program to Loop through Generator and Verify steps");
     System.err.println(" clean Program to clean all left over detritus.");
     System.err.println(" search Search for missing keys.");