From d55f4aee4ff7e952eedbd04565e1b5f7b67379f5 Mon Sep 17 00:00:00 2001
From: Misty Stanley-Jones
Date: Mon, 10 Aug 2015 09:40:12 +1000
Subject: [PATCH] HBASE-12615 Document GC conserving guidelines for contributors

Signed-off-by: Jonathan Hsieh
---
 src/main/asciidoc/_chapters/developer.adoc | 258 +++++++++++---------
 1 file changed, 136 insertions(+), 122 deletions(-)

diff --git a/src/main/asciidoc/_chapters/developer.adoc b/src/main/asciidoc/_chapters/developer.adoc
index f96d87e332c..5d1af6c07e5 100644
--- a/src/main/asciidoc/_chapters/developer.adoc
+++ b/src/main/asciidoc/_chapters/developer.adoc
@@ -40,14 +40,14 @@ See link:http://search-hadoop.com/m/DHED43re96[What label

 Before you get started submitting code to HBase, please refer to <>.

-As Apache HBase is an Apache Software Foundation project, see <> for more information about how the ASF functions.
+As Apache HBase is an Apache Software Foundation project, see <> for more information about how the ASF functions.

 [[mailing.list]]
 === Mailing Lists

 Sign up for the dev-list and the user-list.
 See the link:http://hbase.apache.org/mail-lists.html[mailing lists] page.
-Posing questions - and helping to answer other people's questions - is encouraged! There are varying levels of experience on both lists so patience and politeness are encouraged (and please stay on topic.)
+Posing questions - and helping to answer other people's questions - is encouraged! There are varying levels of experience on both lists so patience and politeness are encouraged (and please stay on topic.)

 [[irc]]
 === Internet Relay Chat (IRC)
@@ -58,7 +58,7 @@ FreeNode offers a web-based client, but most people prefer a native client, and
 === Jira

 Check for existing issues in link:https://issues.apache.org/jira/browse/HBASE[Jira].
-If it's either a new feature request, enhancement, or a bug, file a ticket.
+If it's either a new feature request, enhancement, or a bug, file a ticket.
 To check for existing issues which you can tackle as a beginner, search for link:https://issues.apache.org/jira/issues/?jql=project%20%3D%20HBASE%20AND%20labels%20in%20(beginner)[issues in JIRA tagged with the label 'beginner'].

@@ -137,26 +137,26 @@ Eclipse Indigo or newer includes +m2eclipse+, or you can download it from link:h
 To import the project, click and select the HBase root directory.
 `m2eclipse` locates all the hbase modules for you.

-If you install +m2eclipse+ and import HBase in your workspace, do the following to fix your eclipse Build Path.
+If you install +m2eclipse+ and import HBase in your workspace, do the following to fix your eclipse Build Path.

 . Remove _target_ folder
 . Add _target/generated-jamon_ and _target/generated-sources/java_ folders.
 . Remove from your Build Path the exclusions on the _src/main/resources_ and _src/test/resources_ to avoid error message in the console, such as the following:
 +
 ----
-Failed to execute goal
+Failed to execute goal
 org.apache.maven.plugins:maven-antrun-plugin:1.6:run (default) on project hbase:
-'An Ant BuildException has occured: Replace: source file .../target/classes/hbase-default.xml
+'An Ant BuildException has occured: Replace: source file .../target/classes/hbase-default.xml
 doesn't exist
 ----
 +
-This will also reduce the eclipse build cycles and make your life easier when developing.
+This will also reduce the eclipse build cycles and make your life easier when developing.

 [[eclipse.commandline]]
 ==== HBase Project Setup in Eclipse Using the Command Line

-Instead of using `m2eclipse`, you can generate the Eclipse files from the command line.
+Instead of using `m2eclipse`, you can generate the Eclipse files from the command line.

 . First, run the following command, which builds HBase.
   You only need to do this once.
@@ -181,7 +181,7 @@ mvn eclipse:eclipse
 The `$M2_REPO` classpath variable needs to be set up for the project.
 This needs to be set to your local Maven repository, which is usually _~/.m2/repository_

-If this classpath variable is not configured, you will see compile errors in Eclipse like this:
+If this classpath variable is not configured, you will see compile errors in Eclipse like this:

 ----

@@ -209,14 +209,14 @@ Access restriction: The method getLong(Object, long) from the type Unsafe is not
 [[eclipse.more]]
 ==== Eclipse - More Information

-For additional information on setting up Eclipse for HBase development on Windows, see link:http://michaelmorello.blogspot.com/2011/09/hbase-subversion-eclipse-windows.html[Michael Morello's blog] on the topic.
+For additional information on setting up Eclipse for HBase development on Windows, see link:http://michaelmorello.blogspot.com/2011/09/hbase-subversion-eclipse-windows.html[Michael Morello's blog] on the topic.

 === IntelliJ IDEA

 You can set up IntelliJ IDEA for similar functinoality as Eclipse.
 Follow these steps.

-. Select
+. Select
 . You do not need to select a profile.
   Be sure [label]#Maven project required# is selected, and click btn:[Next].

@@ -244,13 +244,13 @@ To check your Maven version, run the command +mvn -version+.
 [NOTE]
 ====
 Starting with HBase 1.0 you must use Java 7 or later to build from source code.
-See <> for more complete information about supported JDK versions.
+See <> for more complete information about supported JDK versions.
 ====

 [[maven.build.commands]]
 ==== Maven Build Commands

-All commands are executed from the local HBase project directory.
+All commands are executed from the local HBase project directory.

 ===== Package

@@ -269,7 +269,7 @@ mvn clean package -DskipTests
 ----

 With Eclipse set up as explained above in <>, you can also use the menu:Build[] command in Eclipse.
-To create the full installable HBase package takes a little bit more work, so read on.
+To create the full installable HBase package takes a little bit more work, so read on.
 [[maven.build.commands.compile]]
 ===== Compile
@@ -335,7 +335,7 @@ This seems to be a maven pecularity that is probably fixable but we've not spent
 ====

 Similarly, for 3.0, you would just replace the profile value.
-Note that Hadoop-3.0.0-SNAPSHOT does not currently have a deployed maven artificat - you will need to build and install your own in your local maven repository if you want to run against this profile.
+Note that Hadoop-3.0.0-SNAPSHOT does not currently have a deployed maven artificat - you will need to build and install your own in your local maven repository if you want to run against this profile.

 In earilier versions of Apache HBase, you can build against older versions of Apache Hadoop, notably, Hadoop 0.22.x and 0.23.x.
 If you are running, for example HBase-0.94 and wanted to build against Hadoop 0.23.x, you would run with:
@@ -367,7 +367,7 @@ You may also want to define `protoc.path` for the protoc binary, using the follo
 mvn compile -Pcompile-protobuf -Dprotoc.path=/opt/local/bin/protoc
 ----

-Read the _hbase-protocol/README.txt_ for more details.
+Read the _hbase-protocol/README.txt_ for more details.

 [[build.thrift]]
 ==== Build Thrift
@@ -417,7 +417,7 @@ mvn -DskipTests package assembly:single deploy
 If you see `Unable to find resource 'VM_global_library.vm'`, ignore it.
 Its not an error.
 It is link:http://jira.codehaus.org/browse/MSITE-286[officially
- ugly] though.
+ ugly] though.

 [[build.snappy]]
 ==== Building in snappy compression support
@@ -440,7 +440,7 @@ See <> for Java requirements per HBase release.
 HBase 0.96.x will run on Hadoop 1.x or Hadoop 2.x.
 HBase 0.98 still runs on both, but HBase 0.98 deprecates use of Hadoop 1.
 HBase 1.x will _not_ run on Hadoop 1.
-In the following procedures, we make a distinction between HBase 1.x builds and the awkward process involved building HBase 0.96/0.98 for either Hadoop 1 or Hadoop 2 targets.
+In the following procedures, we make a distinction between HBase 1.x builds and the awkward process involved building HBase 0.96/0.98 for either Hadoop 1 or Hadoop 2 targets.
 You must choose which Hadoop to build against.
 It is not possible to build a single HBase binary that runs against both Hadoop 1 and Hadoop 2.

@@ -507,22 +507,22 @@ For the build to sign them for you, you a properly configured _settings.xml_ in

 NOTE: These instructions are for building HBase 1.0.x.
 For building earlier versions, the process is different.
-See this section under the respective release documentation folders.
+See this section under the respective release documentation folders.

 .Point Releases
 If you are making a point release (for example to quickly address a critical incompatability or security problem) off of a release branch instead of a development branch, the tagging instructions are slightly different.
-I'll prefix those special steps with _Point Release Only_.
+I'll prefix those special steps with _Point Release Only_.

 .Before You Begin
 Before you make a release candidate, do a practice run by deploying a snapshot.
 Before you start, check to be sure recent builds have been passing for the branch from where you are going to take your release.
-You should also have tried recent branch tips out on a cluster under load, perhaps by running the `hbase-it` integration test suite for a few hours to 'burn in' the near-candidate bits.
+You should also have tried recent branch tips out on a cluster under load, perhaps by running the `hbase-it` integration test suite for a few hours to 'burn in' the near-candidate bits.

 .Point Release Only
 [NOTE]
 ====
 At this point you should tag the previous release branch (ex: 0.96.1) with the new point release tag (e.g.
-0.96.1.1 tag). Any commits with changes for the point release should be appled to the new tag.
+0.96.1.1 tag). Any commits with changes for the point release should be appled to the new tag.
 ====

 The Hadoop link:http://wiki.apache.org/hadoop/HowToRelease[How To
@@ -590,7 +590,7 @@ Extract the tarball and make sure it looks good.
 A good test for the src tarball being 'complete' is to see if you can build new tarballs from this source bundle.
 If the source tarball is good, save it off to a _version directory_, a directory somewhere where you are collecting all of the tarballs you will publish as part of the release candidate.
 For example if you were building a hbase-0.96.0 release candidate, you might call the directory _hbase-0.96.0RC0_.
-Later you will publish this directory as our release candidate up on http://people.apache.org/~YOU.
+Later you will publish this directory as our release candidate up on http://people.apache.org/~YOU.

 . Build the binary tarball.
 +
@@ -621,7 +621,7 @@ It seems that you need the install goal in both steps.
 +
 Extract the generated tarball and check it out.
 Look at the documentation, see if it runs, etc.
-If good, copy the tarball to the above mentioned _version directory_.
+If good, copy the tarball to the above mentioned _version directory_.

 . Create a new tag.
 +
@@ -647,7 +647,7 @@ $ mvn deploy -DskipTests -Papache-release -Prelease
 ----
 +
 This command copies all artifacts up to a temporary staging Apache mvn repository in an 'open' state.
-More work needs to be done on these maven artifacts to make them generally available.
+More work needs to be done on these maven artifacts to make them generally available.
 +
 We do not release HBase tarball to the Apache Maven repository. To avoid deploying the tarball, do not include the `assembly:single` goal in your `mvn deploy` command. Check the deployed artifacts as described in the next section.

@@ -672,7 +672,7 @@ If the published artifacts are incomplete or have problems, just delete the 'ope
 See the link:https://github.com/saintstack/hbase-downstreamer[hbase-downstreamer] test for a simple example of a project that is downstream of HBase an depends on it.
 Check it out and run its simple test to make sure maven artifacts are properly deployed to the maven repository.
 Be sure to edit the pom to point to the proper staging repository.
-Make sure you are pulling from the repository when tests run and that you are not getting from your local repository, by either passing the `-U` flag or deleting your local repo content and check maven is pulling from remote out of the staging repository.
+Make sure you are pulling from the repository when tests run and that you are not getting from your local repository, by either passing the `-U` flag or deleting your local repo content and check maven is pulling from remote out of the staging repository.
 ====
 +
 See link:http://www.apache.org/dev/publishing-maven-artifacts.html[Publishing Maven Artifacts] for some pointers on this maven staging process.
@@ -680,7 +680,7 @@ See link:http://www.apache.org/dev/publishing-maven-artifacts.html[Publishing Ma
 NOTE: We no longer publish using the maven release plugin.
 Instead we do +mvn deploy+.
 It seems to give us a backdoor to maven release publishing.
-If there is no _-SNAPSHOT_ on the version string, then we are 'deployed' to the apache maven repository staging directory from which we can publish URLs for candidates and later, if they pass, publish as release (if a _-SNAPSHOT_ on the version string, deploy will put the artifacts up into apache snapshot repos).
+If there is no _-SNAPSHOT_ on the version string, then we are 'deployed' to the apache maven repository staging directory from which we can publish URLs for candidates and later, if they pass, publish as release (if a _-SNAPSHOT_ on the version string, deploy will put the artifacts up into apache snapshot repos).
 +
 If the HBase version ends in `-SNAPSHOT`, the artifacts go elsewhere.
 They are put into the Apache snapshots repository directly and are immediately available.
@@ -694,7 +694,7 @@ These are publicly accessible in a temporary staging repository whose URL you sh
 The above mentioned script, _make_rc.sh_ does all of the above for you minus the check of the artifacts built, the closing of the staging repository up in maven, and the tagging of the release.
 If you run the script, do your checks at this stage verifying the src and bin tarballs and checking what is up in staging using hbase-downstreamer project.
 Tag before you start the build.
-You can always delete it if the build goes haywire.
+You can always delete it if the build goes haywire.

 . Sign, upload, and 'stage' your version directory to link:http://people.apache.org[people.apache.org] (TODO:
   There is a new location to stage releases using svnpubsub. See
@@ -702,7 +702,7 @@ You can always delete it if the build goes haywire.
 +
 If all checks out, next put the _version directory_ up on link:http://people.apache.org[people.apache.org].
 You will need to sign and fingerprint them before you push them up.
-In the _version directory_ run the following commands:
+In the _version directory_ run the following commands:
 +
 [source,bourne]
 ----
@@ -715,7 +715,7 @@ $ rsync -av 0.96.0RC0 people.apache.org:public_html
 ----
 +
 Make sure the link:http://people.apache.org[people.apache.org] directory is showing and that the mvn repo URLs are good.
-Announce the release candidate on the mailing list and call a vote.
+Announce the release candidate on the mailing list and call a vote.


 [[maven.snapshot]]
@@ -734,7 +734,7 @@ Following is an example of publishing SNAPSHOTS of a release that had an hbase v

 The _make_rc.sh_ script mentioned above (see <>) can help you publish `SNAPSHOTS`.
 Make sure your `hbase.version` has a `-SNAPSHOT` suffix before running the script.
-It will put a snapshot up into the apache snapshot repository for you.
+It will put a snapshot up into the apache snapshot repository for you.
 [[hbase.rc.voting]]
 == Voting on Release Candidates
@@ -749,7 +749,7 @@ PMC members, please read this WIP doc on policy voting for a release candidate,
 requirements of the ASF policy on releases._ Regards the latter, run +mvn apache-rat:check+ to verify all files are suitably licensed.
 See link:http://search-hadoop.com/m/DHED4dhFaU[HBase, mail # dev - On recent discussion clarifying ASF release policy].
-for how we arrived at this process.
+for how we arrived at this process.

 [[documentation]]
 == Generating the HBase Reference Guide
@@ -757,7 +757,7 @@ for how we arrived at this process.
 The manual is marked up using Asciidoc.
 We then use the link:http://asciidoctor.org/docs/asciidoctor-maven-plugin/[Asciidoctor maven plugin] to transform the markup to html.
 This plugin is run when you specify the +site+ goal as in when you run +mvn site+.
-See <> for more information on building the documentation.
+See <> for more information on building the documentation.

 [[hbase.org]]
 == Updating link:http://hbase.apache.org[hbase.apache.org]
@@ -811,7 +811,7 @@ For any other module, for example `hbase-common`, the tests must be strict unit

 The HBase shell and its tests are predominantly written in jruby.
 In order to make these tests run as a part of the standard build, there is a single JUnit test, `TestShell`, that takes care of loading the jruby implemented tests and running them.
-You can run all of these tests from the top level with:
+You can run all of these tests from the top level with:

 [source,bourne]
 ----
@@ -821,7 +821,7 @@ You can run all of these tests from the top level with:

 Alternatively, you may limit the shell tests that run using the system variable `shell.test`.
 This value should specify the ruby literal equivalent of a particular test case by name.
-For example, the tests that cover the shell commands for altering tables are contained in the test case `AdminAlterTableTest` and you can run them with:
+For example, the tests that cover the shell commands for altering tables are contained in the test case `AdminAlterTableTest` and you can run them with:

 [source,bourne]
 ----
@@ -831,7 +831,7 @@ For example, the tests that cover the shell commands for altering tables are con

 You may also use a link:http://docs.ruby-doc.com/docs/ProgrammingRuby/html/language.html#UJ[Ruby Regular Expression literal] (in the `/pattern/` style) to select a set of test cases.
-You can run all of the HBase admin related tests, including both the normal administration and the security administration, with the command:
+You can run all of the HBase admin related tests, including both the normal administration and the security administration, with the command:

 [source,bourne]
 ----
@@ -839,7 +839,7 @@ You can run all of the HBase admin related tests, including both the normal admi
 mvn clean test -Dtest=TestShell -Dshell.test=/.*Admin.*Test/
 ----

-In the event of a test failure, you can see details by examining the XML version of the surefire report results
+In the event of a test failure, you can see details by examining the XML version of the surefire report results

 [source,bourne]
 ----
@@ -897,7 +897,7 @@ public class TestHRegionInfo {
 ----

 The above example shows how to mark a unit test as belonging to the `small` category.
-All unit tests in HBase have a categorization.
+All unit tests in HBase have a categorization.

 The first three categories, `small`, `medium`, and `large`, are for tests run when you type `$ mvn test`.
 In other words, these three categorizations are for HBase unit tests.
@@ -905,9 +905,9 @@ The `integration` category is not for unit tests, but for integration tests.
 These are run when you invoke `$ mvn verify`.
 Integration tests are described in <>.
-HBase uses a patched maven surefire plugin and maven profiles to implement its unit test characterizations.
+HBase uses a patched maven surefire plugin and maven profiles to implement its unit test characterizations.

-Keep reading to figure which annotation of the set small, medium, and large to put on your new HBase unit test.
+Keep reading to figure which annotation of the set small, medium, and large to put on your new HBase unit test.

 .Categorizing Tests
 Small Tests (((SmallTests)))::
@@ -919,28 +919,28 @@ Medium Tests (((MediumTests)))::
  _Medium_ tests represent tests that must be executed before proposing a patch.
  They are designed to run in less than 30 minutes altogether, and are quite stable in their results.
  They are designed to last less than 50 seconds individually.
- They can use a cluster, and each of them is executed in a separate JVM.
+ They can use a cluster, and each of them is executed in a separate JVM.

 Large Tests (((LargeTests)))::
  _Large_ tests are everything else.
  They are typically large-scale tests, regression tests for specific bugs, timeout tests, performance tests.
  They are executed before a commit on the pre-integration machines.
- They can be run on the developer machine as well.
+ They can be run on the developer machine as well.

 Integration Tests (((IntegrationTests)))::
  _Integration_ tests are system level tests.
- See <> for more info.
+ See <> for more info.

 [[hbase.unittests.cmds]]
 === Running tests

 [[hbase.unittests.cmds.test]]
-==== Default: small and medium category tests
+==== Default: small and medium category tests

 Running `mvn test` will execute all small tests in a single JVM (no fork) and then medium tests in a separate JVM for each test instance.
 Medium tests are NOT executed if there is an error in a small test.
 Large tests are NOT executed.
-There is one report for small tests, and one report for medium tests if they are executed.
+There is one report for small tests, and one report for medium tests if they are executed.

 [[hbase.unittests.cmds.test.runalltests]]
 ==== Running all tests
@@ -948,38 +948,38 @@ There is one report for small tests, and one report for medium tests if they are
 Running `mvn test -P runAllTests` will execute small tests in a single JVM then medium and large tests in a separate JVM for each test.
 Medium and large tests are NOT executed if there is an error in a small test.
 Large tests are NOT executed if there is an error in a small or medium test.
-There is one report for small tests, and one report for medium and large tests if they are executed.
+There is one report for small tests, and one report for medium and large tests if they are executed.

 [[hbase.unittests.cmds.test.localtests.mytest]]
 ==== Running a single test or all tests in a package

-To run an individual test, e.g. `MyTest`, rum `mvn test -Dtest=MyTest` You can also pass multiple, individual tests as a comma-delimited list:
+To run an individual test, e.g. `MyTest`, rum `mvn test -Dtest=MyTest` You can also pass multiple, individual tests as a comma-delimited list:
 [source,bash]
 ----
 mvn test -Dtest=MyTest1,MyTest2,MyTest3
 ----
-You can also pass a package, which will run all tests under the package:
+You can also pass a package, which will run all tests under the package:
 [source,bash]
 ----
 mvn test '-Dtest=org.apache.hadoop.hbase.client.*'
-----
+----

 When `-Dtest` is specified, the `localTests` profile will be used.
 It will use the official release of maven surefire, rather than our custom surefire plugin, and the old connector (The HBase build uses a patched version of the maven surefire plugin). Each junit test is executed in a separate JVM (A fork per test class). There is no parallelization when tests are running in this mode.
 You will see a new message at the end of the -report: `"[INFO] Tests are skipped"`.
 It's harmless.
-However, you need to make sure the sum of `Tests run:` in the `Results:` section of test reports matching the number of tests you specified because no error will be reported when a non-existent test case is specified.
+However, you need to make sure the sum of `Tests run:` in the `Results:` section of test reports matching the number of tests you specified because no error will be reported when a non-existent test case is specified.

 [[hbase.unittests.cmds.test.profiles]]
 ==== Other test invocation permutations

-Running `mvn test -P runSmallTests` will execute "small" tests only, using a single JVM.
+Running `mvn test -P runSmallTests` will execute "small" tests only, using a single JVM.

-Running `mvn test -P runMediumTests` will execute "medium" tests only, launching a new JVM for each test-class.
+Running `mvn test -P runMediumTests` will execute "medium" tests only, launching a new JVM for each test-class.

-Running `mvn test -P runLargeTests` will execute "large" tests only, launching a new JVM for each test-class.
+Running `mvn test -P runLargeTests` will execute "large" tests only, launching a new JVM for each test-class.

-For convenience, you can run `mvn test -P runDevTests` to execute both small and medium tests, using a single JVM.
+For convenience, you can run `mvn test -P runDevTests` to execute both small and medium tests, using a single JVM.

 [[hbase.unittests.test.faster]]
 ==== Running tests faster
@@ -1001,7 +1001,7 @@ $ sudo mkdir /ram2G
 sudo mount -t tmpfs -o size=2048M tmpfs /ram2G
 ----

-You can then use it to run all HBase tests on 2.0 with the command:
+You can then use it to run all HBase tests on 2.0 with the command:

 ----
 mvn test
@@ -1009,7 +1009,7 @@ mvn test
 -Dtest.build.data.basedirectory=/ram2G
 ----

-On earlier versions, use:
+On earlier versions, use:

 ----
 mvn test
@@ -1028,7 +1028,7 @@ It must be executed from the directory which contains the _pom.xml_.
 For example running +./dev-support/hbasetests.sh+ will execute small and medium tests.
 Running +./dev-support/hbasetests.sh runAllTests+ will execute all tests.
-Running +./dev-support/hbasetests.sh replayFailed+ will rerun the failed tests a second time, in a separate jvm and without parallelisation.
+Running +./dev-support/hbasetests.sh replayFailed+ will rerun the failed tests a second time, in a separate jvm and without parallelisation.

 [[hbase.unittests.resource.checker]]
 ==== Test Resource Checker(((Test ResourceChecker)))

@@ -1038,7 +1038,7 @@ Check the _*-out.txt_ files). The resources counted are the number of threads, t
 If the number has increased, it adds a _LEAK?_ comment in the logs.
 As you can have an HBase instance running in the background, some threads can be deleted/created without any specific action in the test.
 However, if the test does not work as expected, or if the test should not impact these resources, it's worth checking these log lines [computeroutput]+...hbase.ResourceChecker(157): before...+ and [computeroutput]+...hbase.ResourceChecker(157): after...+.
-For example:
+For example:

 ----
 2012-09-26 09:22:15,315 INFO [pool-1-thread-1]
@@ -1081,10 +1081,10 @@ This allows understanding what the test is waiting for.
 Moreover, the test will work whatever the machine performance is.
 Sleep should be minimal to be as fast as possible.
 Waiting for a variable should be done in a 40ms sleep loop.
-Waiting for a socket operation should be done in a 200 ms sleep loop.
+Waiting for a socket operation should be done in a 200 ms sleep loop.

 [[hbase.tests.cluster]]
-==== Tests using a cluster
+==== Tests using a cluster

 Tests using a HRegion do not have to start a cluster: A region can use the local file system.
 Start/stopping a cluster cost around 10 seconds.
@@ -1092,7 +1092,7 @@ They should not be started per test method but per test class.
 Started cluster must be shutdown using [method]+HBaseTestingUtility#shutdownMiniCluster+, which cleans the directories.
 As most as possible, tests should use the default settings for the cluster.
 When they don't, they should document it.
-This will allow to share the cluster later.
+This will allow to share the cluster later.

 [[integration.tests]]
 === Integration Tests
@@ -1100,16 +1100,16 @@ This will allow to share the cluster later.
 HBase integration/system tests are tests that are beyond HBase unit tests.
 They are generally long-lasting, sizeable (the test can be asked to 1M rows or 1B rows), targetable (they can take configuration that will point them at the ready-made cluster they are to run against; integration tests do not include cluster start/stop code), and verifying success, integration tests rely on public APIs only; they do not attempt to examine server internals asserting success/fail.
 Integration tests are what you would run when you need to more elaborate proofing of a release candidate beyond what unit tests can do.
-They are not generally run on the Apache Continuous Integration build server, however, some sites opt to run integration tests as a part of their continuous testing on an actual cluster.
+They are not generally run on the Apache Continuous Integration build server, however, some sites opt to run integration tests as a part of their continuous testing on an actual cluster.

 Integration tests currently live under the _src/test_ directory in the hbase-it submodule and will match the regex: _**/IntegrationTest*.java_.
-All integration tests are also annotated with `@Category(IntegrationTests.class)`.
+All integration tests are also annotated with `@Category(IntegrationTests.class)`.

 Integration tests can be run in two modes: using a mini cluster, or against an actual distributed cluster.
 Maven failsafe is used to run the tests using the mini cluster.
 IntegrationTestsDriver class is used for executing the tests against a distributed cluster.
 Integration tests SHOULD NOT assume that they are running against a mini cluster, and SHOULD NOT use private API's to access cluster state.
-To interact with the distributed or mini cluster uniformly, `IntegrationTestingUtility`, and `HBaseCluster` classes, and public client API's can be used.
+To interact with the distributed or mini cluster uniformly, `IntegrationTestingUtility`, and `HBaseCluster` classes, and public client API's can be used.

 On a distributed cluster, integration tests that use ChaosMonkey or otherwise manipulate services thru cluster manager (e.g. restart regionservers) use SSH to do it.
@@ -1123,15 +1123,15 @@ The argument 1 (%1$s) is SSH options set the via opts setting or via environment
 ----
 /usr/bin/ssh %1$s %2$s%3$s%4$s "su hbase - -c \"%5$s\""
 ----
-That way, to kill RS (for example) integration tests may run:
+That way, to kill RS (for example) integration tests may run:
 [source,bash]
 ----
 {/usr/bin/ssh some-hostname "su hbase - -c \"ps aux | ... | kill ...\""}
 ----
-The command is logged in the test logs, so you can verify it is correct for your environment.
+The command is logged in the test logs, so you can verify it is correct for your environment.

 To disable the running of Integration Tests, pass the following profile on the command line `-PskipIntegrationTests`.
-For example,
+For example,
 [source]
 ----
 $ mvn clean install test -Dtest=TestZooKeeper -PskipIntegrationTests
@@ -1153,9 +1153,9 @@ mvn verify
 ----

 If you just want to run the integration tests in top-level, you need to run two commands.
-First: +mvn failsafe:integration-test+ This actually runs ALL the integration tests.
+First: +mvn failsafe:integration-test+ This actually runs ALL the integration tests.

-NOTE: This command will always output `BUILD SUCCESS` even if there are test failures.
+NOTE: This command will always output `BUILD SUCCESS` even if there are test failures.
 At this point, you could grep the output by hand looking for failed tests.
 However, maven will do this for us; just use: +mvn
@@ -1167,18 +1167,18 @@ However, maven will do this for us; just use: +mvn
 This is very similar to how you specify running a subset of unit tests (see above), but use the property `it.test` instead of `test`.
 To just run `IntegrationTestClassXYZ.java`, use: +mvn failsafe:integration-test -Dit.test=IntegrationTestClassXYZ+ The next thing you might want to do is run groups of integration tests, say all integration tests that are named IntegrationTestClassX*.java: +mvn failsafe:integration-test -Dit.test=*ClassX*+ This runs everything that is an integration test that matches *ClassX*. This means anything matching: "**/IntegrationTest*ClassX*".
 You can also run multiple groups of integration tests using comma-delimited lists (similar to unit tests). Using a list of matches still supports full regex matching for each of the groups.This would look something like: +mvn
- failsafe:integration-test -Dit.test=*ClassX*, *ClassY+
+ failsafe:integration-test -Dit.test=*ClassX*, *ClassY+

 [[maven.build.commands.integration.tests.distributed]]
 ==== Running integration tests against distributed cluster

 If you have an already-setup HBase cluster, you can launch the integration tests by invoking the class `IntegrationTestsDriver`.
 You may have to run test-compile first.
-The configuration will be picked by the bin/hbase script.
+The configuration will be picked by the bin/hbase script.
 [source,bourne]
 ----
 mvn test-compile
-----
+----
 Then launch the tests with:

 [source,bourne]
@@ -1191,14 +1191,14 @@ Running the IntegrationTestsDriver without any argument will launch tests found
 See the usage, by passing -h, to see how to filter test classes.
 You can pass a regex which is checked against the full class name; so, part of class name can be used.
 IntegrationTestsDriver uses Junit to run the tests.
-Currently there is no support for running integration tests against a distributed cluster using maven (see link:https://issues.apache.org/jira/browse/HBASE-6201[HBASE-6201]). +Currently there is no support for running integration tests against a distributed cluster using maven (see link:https://issues.apache.org/jira/browse/HBASE-6201[HBASE-6201]). The tests interact with the distributed cluster by using the methods in the `DistributedHBaseCluster` (implementing `HBaseCluster`) class, which in turn uses a pluggable `ClusterManager`. Concrete implementations provide actual functionality for carrying out deployment-specific and environment-dependent tasks (SSH, etc). The default `ClusterManager` is `HBaseClusterManager`, which uses SSH to remotely execute start/stop/kill/signal commands, and assumes some posix commands (ps, etc). Also assumes the user running the test has enough "power" to start/stop servers on the remote machines. By default, it picks up `HBASE_SSH_OPTS`, `HBASE_HOME`, `HBASE_CONF_DIR` from the env, and uses `bin/hbase-daemon.sh` to carry out the actions. Currently tarball deployments, deployments which uses _hbase-daemons.sh_, and link:http://incubator.apache.org/ambari/[Apache Ambari] deployments are supported. _/etc/init.d/_ scripts are not supported for now, but it can be easily added. -For other deployment options, a ClusterManager can be implemented and plugged in. +For other deployment options, a ClusterManager can be implemented and plugged in. [[maven.build.commands.integration.tests.destructive]] ==== Destructive integration / system tests @@ -1206,7 +1206,7 @@ For other deployment options, a ClusterManager can be implemented and plugged in In 0.96, a tool named `ChaosMonkey` has been introduced. It is modeled after the link:http://techblog.netflix.com/2012/07/chaos-monkey-released-into-wild.html[same-named tool by Netflix]. 
Some of the tests use ChaosMonkey to simulate faults in the running cluster in the way of killing random servers, disconnecting servers, etc. -ChaosMonkey can also be used as a stand-alone tool to run a (misbehaving) policy while you are running other tests. +ChaosMonkey can also be used as a stand-alone tool to run a (misbehaving) policy while you are running other tests. ChaosMonkey defines Action's and Policy's. Actions are sequences of events. @@ -1223,7 +1223,7 @@ We have at least the following actions: Policies on the other hand are responsible for executing the actions based on a strategy. The default policy is to execute a random action every minute based on predefined action weights. ChaosMonkey executes predefined named policies until it is stopped. -More than one policy can be active at any time. +More than one policy can be active at any time. To run ChaosMonkey as a standalone tool deploy your HBase cluster as usual. ChaosMonkey uses the configuration from the bin/hbase script, thus no extra configuration needs to be done. @@ -1234,7 +1234,7 @@ You can invoke the ChaosMonkey by running: bin/hbase org.apache.hadoop.hbase.util.ChaosMonkey ---- -This will output smt like: +This will output something like: ---- @@ -1276,7 +1276,7 @@ This will output smt like: ---- As you can see from the log, ChaosMonkey started the default PeriodicRandomActionPolicy, which is configured with all the available actions, and ran RestartActiveMaster and RestartRandomRs actions. -ChaosMonkey tool, if run from command line, will keep on running until the process is killed. +ChaosMonkey tool, if run from command line, will keep on running until the process is killed. [[chaos.monkey.properties]] ==== Passing individual Chaos Monkey per-test Settings/Properties @@ -1348,15 +1348,15 @@ NOTE: End-of-life releases are not included in this list. [[code.standards]] === Code Standards -See <> and <>. +See <> and <>. 
==== Interface Classifications Interfaces are classified both by audience and by stability level. These labels appear at the head of a class. -The conventions followed by HBase are inherited by its parent project, Hadoop. +The conventions followed by HBase are inherited by its parent project, Hadoop. -The following interface classifications are commonly used: +The following interface classifications are commonly used: .InterfaceAudience `@InterfaceAudience.Public`:: @@ -1380,7 +1380,7 @@ No `@InterfaceAudience` Classification:: .Excluding Non-Public Interfaces from API Documentation [NOTE] ==== -Only interfaces classified `@InterfaceAudience.Public` should be included in API documentation (Javadoc). Committers must add new package excludes `ExcludePackageNames` section of the _pom.xml_ for new packages which do not contain public classes. +Only interfaces classified `@InterfaceAudience.Public` should be included in API documentation (Javadoc). Committers must add new package excludes `ExcludePackageNames` section of the _pom.xml_ for new packages which do not contain public classes. ==== .@InterfaceStability @@ -1398,7 +1398,7 @@ Only interfaces classified `@InterfaceAudience.Public` should be included in API No `@InterfaceStability` Label:: Public classes with no `@InterfaceStability` label are discouraged, and should be considered implicitly unstable. -If you are unclear about how to mark packages, ask on the development list. +If you are unclear about how to mark packages, ask on the development list. [[common.patch.feedback]] ==== Code Formatting Conventions @@ -1501,7 +1501,7 @@ Don't forget Javadoc! Javadoc warnings are checked during precommit. If the precommit tool gives you a '-1', please fix the javadoc issue. -Your patch won't be committed if it adds such warnings. +Your patch won't be committed if it adds such warnings. 
[[common.patch.feedback.findbugs]] ===== Findbugs @@ -1521,7 +1521,7 @@ value="HE_EQUALS_USE_HASHCODE", justification="I know what I'm doing") ---- -It is important to use the Apache-licensed version of the annotations. +It is important to use the Apache-licensed version of the annotations. [[common.patch.feedback.javadoc.defaults]] ===== Javadoc - Useless Defaults @@ -1545,14 +1545,14 @@ The preference is to add something descriptive and useful. [[common.patch.feedback.onething]] ===== One Thing At A Time, Folks -If you submit a patch for one thing, don't do auto-reformatting or unrelated reformatting of code on a completely different area of code. +If you submit a patch for one thing, don't do auto-reformatting or unrelated reformatting of code on a completely different area of code. -Likewise, don't add unrelated cleanup or refactorings outside the scope of your Jira. +Likewise, don't add unrelated cleanup or refactorings outside the scope of your Jira. [[common.patch.feedback.tests]] ===== Ambigious Unit Tests -Make sure that you're clear about what you are testing in your unit tests and why. +Make sure that you're clear about what you are testing in your unit tests and why. [[common.patch.feedback.writable]] ===== Implementing Writable @@ -1560,17 +1560,31 @@ Make sure that you're clear about what you are testing in your unit tests and wh .Applies pre-0.96 only [NOTE] ==== -In 0.96, HBase moved to protocol buffers (protobufs). The below section on Writables applies to 0.94.x and previous, not to 0.96 and beyond. +In 0.96, HBase moved to protocol buffers (protobufs). The below section on Writables applies to 0.94.x and previous, not to 0.96 and beyond. ==== Every class returned by RegionServers must implement the `Writable` interface. -If you are creating a new class that needs to implement this interface, do not forget the default constructor. +If you are creating a new class that needs to implement this interface, do not forget the default constructor. 
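As a rough illustration of the default-constructor point above (pre-0.96 only), here is a minimal sketch. It assumes that the receiving side creates an empty instance reflectively and then calls `readFields()` on it. The `RegionCount` class and the `roundTrip` helper are hypothetical, and the nested `Writable` interface is a local stand-in that only mirrors the two methods of `org.apache.hadoop.io.Writable` so the sketch compiles without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class WritableSketch {

  // Local stand-in (assumption): same two methods as org.apache.hadoop.io.Writable,
  // declared here only to keep the sketch self-contained.
  interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
  }

  // Hypothetical result class, not part of HBase.
  static class RegionCount implements Writable {
    private String table;
    private long count;

    // The default constructor: needed so an empty instance can be created
    // before readFields() fills it in.
    public RegionCount() {
    }

    public RegionCount(String table, long count) {
      this.table = table;
      this.count = count;
    }

    @Override
    public void write(DataOutput out) throws IOException {
      out.writeUTF(table);
      out.writeLong(count);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
      // Fields must be read in the same order they were written.
      table = in.readUTF();
      count = in.readLong();
    }

    public String getTable() {
      return table;
    }

    public long getCount() {
      return count;
    }
  }

  // Hypothetical helper: serialize, create an empty instance through the
  // no-argument constructor, then populate it from the serialized bytes.
  static RegionCount roundTrip(RegionCount original) throws Exception {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    original.write(new DataOutputStream(bytes));

    // Throws NoSuchMethodException if the default constructor is missing.
    RegionCount copy = RegionCount.class.getDeclaredConstructor().newInstance();
    copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
    return copy;
  }

  public static void main(String[] args) throws Exception {
    RegionCount copy = roundTrip(new RegionCount("t1", 42L));
    System.out.println(copy.getTable() + " " + copy.getCount());
  }
}
```

If the no-argument constructor were removed from `RegionCount`, the reflective `newInstance()` call in this sketch would fail at runtime rather than at compile time, which is why the omission is easy to miss.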
+ +==== Garbage-Collection Conserving Guidelines + +The following guidelines were borrowed from http://engineering.linkedin.com/performance/linkedin-feed-faster-less-jvm-garbage. +Keep them in mind to keep preventable garbage collection to a minimum. Have a look +at the blog post for some great examples of how to refactor your code according to +these guidelines. + +- Be careful with Iterators +- Estimate the size of a collection when initializing +- Defer expression evaluation +- Compile the regex patterns in advance +- Cache it if you can +- String Interns are useful but dangerous [[design.invariants]] === Invariants We don't have many but what we have we list below. -All are subject to challenge of course but until then, please hold to the rules of the road. +All are subject to challenge of course but until then, please hold to the rules of the road. [[design.invariants.zk.data]] ==== No permanent state in ZooKeeper @@ -1591,14 +1605,14 @@ Follow progress on this issue at link:https://issues.apache.org/jira/browse/HBAS If you are developing Apache HBase, frequently it is useful to test your changes against a more-real cluster than what you find in unit tests. In this case, HBase can be run directly from the source in local-mode. -All you need to do is run: +All you need to do is run: [source,bourne] ---- ${HBASE_HOME}/bin/start-hbase.sh ---- -This will spin up a full local-cluster, just as if you had packaged up HBase and installed it on your machine. +This will spin up a full local-cluster, just as if you had packaged up HBase and installed it on your machine. Keep in mind that you will need to have installed HBase into your local maven repository for the in-situ cluster to work properly. That is, you will need to run: @@ -1619,21 +1633,21 @@ HBase exposes metrics using the Hadoop Metrics 2 system, so adding a new metric Unfortunately the API of metrics2 changed from hadoop 1 to hadoop 2. 
In order to get around this a set of interfaces and implementations have to be loaded at runtime. To get an in-depth look at the reasoning and structure of these classes you can read the blog post located link:https://blogs.apache.org/hbase/entry/migration_to_the_new_metrics[here]. -To add a metric to an existing MBean follow the short guide below: +To add a metric to an existing MBean follow the short guide below: ==== Add Metric name and Function to Hadoop Compat Interface. Inside of the source interface the corresponds to where the metrics are generated (eg MetricsMasterSource for things coming from HMaster) create new static strings for metric name and description. -Then add a new method that will be called to add new reading. +Then add a new method that will be called to add new reading. ==== Add the Implementation to Both Hadoop 1 and Hadoop 2 Compat modules. Inside of the implementation of the source (eg. MetricsMasterSourceImpl in the above example) create a new histogram, counter, gauge, or stat in the init method. -Then in the method that was added to the interface wire up the parameter passed in to the histogram. +Then in the method that was added to the interface wire up the parameter passed in to the histogram. Now add tests that make sure the data is correctly exported to the metrics 2 system. -For this the MetricsAssertHelper is provided. +For this the MetricsAssertHelper is provided. [[git.best.practices]] === Git Best Practices @@ -1674,7 +1688,7 @@ It provides a nice overview that applies equally to the Apache HBase Project. ==== Create Patch The script _dev-support/make_patch.sh_ has been provided to help you adhere to patch-creation guidelines. -The script has the following syntax: +The script has the following syntax: ---- $ make_patch.sh [-a] [-p ] @@ -1752,7 +1766,7 @@ Also, see <>. If you are creating a new unit test class, notice how other unit test classes have classification/sizing annotations at the top and a static method on the end. 
Be sure to include these in any new unit test files you generate. -See <> for more on how the annotations work. +See <> for more on how the annotations work. ==== Integration Tests @@ -1760,13 +1774,13 @@ Significant new features should provide an integration test in addition to unit ==== ReviewBoard -Patches larger than one screen, or patches that will be tricky to review, should go through link:http://reviews.apache.org[ReviewBoard]. +Patches larger than one screen, or patches that will be tricky to review, should go through link:http://reviews.apache.org[ReviewBoard]. .Procedure: Use ReviewBoard . Register for an account if you don't already have one. It does not use the credentials from link:http://issues.apache.org[issues.apache.org]. Log in. -. Click [label]#New Review Request#. +. Click [label]#New Review Request#. . Choose the `hbase-git` repository. Click Choose File to select the diff and optionally a parent diff. Click btn:[Create @@ -1782,39 +1796,39 @@ Patches larger than one screen, or patches that will be tricky to review, should . To cancel the request, click . For more information on how to use ReviewBoard, see link:http://www.reviewboard.org/docs/manual/1.5/[the ReviewBoard - documentation]. + documentation]. ==== Guide for HBase Committers ===== New committers -New committers are encouraged to first read Apache's generic committer documentation: +New committers are encouraged to first read Apache's generic committer documentation: -* link:http://www.apache.org/dev/new-committers-guide.html[Apache New Committer Guide] -* link:http://www.apache.org/dev/committers.html[Apache Committer FAQ] +* link:http://www.apache.org/dev/new-committers-guide.html[Apache New Committer Guide] +* link:http://www.apache.org/dev/committers.html[Apache Committer FAQ] ===== Review HBase committers should, as often as possible, attempt to review patches submitted by others. Ideally every submitted patch will get reviewed by a committer _within a few days_. 
-If a committer reviews a patch they have not authored, and believe it to be of sufficient quality, then they can commit the patch, otherwise the patch should be cancelled with a clear explanation for why it was rejected. +If a committer reviews a patch they have not authored, and believe it to be of sufficient quality, then they can commit the patch, otherwise the patch should be cancelled with a clear explanation for why it was rejected. The list of submitted patches is in the link:https://issues.apache.org/jira/secure/IssueNavigator.jspa?mode=hide&requestId=12312392[HBase Review Queue], which is ordered by time of last modification. -Committers should scan the list from top to bottom, looking for patches that they feel qualified to review and possibly commit. +Committers should scan the list from top to bottom, looking for patches that they feel qualified to review and possibly commit. For non-trivial changes, it is required to get another committer to review your own patches before commit. -Use the btn:[Submit Patch] button in JIRA, just like other contributors, and then wait for a `+1` response from another committer before committing. +Use the btn:[Submit Patch] button in JIRA, just like other contributors, and then wait for a `+1` response from another committer before committing. ===== Reject Patches which do not adhere to the guidelines in link:https://wiki.apache.org/hadoop/Hbase/HowToCommit/hadoop/Hbase/HowToContribute#[HowToContribute] and to the link:https://wiki.apache.org/hadoop/Hbase/HowToCommit/hadoop/CodeReviewChecklist#[code review checklist] should be rejected. Committers should always be polite to contributors and try to instruct and encourage them to contribute better patches. -If a committer wishes to improve an unacceptable patch, then it should first be rejected, and a new patch should be attached by the committer for review. 
+If a committer wishes to improve an unacceptable patch, then it should first be rejected, and a new patch should be attached by the committer for review. [[committing.patches]] ===== Commit -Committers commit patches to the Apache HBase GIT repository. +Committers commit patches to the Apache HBase GIT repository. .Before you commit!!!! [NOTE] @@ -1822,13 +1836,13 @@ Committers commit patches to the Apache HBase GIT repository. Make sure your local configuration is correct, especially your identity and email. Examine the output of the +$ git config --list+ command and be sure it is correct. -See this GitHub article, link:https://help.github.com/articles/set-up-git[Set Up Git] if you need pointers. +See this GitHub article, link:https://help.github.com/articles/set-up-git[Set Up Git] if you need pointers. ==== -When you commit a patch, please: +When you commit a patch, please: . Include the Jira issue id in the commit message, along with a short description of the change and the name of the contributor if it is not you. - Be sure to get the issue ID right, as this causes Jira to link to the change in Git (use the issue's "All" tab to see these). + Be sure to get the issue ID right, as this causes Jira to link to the change in Git (use the issue's "All" tab to see these). . Commit the patch to a new branch based off master or other intended branch. It's a good idea to call this branch by the JIRA ID. Then check out the relevant target branch where you want to commit, make sure your local branch has all remote changes, by doing a +git pull --rebase+ or another similar command, cherry-pick the change into each relevant branch (such as master), and do +git push @@ -1868,7 +1882,7 @@ The only command that actually writes anything to the remote repository is +git The extra +git pull+ commands are usually redundant, but better safe than sorry. 
-The first example shows how to apply a patch that was generated with +git format-patch+ and apply it to the `master` and `branch-1` branches. +The first example shows how to apply a patch that was generated with +git format-patch+ and apply it to the `master` and `branch-1` branches. The directive to use +git format-patch+ rather than +git diff+, and not to use `--no-prefix`, is a new one. See the second example for how to apply a patch created with +git @@ -1896,7 +1910,7 @@ This example shows how to commit a patch that was created using +git diff+ witho If the patch was created with `--no-prefix`, add `-p0` to the +git apply+ command. ---- -$ git apply ~/Downloads/HBASE-XXXX-v2.patch +$ git apply ~/Downloads/HBASE-XXXX-v2.patch $ git commit -m "HBASE-XXXX Really Good Code Fix (Joe Schmo)" -a # This extra step is needed for patches created with 'git diff' $ git checkout master $ git pull --rebase @@ -1915,7 +1929,7 @@ $ git branch -D HBASE-XXXX ==== . Resolve the issue as fixed, thanking the contributor. - Always set the "Fix Version" at this point, but please only set a single fix version for each branch where the change was committed, the earliest release in that branch in which the change will appear. + Always set the "Fix Version" at this point, but please only set a single fix version for each branch where the change was committed, the earliest release in that branch in which the change will appear. ====== Commit Message Format @@ -1941,24 +1955,24 @@ The resulting commit retains the original author. When the amending author is different from the original committer, add notice of this at the end of the commit message as: `Amending-Author: Author ` See discussion at link:http://search-hadoop.com/m/DHED4wHGYS[HBase, mail # dev - [DISCUSSION] Best practice when amending commits cherry picked - from master to branch]. + from master to branch]. 
[[committer.tests]] ====== Committers are responsible for making sure commits do not break thebuild or tests If a committer commits a patch, it is their responsibility to make sure it passes the test suite. It is helpful if contributors keep an eye out that their patch does not break the hbase build and/or tests, but ultimately, a contributor cannot be expected to be aware of all the particular vagaries and interconnections that occur in a project like HBase. -A committer should. +A committer should. [[git.patch.flow]] ====== Patching Etiquette In the thread link:http://search-hadoop.com/m/DHED4EiwOz[HBase, mail # dev - ANNOUNCEMENT: Git Migration In Progress (WAS => - Re: Git Migration)], it was agreed on the following patch flow + Re: Git Migration)], it was agreed on the following patch flow . Develop and commit the patch against master first. . Try to cherry-pick the patch when backporting if possible. -. If this does not work, manually commit the patch to the branch. +. If this does not work, manually commit the patch to the branch. ====== Merge Commits @@ -1971,11 +1985,11 @@ See <