Some OMME can not cause the JVM to exit, like "java.lang.OutOfMemoryError: Direct buffer memory", "java.lang.OutOfMemoryError: unable to create new native thread", as they dont call vmError#next_OnError_command. So abort HMaster when uncaught exception occurs in TimeoutExecutor, the new active Hmaster will resume the suspended procedure.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: stack <stack@apache.com>
Signed-off-by: Pankaj Kumar<pankajkumar@apache.org>
(cherry picked from commit 600be60a4b)
(cherry picked from commit ae77f81e7e)
Add being able to configure netty thread counts. Enable socket reuse
(should not have any impact).
hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/BlockingRpcConnection.java
Rename the threads we create in here so they are NOT named same was
threads created by Hadoop RPC.
hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/DefaultNettyEventLoopConfig.java
hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/NettyRpcClient.java
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AsyncFSWAL.java
Allow configuring eventloopgroup thread count (so can override for
tests)
hbase-examples/src/main/java/org/apache/hadoop/hbase/client/example/HttpProxyExample.java
Enable socket resuse.
hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/NettyRpcServer.java
Enable socket resuse and config for how many threads to use.
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
hbase-server/src/main/java/org/apache/hadoop/hbase/util/ModifyRegionUtils.java
Thread name edit; drop the redundant 'Thread' suffix.
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/HFileReplicator.java
Make closeable and shutdown executor when called.
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
Call close on HFileReplicator
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationBase.java
HDFS creates lots of threads. Use less of it so less threads overall.
hbase-server/src/test/resources/hbase-site.xml
hbase-server/src/test/resources/hdfs-site.xml
Constrain resources when running in test context.
hbase-server/src/test/resources/log4j.properties
Enable debug on netty to see netty configs in our log
pom.xml
Add system properties when we launch JVMs to constrain thread counts in
tests
Signed-off-by: Duo Zhang <zhangduo@apache.org>
A miscellaney. Add extra logging to help w/ debug to a bunch of tests.
Fix some issues particular where we ran into mismatched filesystem
complaint. Some modernizations, removal of unnecessary deletes
(especially after seeing tests fail in table delete), and cleanup.
Recategorized one tests because it starts four clusters in the one
JVM from medium to large. Finally, zk standalone server won't come
on occasion; added debug and thread dumping to help figure why (
manifests as test failing in startup saying master didn't launch).
hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/snapshot/TestExportSnapshot.java
Fixes occasional mismatched filesystems where the difference is file:// vs file:///
or we pick up hdfs schema when it a local fs test. Had to do this
vetting of how we do make qualified on a Path in a few places, not
just here as a few tests failed with this same issue. Code in here is
used by a lot of tests that each in turn suffered this mismatch.
Refactor for clarity
hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/snapshot/TestExportSnapshotV1NoCluster.java
Unused import.
hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/store/wal/TestWALProcedureStore.java
This test fails if tmp dir is not where it expects because tries to
make rootdir there. Give it a rootdir under test data dir.
hbase-server/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
This change is probably useless. I think the issue is actually
a problem addressed later where our test for zk server being
up gets stuck and never times out.
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestSplitOrMergeStatus.java
Move off deprecated APIs.
hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/BalancerTestBase.java
Log when we fail balance check for DEBUG Currently just says 'false'
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestSplitWALProcedure.java
NPEs on way out if setup failed.
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
Add logging when assert fails to help w/ DEBUG
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerAbortTimeout.java
Don't bother removing stuff on teardown. All gets thrown away anyways.
Saw a few hangs in here in the teardown where hdfs was down before
expected messing up shutdown.
hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java
Add timeout on socket; was seeing check for zk server getting stuck
and never timing out (test time out in startup)
hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/snapshot/TestExportSnapshotWithTemporaryDirectory.java
Write to test data dir instead.
Be careful about how we make qualified paths.
hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableInputFormatScanBase.java
Remove snowflake configs.
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationStatus.java
Add a hacky pause. Tried adding barriers but didn't work. Needs deep
dive.
hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java
Remove code copied from zk and use zk methods directly instead.
A general problem is that zk cluster doesn't come up occasionally but
no clue why. Add thread dumping and state check.
These classifications come of running at various fork counts.. A test
may complete quick if low fork count but if it is accessing disk, it
will run much slower if fork count is high. This edit accommodates
some of this phenomenon.
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Jan Hentschel <janh@apache.org>
HBASE-20981 - Rollback stateCount accounting thrown-off when exception out of rollbackState
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Sakthi <sakthi@apache.org>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
(cherry picked from commit 8e8b9b698f)
* Add a bit of javadoc around SerialReplicationChecker.
* Miniscule edit to the profiler jsp page and then a bit of doc on how to make it work that might help.
* Add some detail if NPE getting BitSetNode to help w/ debug.
* Change HbckChore to log region names instead of encoded names; helps doing diagnostics; can take region name and query in shell to find out all about the region according to hbase:meta.
* Add some fix-it help inline in the HBCK Report page – how to fix.
* Add counts in procedures page so can see if making progress; move listing of WALs to end of the page.
The original fix is provided by Sergey Shelukhin, the UT is added by Duo Zhang
Amending-Author: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>