On failed RPC we expire the server and suspend expecting the
resultant ServerCrashProcedure to wake us back up again. In tests,
TestRSGroup hung because it failed to schedule a server expiration
because the server was already expired undergoing processing (the
test was shutting down). Deal with this case by having expire
servers return false if unable to expire. Callers will then know
where a ServerCrashProcedure has been scheduled or not.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
Have expireServer return true if successful.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
The log that included an exception whose message was the current
procedure as a String totally baffled me. Make it more obvious what
exception is.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
If failed expire of a server, wake our procedure -- do not suspend --
and presume ok to move region to CLOSED state (because going down or
concurrent crashed server processing ongoing).
* rely on git plumbing commands when checking if we've built the site for a particular commit already
* switch to forcing '-e' for bash
* add command line switches for: path to hbase, working directory, and publishing
* only export JAVA/MAVEN HOME if they aren't already set.
* add some docs about assumptions
* Update javadoc plugin to consistently be version 3.0.0
* avoid duplicative site invocations on reactor modules
* update use of cp command so it works both on linux and mac
* manually skip enforcer plugin during build
* still doing install of all jars due to MJAVADOC-490, but then skip rebuilding during aggregate reports.
* avoid the pager on git-diff by teeing to a log file, which also helps later reviewing in the case of big changesets.
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Misty Stanley-Jones <misty@apache.org>
Conflicts:
hbase-backup/pom.xml
hbase-spark-it/pom.xml
Revert of the revert, i.e. reapply. Thought this had broken the build
but it was a bad nightly. Putting it back.
Revert "Revert "HBASE-20069 fix existing findbugs errors in hbase-server; ADDENDUM Address review""
This reverts commit 07eae00ec1.
Reapplication of a patch temporarily removed...
I thought this was causing issue but now I don't think it the culprit.
Revert "Revert "HBASE-20092 Fix TestRegionMetrics#testRegionMetrics""
This reverts commit 367d316781.
Allow OPEN as a possible state when update region transition state.
Usually state is OPENING but if crash before finish step is completed,
on replay, master may have read that the state is OPEN from meta table
and so will think it open... When we replay the procedure finish, allow
that the region is already OPEN.
Fix MoveRandomRegionOfTableAction. It depended on old AM behavior.
Make it do explicit move as is required in AMv3; w/o it, it was just
closing region causing test to fail.
Fix pom so hadoop3 profile specifies a different netty3 version.
Bunch of logging format change that came of trying trying to read
the spew from this test.
This reverts commit bc080e7500.
Conflicts:
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
patch reverted changes that happened in parallel without explanation. see jira.
- Added ADMIN permission check for following rpc calls:
normalize, setNormalizerRunning, runCatalogScan, enableCatalogJanitor, runCleanerChore,
setCleanerChoreRunning, execMasterService, execProcedure, execProcedureWithRet
- Moved authorizationEnabled check to start of AccessChecker's functions. Currently, and IDK why,
we call authManager.authorize() first and then discard its result if authorizationEnabled is false. Weird.
Negative Remaining KVs and progress percent greater than 100 is because CompactionProgress#totalCompactingKVs is sometimes less than CompactionProgress#currentCompactedKVs.
Changes add a getter to CompactionProgress#totalCompactingKVs and from inside getter warning is logged. currentCompactedKVs are return when totalCompactingKVs are less than current.
Signed-off-by: Michael Stack <stack@apache.org>
Revert "Revert "HBASE-2004 TestClientClusterStatus is flakey""
This is a revert of a revert, i.e. a reapplication, just so I can fix
the JIRA number.
This reverts commit 6796b8e21f.
LocalHBaseCluster forces random port assignment for sake of concurrent unit test
execution friendliness, but we still need a positive test for RPC and info port
assignment.
The 1.x functionality of Master DDL operations is that "post" observer hooks
are only invoked when the DDL action was successful. With the async-ness of
ProcV2, we find ourselves in a case where the post-hook may be invoked before
the Procedure runs and fails. We need to introduce some blocking to wait and
see if the Procedure is going to fail on a precondition before invoking the hook.
Signed-off-by: Michael Stack <stack@apache.org>
Remove assert in splittableregionprocedure. It was in the prepare.
Was causing fail in legit case where a region split follows a
table split BEFORE the parent has been GC'd. The region split
finds the parent in SPLIT state which is right. The assert was
having us fail. No need.
Also disabled TestHTrace since not supported in 2.0.0 and flakey.
Only call server.checkIfShouldMoveSystemRegionAsync if a node has been
added. Do not call it if only one regionserver in cluster. Make it
so ServerCrashProcedure runs before it. Add logging if
server.checkIfShouldMoveSystemRegionAsync was responsible for
MOVE (Previous was a mystery when it cut in).
Previous we'd call it when there was a nodeChildrenChanged. These
happen before nodeDeleted. If a server crashed,
checkIfShouldMoveSystemRegionAsync could run first, find the
server that had not yet registered as crashed, find system
tables on it and then try to move them. It would fail because
server would not respond to RPC. The region move would then
be waiting on the servercrashprocedure to wake it up when
done processing but this move had locked the region so
SCP couldn't run....
Serializing, if appropriate, write the hbase-1.x version of the
Comparator to the hfile trailer so hbase-1.x files can read hbase-2.x
hfiles (they are the same format).
This patch adds mirroring of table state out to zookeeper. HBase-1.x
clients look for table state in zookeeper, not in hbase:meta where
hbase-2.x maintains table state.
The patch also moves and refactors the 'migration' code that was put in
place by HBASE-13032.
D hbase-client/src/main/java/org/apache/hadoop/hbase/CoordinatedStateException.java
Unused.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Move table state migration code from Master startup out to
TableStateManager where it belongs. Also start
MirroringTableStateManager dependent on config.
A hbase-server/src/main/java/org/apache/hadoop/hbase/master/MirroringTableStateManager.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/TableStateManager.java
Move migration from zookeeper of table state in here. Also plumb in
mechanism so subclass can get a chance to look at table state as we do
the startup fixup full-table scan of meta.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
Bug-fix. Now we create regions in CLOSED state but we fail to check
table state; were presuming table always enabled. Meant on startup
there'd be an unassigned region that never got assigned.
A hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMirroringTableStateManager.java
Test migration and mirroring.
Value of 'htd' is null as it is initialized in the constructor but when the object is deserialized its null. Got rid of member variable htd and made it local to method.
* Added a comment in MergeTableRegionsProcedure and SplitTableRegionProcedure explaining specific rollbacks has side effect that AssignProcedure/s are submitted asynchronously and those procedures may continue to execute even after rollback() is done.
* Updated comments in tests with correct rollback state to abort
* Added overloaded method MasterProcedureTestingUtility#testRollbackAndDoubleExecution which takes additional argument for waiting on all procedures to finish before asserting conditions
* Updated TestMergeTableRegionsProcedure#testRollbackAndDoubleExecution and TestSplitTableRegionProcedure#testRollbackAndDoubleExecution to use newly added method
Signed-off-by: Michael Stack <stack@apache.org>