bypass on an assign/unassign leaves region in RIT and the
RegionStateNode loaded with the bypassed procedure. First
implementation had assign/unassign cleanup leftover state.
Second implementation, on feedback, keeps the state in place
as a fence against other Procedures assuming the region entity,
and instead adds an 'override' function that hbck2 can set on
assigns/unassigns to override the fencing.
Note that the below also converts ProcedureExceptions that
come out of the Pv2 system into DoNotRetryIOEs. It is a
little awkward because DNRIOE is in client-module, not
in procedure module. Previous, we'd just keep retrying
the bypass, etc.
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
Have bypass take an environment like all other methods so subclasses.
Fix javadoc issues.
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
Javadoc issues. Pass environment when we invoke bypass.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Rename waitUntilNamespace... etc. to align with how these method types
are named elsehwere .. i.e. waitFor rather than waitUntil..
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
Cleanup message we emit when we find an exisitng procedure working
against this entity.
Add support for a force function which allows Assigns/Unassigns force
ownership of the Region entity.
A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestRegionBypass.java
Test bypass and force.
M hbase-shell/src/main/ruby/shell/commands/list_procedures.rb
Minor cleanup of the json output... do iso8601 timestamps.
Add backoff when stuck in RegionTransitionProcedure, the subclass of
AssignProcedure and UnassignProcedure. Can happen when we go to
transition but the current Region state is not what we expect.
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
Add doc on being able to suspend and wait on a timeout.
M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
Add 'attempt' counter so we can do backoff when we get stuck.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
Add persistence of new 'attempt' counter
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
Doc data members that are persisted by subclasses given this is 'odd'.
Add a counter for 'attempts' used when 'stuck' to implement backoff.
Add suspend with timeout when 'stuck'. Add callback when timeout is
exhausted which does wakeup of this procedure.
A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestUnexpectedStateException.java
Test of backoff.
Remove commons-cli and commons-collections4 use. Account
for the newer internal protobuf version of 3.5.1.
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Mike Drob <mdrob@apache.org>
Adds a prepare step to RecoverMetaProcedure in which we test for
cluster up and master being up. If not up, we fail the run.
Modified hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileCleaner.java
Modified hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChunkCreator.java
Minor log cleanup.
Modified hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RecoverMetaProcedure.java
Add pepare step.
Modified hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerMetrics.java
Debug for the failing test....
Added hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestRecoverMetaProcedure.java
Test the prepare step goes down if master or cluster are down.
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/DoNotRetryRegionException.java
M hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/MergeRegionException.java
Allow passing cause to Constructor.
M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
Add prepare step to move procedure.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MergeTableRegionsProcedure.java
Add check that regions to merge are actually online to the Constructor
so we can fail fast if they are offline
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
Add prepare step. Check regions and context and skip move if not right.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java
Add check parent region is online to constructor.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AbstractStateMachineTableProcedure.java
Add generic check region is online utility function for use by subclasses.
M hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMove.java
Add test that we fail if we try to move an offlined region.
Allow that DisableTableProcedue can grab a region lock before
ServerCrashProcedure can. Cater to this cricumstance where SCP
was not unable to make progress by running the search for RIT
against the crashed server a second time, post creation of all
crashed-server assignemnts. The second run will uncover such as
the above DisableTableProcedure unassign and will interrupt its
suspend allowing both procedures to make progress.
M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
Add new procedure step post-assigns that reruns the RIT finder method.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
Make this important log more specific as to what is going on.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
Better explanation as to what is going on.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
Add extra step and run handleRIT a second time after we've queued up
all SCP assigns. Also fix a but. SCP was adding an assign of a RIT
that was actually trying to unassign (made the deadlock more likely).
* rely on git plumbing commands when checking if we've built the site for a particular commit already
* switch to forcing '-e' for bash
* add command line switches for: path to hbase, working directory, and publishing
* only export JAVA/MAVEN HOME if they aren't already set.
* add some docs about assumptions
* Update javadoc plugin to consistently be version 3.0.0
* avoid duplicative site invocations on reactor modules
* update use of cp command so it works both on linux and mac
* manually skip enforcer plugin during build
* still doing install of all jars due to MJAVADOC-490, but then skip rebuilding during aggregate reports.
* avoid the pager on git-diff by teeing to a log file, which also helps later reviewing in the case of big changesets.
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Misty Stanley-Jones <misty@apache.org>
Conflicts:
hbase-backup/pom.xml
hbase-spark-it/pom.xml
This patch adds mirroring of table state out to zookeeper. HBase-1.x
clients look for table state in zookeeper, not in hbase:meta where
hbase-2.x maintains table state.
The patch also moves and refactors the 'migration' code that was put in
place by HBASE-13032.
D hbase-client/src/main/java/org/apache/hadoop/hbase/CoordinatedStateException.java
Unused.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Move table state migration code from Master startup out to
TableStateManager where it belongs. Also start
MirroringTableStateManager dependent on config.
A hbase-server/src/main/java/org/apache/hadoop/hbase/master/MirroringTableStateManager.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/TableStateManager.java
Move migration from zookeeper of table state in here. Also plumb in
mechanism so subclass can get a chance to look at table state as we do
the startup fixup full-table scan of meta.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
Bug-fix. Now we create regions in CLOSED state but we fail to check
table state; were presuming table always enabled. Meant on startup
there'd be an unassigned region that never got assigned.
A hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMirroringTableStateManager.java
Test migration and mirroring.
Some manual cleanup of changing package names in pom files and getting
rid of the no-longer-needed netty system property.
This commit will break compilation, package renames in source code are
done in follow-on commits using straightforward find and replace.
's/org.apache.hadoop.hbase.shaded.com.google/org.apache.hbase.thirdparty.com.google/'
's/org.apache.hadoop.hbase.shaded.io.netty/org.apache.hbase.thirdparty.io.netty/'
Changes:
- replaced commons-logging to slf4j everywhere
- log.XXX(Throwable) calls were replaced with log.XXX(t.toString(), t)
- log.XXX(Object) calls were replaced with log.XXX(Objects.toString(obj))
- log.fatal() calls were replaced with log.error(HBaseMarkers.FATAL, ...)
- programmatic log4j configuration was removed from the unit test
This commit does not affect the current logging configurations, because log4j
is still on the classpath. slf4j-log4j12 binds log4j to slf4j.
Signed-off-by: Michael Stack <stack@apache.org>