137 Commits

Author SHA1 Message Date
tianjingyun 422e98957b
HBASE-21322 Add a scheduleServerCrashProcedure() API to HbckService
Signed-off-by: Michael Stack <stack@apache.org>
2018-10-29 20:56:44 -07:00
Michael Stack 8fc90a23ae
HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign
Adds override to assigns and unassigns. Renames the bypass 'force'
param to 'override' to align with them.

Adds recursive to 'bypass', a means of calling bypass on
parent and its subprocedures (usually bypass works on
leaf nodes rippling the bypass up to parent -- recursive
has us work in the opposite direction): EXPERIMENTAL.

bypass on an assign/unassign leaves region in RIT and the
RegionStateNode loaded with the bypassed procedure. First
implementation had assign/unassign cleanup leftover state.
Second implementation, on feedback, keeps the state in place
as a fence against other Procedures assuming the region entity,
and instead adds an 'override' function that hbck2 can set on
assigns/unassigns to override the fencing.

Note that the below also converts ProcedureExceptions that
come out of the Pv2 system into DoNotRetryIOEs. It is a
little awkward because DNRIOE is in client-module, not
in procedure module. Previously, we'd just keep retrying
the bypass, etc.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
 Have bypass take an environment like all other methods, for use by subclasses.
 Fix javadoc issues.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
 Javadoc issues. Pass environment when we invoke bypass.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 Rename waitUntilNamespace... etc. to align with how these method types
 are named elsewhere .. i.e. waitFor rather than waitUntil..

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 Cleanup message we emit when we find an existing procedure working
 against this entity.
 Add support for a force function which allows Assigns/Unassigns to force
 ownership of the Region entity.

A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestRegionBypass.java
 Test bypass and force.

M hbase-shell/src/main/ruby/shell/commands/list_procedures.rb
 Minor cleanup of the json output... do iso8601 timestamps.
2018-10-04 16:37:37 -07:00
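
The fencing and 'override' behaviour described in the commit above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the HBase implementation: the class `RegionFence`, its methods `tryAcquire` and `ownerOf`, and the use of plain region names and procedure ids are all hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; none of these names are HBase classes. */
public class RegionFence {
  // Region name -> id of the procedure that currently owns the region.
  private final Map<String, Long> owners = new HashMap<>();

  /**
   * Try to take ownership of a region on behalf of a procedure.
   * Without override, an existing owner (even a bypassed one) fences us out.
   * With override, the caller takes ownership regardless of the current owner.
   */
  public synchronized boolean tryAcquire(String region, long procId, boolean override) {
    Long current = owners.get(region);
    if (current != null && current != procId && !override) {
      return false; // fenced by the procedure already attached to the region
    }
    owners.put(region, procId);
    return true;
  }

  /** Returns the id of the owning procedure, or null if the region is unowned. */
  public synchronized Long ownerOf(String region) {
    return owners.get(region);
  }
}
```

In this sketch a second procedure is refused while the first still holds the region, and succeeds only when it passes `override = true`, which mirrors the idea of hbck2 setting override on an assign or unassign.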
Michael Stack 259d12f739 Revert "Revert "Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"""
This reverts commit 2174461cf7.

Revert because not ready to port to other branches.
2018-09-29 04:06:46 -07:00
Michael Stack 2174461cf7 Revert "Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign""
This reverts commit b96905d1df.

i.e. a revert of a revert so a reapplication!

Revert so I can add signed-off-by....

Signed-off-by: Allan Yang <allan163@apache.org>
2018-09-29 03:34:36 -07:00
Michael Stack b96905d1df Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"
This reverts commit b42d7978cb.
2018-09-29 03:34:10 -07:00
Michael Stack b42d7978cb HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign
bypass on an assign/unassign leaves region in RIT and the
RegionStateNode loaded with the bypassed procedure. First
implementation had assign/unassign cleanup leftover state.
Second implementation, on feedback, keeps the state in place
as a fence against other Procedures assuming the region entity,
and instead adds an 'override' function that hbck2 can set on
assigns/unassigns to override the fencing.

Note that the below also converts ProcedureExceptions that
come out of the Pv2 system into DoNotRetryIOEs. It is a
little awkward because DNRIOE is in client-module, not
in procedure module. Previously, we'd just keep retrying
the bypass, etc.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
 Have bypass take an environment like all other methods, for use by subclasses.
 Fix javadoc issues.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
 Javadoc issues. Pass environment when we invoke bypass.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 Rename waitUntilNamespace... etc. to align with how these method types
 are named elsewhere .. i.e. waitFor rather than waitUntil..

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 Cleanup message we emit when we find an existing procedure working
 against this entity.
 Add support for a force function which allows Assigns/Unassigns to force
 ownership of the Region entity.

A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestRegionBypass.java
 Test bypass and force.

M hbase-shell/src/main/ruby/shell/commands/list_procedures.rb
 Minor cleanup of the json output... do iso8601 timestamps.
2018-09-29 03:33:07 -07:00
tianjingyun c5af7b654b HBASE-21204 NPE when scan raw DELETE_FAMILY_VERSION and codec is not set
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-09-20 06:59:43 -07:00
Umesh Agashe e6c7ed34e0
HBASE-21023 Added bypassProcedure() API to HbckService 2018-09-19 15:01:29 -07:00
Michael Stack 37cc07a772
HBASE-21156 [hbck2] Queue an assign of hbase:meta and bulk assign/unassign
Adds 'raw' assigns and unassigns methods to Hbck Service.

Fixes HbckService so it works when cluster is Kerberized.
2018-09-19 09:02:43 -07:00
Umesh Agashe 589c1e4078
HBASE-20941 Created and implemented HbckService in master
Added API setTableStateInMeta() to update table state only in Meta. This will be used by hbck2 tool.
2018-09-12 21:31:13 -07:00
Allan Yang e33591515c
HBASE-21083 Introduce a mechanism to bypass the execution of a stuck procedure 2018-08-28 20:18:47 -07:00
zhangduo 833657c46d HBASE-20846 Restore procedure locks when master restarts 2018-07-25 14:37:36 +08:00
Allan Yang 44bf7076b7 HBASE-20878 Data loss if merging regions while ServerCrashProcedure executing 2018-07-24 09:51:46 +08:00
Allan Yang af2742fcf2 HBASE-20893 Data loss if splitting region while ServerCrashProcedure executing 2018-07-23 14:35:27 +08:00
zhangduo a86141b625 HBASE-20752 Make sure the regions are truly reopened after ReopenTableRegionsProcedure 2018-06-22 14:06:29 +08:00
zhangduo 3e33aecea2 HBASE-20708 Remove the usage of RecoverMetaProcedure in master startup 2018-06-19 15:09:11 +08:00
zhangduo b785896cbd HBASE-20659 Implement a reopen table regions procedure 2018-05-30 20:03:35 +08:00
meiyi f40c10a211 HBASE-20518 Need to serialize the enabled field for UpdatePeerConfigProcedure
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-05-25 14:45:49 +08:00
Michael Stack da3e06afab HBASE-20492 UnassignProcedure is stuck in retry loop on region stuck in OPENING state
Add backoff when stuck in RegionTransitionProcedure, the subclass of
AssignProcedure and UnassignProcedure. Can happen when we go to
transition but the current Region state is not what we expect.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
 Add doc on being able to suspend and wait on a timeout.

M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
 Add 'attempt' counter so we can do backoff when we get stuck.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
 Add persistence of new 'attempt' counter

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 Doc data members that are persisted by subclasses given this is 'odd'.
 Add a counter for 'attempts' used when 'stuck' to implement backoff.
 Add suspend with timeout when 'stuck'. Add callback when timeout is
 exhausted which does wakeup of this procedure.

A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestUnexpectedStateException.java
 Test of backoff.
2018-04-30 17:58:27 -07:00
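
The 'attempt' counter and backoff described in the commit above might look something like the sketch below. The commit message does not say which backoff curve is used, so exponential backoff with a cap is an assumption here, as are the class name `StuckBackoff`, the method `delayMillis`, and the base and maximum delays.

```java
/** Hypothetical illustration only; not the HBase implementation. */
public class StuckBackoff {
  private static final long BASE_MILLIS = 1000L;   // assumed base delay
  private static final long MAX_MILLIS = 60_000L;  // assumed upper bound

  /**
   * Delay to wait before retry number 'attempt' (0-based):
   * BASE_MILLIS * 2^attempt, capped at MAX_MILLIS.
   */
  public static long delayMillis(int attempt) {
    if (attempt < 0) {
      throw new IllegalArgumentException("attempt must be >= 0");
    }
    // Bound the shift so the left shift below cannot overflow a long.
    int shift = Math.min(attempt, 20);
    return Math.min(MAX_MILLIS, BASE_MILLIS << shift);
  }
}
```

A procedure that finds itself stuck would persist its attempt count, suspend for `delayMillis(attempt)`, and be woken by the timeout callback to try again with the count incremented.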
zhangduo 3eee7d37f4 HBASE-20367 Write a replication barrier for regions when disabling a table 2018-04-11 20:36:59 +08:00
zhangduo ead569c951 HBASE-20285 Delete all last pushed sequence ids when removing a peer or removing the serial flag for a peer 2018-04-09 15:18:44 +08:00
zhangduo 9369cf26eb HBASE-20147 Serial replication will be stuck if we create a table with serial replication but add it to a peer after there are region moves 2018-04-09 15:18:44 +08:00
zhangduo cea5199ea1 HBASE-20148 Make serial replication as a option for a peer instead of a table 2018-04-09 15:18:44 +08:00
Chia-Ping Tsai 6aba045aae HBASE-19504 Add TimeRange support into checkAndMutate
Signed-off-by: Michael Stack <stack@apache.org>
2018-03-24 00:05:22 +08:00
Michael Stack 2bc99e4b5e HBASE-20237 Put back getClosestRowBefore and throw UnsupportedOperation instead... for asynchbase client Throw exception if an old client connects. 2018-03-21 21:48:15 -07:00
Michael Stack fabb1d97cc HBASE-20169 NPE when calling HBTU.shutdownMiniCluster
Adds a prepare step to RecoverMetaProcedure in which we test for
cluster up and master being up. If not up, we fail the run.

Modified hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileCleaner.java
Modified hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChunkCreator.java
 Minor log cleanup.

Modified hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RecoverMetaProcedure.java
 Add prepare step.

Modified hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerMetrics.java
 Debug for the failing test....

Added hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestRecoverMetaProcedure.java
 Test the prepare step goes down if master or cluster are down.
2018-03-20 13:09:43 -07:00
Michael Stack 79d47dd57a HBASE-20202 [AMv2] Don't move region if its a split parent or offlined
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/DoNotRetryRegionException.java
M hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/MergeRegionException.java
 Allow passing cause to Constructor.

M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
 Add prepare step to move procedure.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MergeTableRegionsProcedure.java
 Add check that regions to merge are actually online to the Constructor
so we can fail fast if they are offline

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
 Add prepare step. Check regions and context and skip move if not right.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java
 Add check parent region is online to constructor.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AbstractStateMachineTableProcedure.java
 Add generic check region is online utility function for use by subclasses.

M hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMove.java
 Add test that we fail if we try to move an offlined region.
2018-03-16 09:34:15 -07:00
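
The prepare-step idea in the commit above (check that the region is online and skip the move otherwise) can be sketched as follows. Everything here is hypothetical and simplified: the `State` enum, `RegionNotOnlineException`, `checkOnline` and `prepareMove` are illustrative names, not the HBase classes or methods touched by the commit.

```java
/** Hypothetical illustration only; not the HBase implementation. */
public class RegionOnlineCheck {
  /** Simplified region states; the real set of states is richer. */
  public enum State { OPEN, OFFLINE, SPLIT }

  /** Stand-in for a do-not-retry exception whose constructor accepts a cause. */
  public static class RegionNotOnlineException extends Exception {
    public RegionNotOnlineException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** Generic check usable by several procedures: fail fast unless the region is OPEN. */
  public static void checkOnline(String region, State state) throws RegionNotOnlineException {
    if (state != State.OPEN) {
      throw new RegionNotOnlineException(region + " is not online (state=" + state + ")", null);
    }
  }

  /** Prepare step for a move: returns true to proceed, false to skip the move. */
  public static boolean prepareMove(String region, State state) {
    try {
      checkOnline(region, state);
      return true;
    } catch (RegionNotOnlineException e) {
      return false; // split parent or offlined region: do not move it
    }
  }
}
```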
Michael Stack 260ee0da60 HBASE-20173 [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
Allow that DisableTableProcedure can grab a region lock before
ServerCrashProcedure can. Cater to this circumstance where SCP
was unable to make progress by running the search for RIT
against the crashed server a second time, post creation of all
crashed-server assignments. The second run will uncover such as
the above DisableTableProcedure unassign and will interrupt its
suspend allowing both procedures to make progress.

M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
 Add new procedure step post-assigns that reruns the RIT finder method.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 Make this important log more specific as to what is going on.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
 Better explanation as to what is going on.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
 Add extra step and run handleRIT a second time after we've queued up
 all SCP assigns. Also fix a bug. SCP was adding an assign of a RIT
 that was actually trying to unassign (made the deadlock more likely).
2018-03-13 05:44:43 -07:00
zhangduo d1e775e35e HBASE-19936 Introduce a new base class for replication peer procedure 2018-03-09 20:55:48 +08:00
zhangduo 4c6942df58 HBASE-19635 Introduce a thread at RS side to call reportProcedureDone 2018-03-09 20:55:48 +08:00
Guanghao Zhang 59cad95b58 HBASE-19579 Add peer lock test for shell command list_locks
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-03-09 20:55:48 +08:00
zhangduo 5e410d8140 HBASE-19524 Master side changes for moving peer modification from zk watcher to procedure 2018-03-09 20:55:48 +08:00
zhangduo 95af14fea6 HBASE-19216 Implement a general framework to execute remote procedure on RS 2018-03-09 20:55:48 +08:00
zhangduo 8e8e50683d HBASE-20066 Region sequence id may go backward after split or merge 2018-02-27 15:37:32 +08:00
tedyu 8a22e4119f HBASE-20049 Region replicas of SPLIT and MERGED regions are kept in in-memory states until restarting master (Toshihiro Suzuki) 2018-02-22 20:11:11 -08:00
Reid Chan 4ef6319af0 HBASE-19950 Introduce a ColumnValueFilter
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2018-02-20 05:05:19 +08:00
Michael Stack c7473df2c3 HBASE-16060 1.x clients cannot access table state talking to 2.0 cluster
This patch adds mirroring of table state out to zookeeper. HBase-1.x
clients look for table state in zookeeper, not in hbase:meta where
hbase-2.x maintains table state.

The patch also moves and refactors the 'migration' code that was put in
place by HBASE-13032.

D hbase-client/src/main/java/org/apache/hadoop/hbase/CoordinatedStateException.java
 Unused.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 Move table state migration code from Master startup out to
TableStateManager where it belongs. Also start
MirroringTableStateManager dependent on config.

A hbase-server/src/main/java/org/apache/hadoop/hbase/master/MirroringTableStateManager.java

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/TableStateManager.java
 Move migration from zookeeper of table state in here. Also plumb in
mechanism so subclass can get a chance to look at table state as we do
the startup fixup full-table scan of meta.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 Bug-fix. Now we create regions in CLOSED state but we fail to check
table state; were presuming table always enabled. Meant on startup
there'd be an unassigned region that never got assigned.

A hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMirroringTableStateManager.java
 Test migration and mirroring.
2018-02-12 08:22:14 -08:00
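
The mirroring described in the commit above, where table state kept in the primary store is also written out to a second location for older clients, can be sketched as a subclass that overrides the write path. This is a minimal sketch with in-memory maps standing in for hbase:meta and zookeeper; the names `MirroringSketch`, `StateManager` and `MirroringStateManager` are hypothetical and do not reflect the real class signatures.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; maps stand in for hbase:meta and zookeeper. */
public class MirroringSketch {
  /** Base manager: keeps table state in the primary store only. */
  public static class StateManager {
    protected final Map<String, String> primary = new HashMap<>();

    public void setState(String table, String state) {
      primary.put(table, state);
    }

    public String getState(String table) {
      return primary.get(table);
    }
  }

  /** Subclass that additionally mirrors every write to a secondary store. */
  public static class MirroringStateManager extends StateManager {
    private final Map<String, String> mirror = new HashMap<>();

    @Override
    public void setState(String table, String state) {
      super.setState(table, state); // the primary store remains the source of truth
      mirror.put(table, state);     // copy for clients that only read the mirror
    }

    public String getMirroredState(String table) {
      return mirror.get(table);
    }
  }
}
```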
Jan Hentschel 9f610e5930 HBASE-19604 Fixed Checkstyle errors in hbase-protocol-shaded and enabled Checkstyle to fail on violations 2018-01-03 13:47:39 +03:00
Guanghao Zhang 60cd494d1c HBASE-19492 Add EXCLUDE_NAMESPACE and EXCLUDE_TABLECFS support to replication peer config 2017-12-19 16:57:38 +08:00
Guangxu Cheng 015b66103a HBASE-19000 Group multiple block cache clear requests per server
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-12-13 07:47:33 -08:00
Michael Stack 79ac70ac86 HBASE-19407 [branch-2] Remove backup/restore 2017-12-01 17:22:37 -08:00
Guanghao Zhang ca6e7e68f4 HBASE-16868 Add a replicate_all flag to avoid misuse the namespaces and table-cfs config of replication peer
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2017-11-23 15:08:03 +08:00
Guanghao Zhang c978f8ab23 HBASE-19293 Support add a disabled state replication peer directly 2017-11-21 15:37:33 +08:00
Yi Liang 07b0ac4161 HBASE-19127: Set State.SPLITTING, MERGING, MERGING_NEW, SPLITTING_NEW properly in RegionStatesNode
Signed-off-by: Michael Stack <stack@apache.org>
2017-11-09 11:34:53 -08:00
Zach York 77e7c5ff27 HBASE-18624 Added support for clearing BlockCache based on tablename
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2017-11-09 04:03:35 +08:00
Apekshit Sharma a6d8023ff5 HBASE-19128 Purge Distributed Log Replay from codebase, configurations, text; mark the feature as unsupported, broken. 2017-11-07 17:48:52 -08:00
QilinCao 1110910b3a HBASE-19103 Add BigDecimalComparator for filter
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-11-07 03:30:43 -08:00
Chia-Ping Tsai d592b29619 HBASE-19131 Add the ClusterStatus hook and cleanup other hooks which can be replaced by ClusterStatus hook 2017-11-05 09:56:04 +08:00
Chia-Ping Tsai 2a28ff840e HBASE-18754 Get rid of Writable from TimeRangeTracker 2017-10-24 15:14:14 +08:00
Mike Drob c0144e200d HBASE-18893 remove add/delete/modify column 2017-10-23 20:03:09 -05:00