Commit Graph

179 Commits

Author SHA1 Message Date
Duo Zhang 55fa8f4b33 HBASE-21463 The checkOnlineRegionsReport can accidentally complete a TRSP 2018-11-13 11:31:03 +08:00
Duo Zhang c8574ba3c5 HBASE-21420 Use procedure event to wake up the SyncReplicationReplayWALProcedures which wait for worker 2018-11-05 21:43:18 +08:00
tianjingyun 116eee6747
HBASE-21322 Add a scheduleServerCrashProcedure() API to HbckService
Signed-off-by: Michael Stack <stack@apache.org>
2018-10-29 20:55:37 -07:00
jingyuntian 5fbb227deb
HBASE-21269 Forward-port HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign 2018-10-18 06:22:52 -07:00
tianjingyun cd161d976e HBASE-21204 NPE when scan raw DELETE_FAMILY_VERSION and codec is not set
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-09-20 06:57:50 -07:00
Umesh Agashe dc767c06d2
HBASE-21023 Added bypassProcedure() API to HbckService 2018-09-19 15:18:16 -07:00
Michael Stack 3a0fcd56cf
HBASE-21156 [hbck2] Queue an assign of hbase:meta and bulk assign/unassign
Adds 'raw' assigns and unassigns methods to Hbck Service.

Fixes HbckService so it works when cluster is Kerberized.
2018-09-19 10:06:16 -07:00
Allan Yang 7c1fad4992
HBASE-21083 Introduce a mechanism to bypass the execution of a stuck procedure 2018-08-30 12:23:24 -07:00
Umesh Agashe 3813f0ac3d HBASE-20941 Created and implemented HbckService in master
Added API setTableStateInMeta() to update table state only in Meta. This will be used by hbck2 tool.
2018-08-27 12:11:52 -07:00
zhangduo bb3494134e HBASE-20881 Introduce a region transition procedure to handle all the state transition for a region 2018-08-21 06:12:09 +08:00
Mike Drob 4bcaf495c2
HBASE-20894 Use proto for BucketCache persistence 2018-08-01 16:54:25 -05:00
zhangduo f3f17fa111 HBASE-20846 Restore procedure locks when master restarts 2018-07-25 14:37:26 +08:00
Allan Yang b631727bdf HBASE-20878 Data loss if merging regions while ServerCrashProcedure executing 2018-07-24 10:00:28 +08:00
Allan Yang 4804483f7e HBASE-20893 Data loss if splitting region while ServerCrashProcedure executing 2018-07-23 14:48:43 +08:00
Mohit Goel 950d6e6fb0 HBASE-6028 Start/Stop compactions at region server level
Add switching on/off of compactions.

Adds a "compaction_switch" command to the hbase shell. Switching off
compactions will also interrupt any currently ongoing compactions. State set
from the shell is lost on restart. To persist the change across region servers,
modify hbase.regionserver.compaction.enabled in hbase-site.xml and restart.

Signed-off-by: Umesh Agashe <uagashe@cloudera.com>
Signed-off-by: Michael Stack <stack@apache.org>
2018-07-19 06:20:44 -07:00
Guanghao Zhang 44ca13fe07 HBASE-20569 NPE in RecoverStandbyProcedure.execute 2018-06-28 18:08:43 +08:00
zhangduo f67763ffa0 HBASE-20424 Allow writing WAL to local and remote cluster concurrently 2018-06-28 18:08:43 +08:00
zhangduo ae6c90b4ec HBASE-20426 Give up replicating anything in S state 2018-06-28 18:08:43 +08:00
Guanghao Zhang 183b8d0581 HBASE-19973 Implement a procedure to replay sync replication wal for standby cluster 2018-06-28 18:07:44 +08:00
zhangduo a41c549ca4 HBASE-19082 Reject read/write from client but accept write from replication in state S 2018-06-28 18:07:44 +08:00
zhangduo 39dd81a7c6 HBASE-19957 General framework to transit sync replication state 2018-06-28 18:07:44 +08:00
Guanghao Zhang 1481bd9481 HBASE-19864 Use protobuf instead of enum.ordinal to store SyncReplicationState
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-06-28 18:07:44 +08:00
Guanghao Zhang 2acebac00e HBASE-19781 Add a new cluster state flag for synchronous replication 2018-06-28 18:07:44 +08:00
Guanghao Zhang b4a1dbf768 HBASE-19078 Add a remote peer cluster wal directory config for synchronous replication
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-06-28 18:07:44 +08:00
tedyu 98245ca6e4 HBASE-20740 StochasticLoadBalancer should consider CoprocessorService request factor when computing cost (chenxu) 2018-06-22 00:26:14 -07:00
zhangduo 7b716c964b HBASE-20752 Make sure the regions are truly reopened after ReopenTableRegionsProcedure 2018-06-22 14:04:33 +08:00
zhangduo 6dbbd78aa0 HBASE-20708 Remove the usage of RecoverMetaProcedure in master startup 2018-06-19 15:02:10 +08:00
Allan Yang b336da925a HBASE-20727 Persist FlushedSequenceId to speed up WAL split after cluster restart 2018-06-19 09:45:47 +08:00
zhangduo 997747076d HBASE-20659 Implement a reopen table regions procedure 2018-05-30 20:03:25 +08:00
meiyi 36f3d9432a HBASE-20518 Need to serialize the enabled field for UpdatePeerConfigProcedure
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-05-25 14:36:16 +08:00
Michael Stack 5a071dbe2b HBASE-20492 UnassignProcedure is stuck in retry loop on region stuck in OPENING state
Add backoff when stuck in RegionTransitionProcedure, the superclass of
AssignProcedure and UnassignProcedure. This can happen when we go to
transition but the current Region state is not what we expect.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
 Add doc on being able to suspend and wait on a timeout.

M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
 Add 'attempt' counter so we can do backoff when we get stuck.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
 Add persistence of new 'attempt' counter

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 Doc data members that are persisted by subclasses given this is 'odd'.
 Add a counter for 'attempts' used when 'stuck' to implement backoff.
 Add suspend with timeout when 'stuck'. Add callback when timeout is
 exhausted which does wakeup of this procedure.

A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestUnexpectedStateException.java
 Test of backoff.
2018-04-30 20:40:22 -07:00
Mike Drob da7776d428 HBASE-20410 update protoc to 3.5.1-1 for rhel6 2018-04-13 13:09:20 -05:00
Mike Drob 70d23214fb HBASE-20356 Make skipping protoc possible 2018-04-12 13:31:54 -05:00
zhangduo 37e5b0b1b7 HBASE-20367 Write a replication barrier for regions when disabling a table 2018-04-11 20:36:51 +08:00
zhangduo 056c3395d9 HBASE-20285 Delete all last pushed sequence ids when removing a peer or removing the serial flag for a peer 2018-03-27 12:20:51 +08:00
Josh Elser 15c398f7d2 HBASE-20223 Update to hbase-thirdparty 2.1.0
Remove commons-cli and commons-collections4 use. Account
for the newer internal protobuf version of 3.5.1.

Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Mike Drob <mdrob@apache.org>
2018-03-26 22:05:19 -04:00
Chia-Ping Tsai ad47c2daf4 HBASE-19504 Add TimeRange support into checkAndMutate
Signed-off-by: Michael Stack <stack@apache.org>
2018-03-24 00:12:38 +08:00
zhangduo 64061f896f HBASE-20147 Serial replication will be stuck if we create a table with serial replication but add it to a peer after there are region moves 2018-03-23 14:31:20 +08:00
Chia-Ping Tsai a6eeb26cc0 HBASE-20212 Make all Public classes have InterfaceAudience category
Signed-off-by: tedyu <yuzhihong@gmail.com>
Signed-off-by: Michael Stack <stack@apache.org>
2018-03-22 18:10:23 +08:00
Michael Stack 9601ab2272 HBASE-20237 Put back getClosestRowBefore and throw UnsupportedOperation instead... for asynchbase client Throw exception if an old client connects. 2018-03-21 21:51:25 -07:00
Michael Stack acbdb86bb4 HBASE-20169 NPE when calling HBTU.shutdownMiniCluster
Adds a prepare step to RecoverMetaProcedure in which we test for
cluster up and master being up. If not up, we fail the run.

Modified hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileCleaner.java
Modified hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChunkCreator.java
 Minor log cleanup.

Modified hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RecoverMetaProcedure.java
 Add prepare step.

Modified hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerMetrics.java
 Debug for the failing test....

Added hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestRecoverMetaProcedure.java
 Test the prepare step goes down if master or cluster are down.
2018-03-20 13:11:57 -07:00
Michael Stack 13f3ba3cee HBASE-20202 [AMv2] Don't move region if it's a split parent or offlined
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/DoNotRetryRegionException.java
M hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/MergeRegionException.java
 Allow passing cause to Constructor.

M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
 Add prepare step to move procedure.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MergeTableRegionsProcedure.java
 Add check that regions to merge are actually online to the Constructor
so we can fail fast if they are offline

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
 Add prepare step. Check regions and context and skip move if not right.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java
 Add check parent region is online to constructor.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AbstractStateMachineTableProcedure.java
 Add generic check region is online utility function for use by subclasses.

M hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMove.java
 Add test that we fail if we try to move an offlined region.
2018-03-16 09:35:33 -07:00
Michael Stack 72c3d27bf6 HBASE-20173 [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
Allow that DisableTableProcedure can grab a region lock before
ServerCrashProcedure can. Cater to this circumstance, where SCP
was unable to make progress, by running the search for RIT
against the crashed server a second time, post creation of all
crashed-server assignments. The second run will uncover cases such as
the above DisableTableProcedure unassign and will interrupt its
suspend, allowing both procedures to make progress.

M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
 Add new procedure step post-assigns that reruns the RIT finder method.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 Make this important log more specific as to what is going on.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
 Better explanation as to what is going on.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
 Add extra step and run handleRIT a second time after we've queued up
 all SCP assigns. Also fix a bug. SCP was adding an assign of a RIT
 that was actually trying to unassign (made the deadlock more likely).
2018-03-13 06:04:36 -07:00
zhangduo dd6f4525e7 HBASE-20148 Make serial replication an option for a peer instead of a table 2018-03-10 09:04:44 +08:00
Josh Elser 4a4c012049 HBASE-18135 Implement mechanism for RegionServers to report file archival for space quotas
This de-couples the snapshot size calculation from the
SpaceQuotaObserverChore into another API which both the periodically
invoked Master chore and the Master service endpoint can invoke. This
allows for multiple sources of snapshot size to be reported (from the
multiple sources we have in HBase).

When a file is archived, snapshot sizes can be more quickly realized and
the Master can still perform periodical computations of the total
snapshot size to account for any delayed/missing/lost file archival RPCs.

Signed-off-by: Ted Yu <yuzhihong@gmail.com>
2018-03-05 17:32:42 -05:00
Sean Busbey 2a65066b35 HBASE-20070 refactor website generation
* rely on git plumbing commands when checking if we've built the site for a particular commit already
* switch to forcing '-e' for bash
* add command line switches for: path to hbase, working directory, and publishing
* only export JAVA/MAVEN HOME if they aren't already set.
* add some docs about assumptions
* Update javadoc plugin to consistently be version 3.0.0
* avoid duplicative site invocations on reactor modules
* update use of cp command so it works both on linux and mac
* manually skip enforcer plugin during build
* still doing install of all jars due to MJAVADOC-490, but then skip rebuilding during aggregate reports.
* avoid the pager on git-diff by teeing to a log file, which also helps later reviewing in the case of big changesets.

Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Misty Stanley-Jones <misty@apache.org>
2018-03-02 09:25:10 -06:00
zhangduo f06a89b531 HBASE-20066 Region sequence id may go backward after split or merge 2018-02-27 15:33:07 +08:00
Toshihiro Suzuki 1bc996aa50 HBASE-20049 Region replicas of SPLIT and MERGED regions are kept in in-memory states until restarting master
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-02-22 20:06:21 -08:00
Reid Chan a9a6eed372 HBASE-19950 Introduce a ColumnValueFilter
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2018-02-20 04:56:13 +08:00
Michael Stack 67b69fb2c7 HBASE-16060 1.x clients cannot access table state talking to 2.0 cluster
This patch adds mirroring of table state out to zookeeper. HBase-1.x
clients look for table state in zookeeper, not in hbase:meta where
hbase-2.x maintains table state.

The patch also moves and refactors the 'migration' code that was put in
place by HBASE-13032.

D hbase-client/src/main/java/org/apache/hadoop/hbase/CoordinatedStateException.java
 Unused.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 Move table state migration code from Master startup out to
TableStateManager where it belongs. Also start
MirroringTableStateManager dependent on config.

A hbase-server/src/main/java/org/apache/hadoop/hbase/master/MirroringTableStateManager.java

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/TableStateManager.java
 Move migration from zookeeper of table state in here. Also plumb in
mechanism so subclass can get a chance to look at table state as we do
the startup fixup full-table scan of meta.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 Bug-fix. Now we create regions in CLOSED state but we fail to check
table state; were presuming table always enabled. Meant on startup
there'd be an unassigned region that never got assigned.

A hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMirroringTableStateManager.java
 Test migration and mirroring.
2018-02-12 08:47:02 -08:00