Commit Graph

15638 Commits

Author SHA1 Message Date
Michael Stack 0cd23c3dda
Revert "HBASE-21323 Should not skip force updating for a sub procedure even if it has been finished"
This reverts commit fffd9b9b6d.

Revert until we figure out why behavior between 2.1 and 2.2 is different.
2018-10-18 20:04:24 -07:00
Michael Stack 8fd3fd0e9c
Revert "HBASE-21323 Should not skip force updating for a sub procedure even if"
This reverts commit 30727764a3.

Revert until we figure out why behavior between 2.1 and 2.2 is different.
2018-10-18 20:03:57 -07:00
Allan Yang 1afedc608e
HBASE-21292 IdLock.getLockEntry() may hang if interrupted 2018-10-18 14:41:16 -07:00
tianjingyun 915e87ecf7
HBASE-21291 Add a test for bypassing stuck state-machine procedures
Signed-off-by: Michael Stack <stack@apache.org>
2018-10-18 14:26:47 -07:00
Michael Stack 30727764a3
HBASE-21323 Should not skip force updating for a sub procedure even if
it has been finished; ADDENDUM

Fix broken unit test.
2018-10-18 13:48:02 -07:00
Allan Yang b3c3393c19 HBASE-21288 HostingServer in UnassignProcedure is not accurate
Signed-off-by: Allan Yang <allan163@apache.org>
2018-10-18 21:10:53 +08:00
zhangduo fffd9b9b6d HBASE-21323 Should not skip force updating for a sub procedure even if it has been finished 2018-10-18 14:44:31 +08:00
haxiaolin 34a88fca76 HBASE-21055 NullPointerException when balanceOverall() but server balance info is null
Signed-off-by: huzheng <openinx@gmail.com>
2018-10-18 14:08:04 +08:00
Sahil Aggarwal b972b9a2d9
HBASE-20716: Changes the bytes[] conversion done in Bytes and ByteBufferUtils. Instead of checking whether unsafe_aligned is available every time, choose the best converter at startup. 2018-10-17 21:03:39 -07:00
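
The idea described in HBASE-20716 (pick a converter once at startup rather than testing a capability flag on every call) can be sketched roughly as below. This is an illustration only, not the HBase code: the class `ConverterChoice`, the `Converter` interface, both implementations, and the `sketch.fastpath` system property are all hypothetical names, and a `ByteBuffer`-based reader stands in for the Unsafe-based path.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical sketch; none of these names come from HBase. */
final class ConverterChoice {
  /** Strategy for reading a big-endian int out of a byte array. */
  interface Converter {
    int toInt(byte[] bytes, int offset);
  }

  /** Portable fallback: assemble the int one byte at a time. */
  static final class PlainConverter implements Converter {
    @Override
    public int toInt(byte[] bytes, int offset) {
      return ((bytes[offset] & 0xff) << 24)
          | ((bytes[offset + 1] & 0xff) << 16)
          | ((bytes[offset + 2] & 0xff) << 8)
          | (bytes[offset + 3] & 0xff);
    }
  }

  /** Stand-in for a "fast" path; uses ByteBuffer here, not Unsafe. */
  static final class BufferConverter implements Converter {
    @Override
    public int toInt(byte[] bytes, int offset) {
      return ByteBuffer.wrap(bytes, offset, 4).order(ByteOrder.BIG_ENDIAN).getInt();
    }
  }

  /** Decided once, when the class is initialized; callers never re-check. */
  static final Converter BEST = chooseBest();

  static Converter chooseBest() {
    // Assumption: a system property models the "is the fast path available" probe.
    boolean fastPath = Boolean.parseBoolean(System.getProperty("sketch.fastpath", "true"));
    return fastPath ? new BufferConverter() : new PlainConverter();
  }

  /** Hot-path entry point: a plain delegation with no per-call capability check. */
  static int toInt(byte[] bytes, int offset) {
    return BEST.toInt(bytes, offset);
  }
}
```

The design point is that the capability probe runs once in a static initializer, so the per-call path is a single virtual dispatch.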
Artem Ervits c744dd84cc HBASE-21198 Exclude dependency on net.minidev:json-smart
Signed-off-by: Michael Stack <stack@apache.org>
2018-10-17 11:34:12 -07:00
Duo Zhang 46227c2275
HBASE-21310 & HBASE-21311 Addendum fix failed UTs, some UTs are not present on branch-2.1 and some are a bit different in the implementation 2018-10-17 10:53:13 -07:00
Michael Stack 47364d4db6
HBASE-21327 Fix minor logging issue where we don't report servername if no associated SCP
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2018-10-17 09:34:58 -07:00
Michael Stack 999a3c67d4
HBASE-21320 [canary] Cleanup of usage and add commentary
Signed-off-by: Peter Somogyi <psomogyi@cloudera.com>
2018-10-16 22:12:13 -07:00
zhangduo b0846fb762 HBASE-21311 Split TestRestoreSnapshotFromClient 2018-10-17 11:19:10 +08:00
subrat.mishra dd836aae12
HBASE-21263 Mention compression algorithm along with other storefile details
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Amending-Author: Andrew Purtell <apurtell@apache.org>
2018-10-16 12:47:18 -07:00
Duo Zhang 85c3ec3fb4 HBASE-21315 The getActiveMinProcId and getActiveMaxProcId of BitSetNode are incorrect if there are no active procedures 2018-10-16 15:42:10 +08:00
zhangduo cfe875d3d2 HBASE-21310 Split TestCloneSnapshotFromClient 2018-10-16 15:34:50 +08:00
Andrew Purtell 467323396a
HBASE-21266 Not running balancer because processing dead regionservers, but empty dead rs list 2018-10-15 22:27:52 -07:00
Guanghao Zhang a81d9be876 HBASE-21290 No need to instantiate BlockCache for a master which does not carry tables 2018-10-15 17:30:29 +08:00
haxiaolin 31dec21538 HBASE-21260 The whole balancer plan might be aborted if there is more than one plan to move the same region
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2018-10-15 15:54:34 +08:00
zhangduo 3d0b253248 HBASE-21309 Increase the waiting timeout for TestProcedurePriority 2018-10-15 15:27:11 +08:00
Michael Stack ac31ebf53a
HBASE-21271 [amv2] Don't throw UnsupportedOperationException when rollback called on Assign/Unassign; spiral of death 2018-10-12 22:25:15 -07:00
Michael Stack 72af27b8c9
HBASE-21259 [amv2] Revived deadservers; recreated serverstatenode
Remove a bunch of places where we create ServerStateNode. We were
creating a SSN even though the server was long dead and processed.
The revived SSN was messing up the little dance we do unassigning
procedures. In particular, in UnassignProcedure, the check for a
dead server inside isLogSplittingDone (which returns true, meaning
we can proceed because the server is dead) fails if an SSN exists.

We were creating SSN when we didn't need it as well as inadvertently.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
 Print serverstatenode when reporting expiration. Helps debugging.
 Make moveFromOnlineToDeadServers return if server online or not.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 Make do w/ serverName in place of serverNode in a few places.
 In waitServerReportEvent, create a ServerStateNode if none though we
 should not have to at this point; to figure out later: TODO.
 addRegionToServer no longer automatically creates an SSN, so
 create one explicitly when processing the meta load if the region
 is OPEN, so we can associate OPEN regions with the SSN.
 Do not schedule an SCP if server is not online, not in fs, and not in
 dead servers. No point (and there may be cases where server is long
 gone but hbase:meta still refers to it though it has not carried
 regions in a long time; running an assign/unassign against such a
 server will fail because it is not there but SCP won't clean up
 the outstanding hung RPC because our region is not on the long-gone
 server).

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStates.java
 Just cleanup. Make it so addRegionToServer and remove can deal if no SSN.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterWalManager.java
 Add isWALDirectoryNameWithWALS utility.
2018-10-12 17:40:11 -07:00
Michael Stack 4d50f6db5a
HBASE-21242 [amv2] Miscellaneous minor log and assign procedure create improvements; ADDENDUM Fix TestHRegionInfo AND TestRegionInfoDisplay 2018-10-12 16:22:10 -07:00
Michael Stack 266544370d
Revert "HBASE-21242 [amv2] Miscellaneous minor log and assign procedure create improvements; ADDENDUM Fix TestHRegionInfo AND TestRegionInfoDisplay"
This reverts commit 7f3ca4643d.

Bad commit.
2018-10-12 16:15:57 -07:00
Michael Stack 7f3ca4643d
HBASE-21242 [amv2] Miscellaneous minor log and assign procedure create improvements; ADDENDUM Fix TestHRegionInfo AND TestRegionInfoDisplay 2018-10-12 16:09:54 -07:00
Michael Stack 5762f879d2
Revert "HBASE-21242 [amv2] Miscellaneous minor log and assign procedure create improvements; ADDENDUM Fix TestHRegionInfo"
This reverts commit a9d3ac23d84dcd728ee08f4262e3d9b31df26b7e.

Let me do a better fix, one that fixes both TestHRegionInfo and
TestRegionInfoDisplay.
2018-10-12 16:09:53 -07:00
Michael Stack 19cb105a7e HBASE-21242 [amv2] Miscellaneous minor log and assign procedure create improvements; ADDENDUM Fix TestHRegionInfo 2018-10-12 12:42:05 -07:00
Michael Stack fecaf4737c HBASE-21303 [shell] clear_deadservers with no args fails 2018-10-12 11:19:24 -07:00
Michael Stack 714127b4a5 HBASE-21299 List counts of actual region states in master UI tables section 2018-10-12 10:59:44 -07:00
Guanghao Zhang 9b38da685c HBASE-21289 Remove the log "'hbase.regionserver.maxlogs' was deprecated." in AbstractFSWAL 2018-10-12 21:22:31 +08:00
Duo Zhang c3401d4327 HBASE-21254 Need to find a way to limit the number of proc wal files 2018-10-12 11:47:48 +08:00
Sean Busbey c9c7436482 HBASE-21103 nightly job should make sure cached yetus will run.
Signed-off-by: Mike Drob <mdrob@apache.org>
(cherry picked from commit 42d7ddc678)
2018-10-11 10:30:33 -05:00
Mike Drob e726a89f5f HBASE-21287 Allow configuring test master initialization wait time. 2018-10-11 09:50:57 -05:00
Guanghao Zhang e283963533 HBASE-21251 Refactor RegionMover 2018-10-10 15:28:33 +08:00
Michael Stack 976c7ea2ef Revert "HBASE-21271 [amv2] Don't throw UnsupportedOperationException when rollback called on Assign/Unassign; spiral of death"
This reverts commit c96ecbde67.
2018-10-09 22:46:26 -07:00
Michael Stack b51aae9432 HBASE-21280 Add anchors for each heading in UI
Signed-off-by: Ted Yu <tedyu@apache.org>
2018-10-09 22:44:57 -07:00
Mike Drob 2c43656a16
HBASE-20764 build broken when latest commit is gpg signed
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-10-08 19:51:37 -05:00
Michael Stack c96ecbde67
HBASE-21271 [amv2] Don't throw UnsupportedOperationException when rollback called on Assign/Unassign; spiral of death 2018-10-09 00:55:02 +09:00
Duo Zhang 9a3b7f16f9 HBASE-21250 Addendum remove unused modification in hbase-server module 2018-10-08 14:56:30 +08:00
zhangduo 5a300f3fc9 HBASE-21250 Refactor WALProcedureStore and add more comments for better understanding the implementation 2018-10-07 17:16:09 +08:00
Michael Stack 9d34b4581c
HBASE-21242 [amv2] Miscellaneous minor log and assign procedure create improvements
For RIT Duration, do better than print ms/seconds. Remove redundant UI
column dedicated to duration when we log it in the status field too.

Make bypass log at INFO level.

Make it so on complete of subprocedure, we note count of outstanding
siblings so we have a clue how much further the parent has to go before
it is done (Helpful when hundreds of servers doing SCP).

Have the SCP run the AP preflight check before creating an AP; saves
creation of thousands of APs during fixup.

Don't log tablename three times when reporting remote call failed.

If lock is held already, note who has it. Also log after we get lock
or if we have to wait rather than log on entrance though we may
later have to wait (or we may have just picked up the lock).

Signed-off-by: Mike Drob <mdrob@apache.org>
2018-10-04 17:18:13 -07:00
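
As an illustration of the "do better than print ms/seconds" point in HBASE-21242, a duration formatter could look something like the sketch below. It is not the code from the commit; the class `DurationSketch`, the method `humanReadable`, and the output format are all assumptions made for the example.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; the name and the output format are assumptions. */
final class DurationSketch {
  /** Renders a millisecond duration as e.g. "1hrs, 2mins, 3.004sec". */
  static String humanReadable(long millis) {
    if (millis < 0) {
      throw new IllegalArgumentException("negative duration: " + millis);
    }
    long hours = TimeUnit.MILLISECONDS.toHours(millis);
    long minutes = TimeUnit.MILLISECONDS.toMinutes(millis) % 60;
    long seconds = TimeUnit.MILLISECONDS.toSeconds(millis) % 60;
    long ms = millis % 1000;

    StringBuilder sb = new StringBuilder();
    if (hours > 0) {
      sb.append(hours).append("hrs, ");
    }
    // Show minutes whenever hours are shown, so the units never skip a level.
    if (hours > 0 || minutes > 0) {
      sb.append(minutes).append("mins, ");
    }
    sb.append(seconds).append('.')
        .append(String.format(Locale.ROOT, "%03d", ms))
        .append("sec");
    return sb.toString();
  }
}
```

With this sketch, a region in transition for 3,723,004 ms would be reported as `1hrs, 2mins, 3.004sec` rather than as a raw millisecond count.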
Michael Stack 8fc90a23ae
HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign
Adds override to assigns and unassigns. Renames the bypass 'force'
param to 'override' to align.

Adds recursive to 'bypass', a means of calling bypass on
parent and its subprocedures (usually bypass works on
leaf nodes rippling the bypass up to parent -- recursive
has us work in the opposite direction): EXPERIMENTAL.

bypass on an assign/unassign leaves region in RIT and the
RegionStateNode loaded with the bypassed procedure. First
implementation had assign/unassign cleanup leftover state.
Second implementation, on feedback, keeps the state in place
as a fence against other Procedures assuming the region entity,
and instead adds an 'override' function that hbck2 can set on
assigns/unassigns to override the fencing.

Note that the below also converts ProcedureExceptions that
come out of the Pv2 system into DoNotRetryIOEs. It is a
little awkward because DNRIOE is in client-module, not
in procedure module. Previously, we'd just keep retrying
the bypass, etc.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
 Have bypass take an environment like all other methods so subclasses.
 Fix javadoc issues.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
 Javadoc issues. Pass environment when we invoke bypass.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 Rename waitUntilNamespace... etc. to align with how these method types
 are named elsewhere, i.e. waitFor rather than waitUntil.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 Clean up the message we emit when we find an existing procedure working
 against this entity.
 Add support for a force function which allows Assigns/Unassigns to force
 ownership of the Region entity.

A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestRegionBypass.java
 Test bypass and force.

M hbase-shell/src/main/ruby/shell/commands/list_procedures.rb
 Minor cleanup of the json output... do iso8601 timestamps.
2018-10-04 16:37:37 -07:00
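
The fencing behavior described in HBASE-21213 (a bypassed procedure stays attached to the region as a fence, and only a caller that sets 'override' may take over) can be pictured with the minimal sketch below. It is not the HBase implementation: `EntityFenceSketch`, `tryOwn`, `owner`, and `NO_OWNER` are hypothetical names, and a single `AtomicLong` stands in for the state kept in RegionStateNode.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of an ownership fence with an override escape hatch. */
final class EntityFenceSketch {
  /** Sentinel meaning no procedure currently owns the entity. */
  static final long NO_OWNER = -1L;

  private final AtomicLong ownerProcId = new AtomicLong(NO_OWNER);

  /**
   * Attempts to take ownership of the entity for the given procedure.
   * Without override, this succeeds only if the entity is unowned or is
   * already owned by the same procedure. With override, the previous owner
   * (for example a bypassed procedure left in place) is replaced.
   */
  boolean tryOwn(long procId, boolean override) {
    if (override) {
      ownerProcId.set(procId);
      return true;
    }
    return ownerProcId.compareAndSet(NO_OWNER, procId) || ownerProcId.get() == procId;
  }

  /** Returns the owning procedure id, or NO_OWNER if there is none. */
  long owner() {
    return ownerProcId.get();
  }
}
```

In this sketch, a second procedure calling `tryOwn(2, false)` while procedure 1 holds the fence is refused, whereas `tryOwn(2, true)` replaces the owner, which mirrors the role the commit message gives to the hbck2 'override' option.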
Wellington Chevreuil b0ac1c6aba HBASE-21185 - WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debuggability purposes
Signed-off-by: Allan Yang <allan163@apache.org>
2018-10-04 03:28:21 -07:00
Xu Cang 3df8b6f7bb
HBASE-18549 Add metrics for failed replication queue recovery
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-10-01 18:39:07 -07:00
Andrew Purtell f9d7ac2d5e
HBASE-21261 Add log4j.properties for hbase-rsgroup tests 2018-10-01 18:09:00 -07:00
Xu Cang 76a487c062
HBASE-19275 TestSnapshotFileCache never worked properly
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-10-01 17:12:21 -07:00
Michael Stack 259d12f739 Revert "Revert "Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"""
This reverts commit 2174461cf7.

Revert because not ready to port to other branches.
2018-09-29 04:06:46 -07:00
Michael Stack 2174461cf7 Revert "Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign""
This reverts commit b96905d1df.

i.e. a revert of a revert so a reapplication!

Revert so I can add signed-off-by....

Signed-off-by: Allan Yang <allan163@apache.org>
2018-09-29 03:34:36 -07:00
Michael Stack b96905d1df Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"
This reverts commit b42d7978cb.
2018-09-29 03:34:10 -07:00