Commit Graph

7218 Commits

Author SHA1 Message Date
Allan Yang 9e52e9eb7d HBASE-21395 Abort split/merge procedure if there is a table procedure of the same table going on 2018-11-05 20:12:00 +08:00
Michael Stack 8df5878932 HBASE-21425 2.1.1 fails to start over 1.x data; namespace not assigned 2018-11-03 09:45:36 -07:00
zhangduo 46eb8f1d0d HBASE-21351 The force update thread may have race with PE worker when the procedure is rolling back 2018-11-03 08:25:43 +08:00
jingyuntian 3836967e05
HBASE-21407 Resolve NPE in backup Master UI
Signed-off-by: Michael Stack <stack@apache.org>
2018-11-02 11:46:15 -07:00
Guanghao Zhang 29e3eec703 HBASE-21388 No need to instantiate MemStoreLAB for master which not carry table 2018-11-01 16:27:23 +08:00
Michael Stack 29d6eeb6e8
HBASE-21322 Add a scheduleServerCrashProcedure() API to HbckService
ADDENDUM
2018-10-31 10:15:04 -07:00
Duo Zhang bddd488c34 HBASE-21237 Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS 2018-10-30 17:36:44 +08:00
tianjingyun 422e98957b
HBASE-21322 Add a scheduleServerCrashProcedure() API to HbckService
Signed-off-by: Michael Stack <stack@apache.org>
2018-10-29 20:56:44 -07:00
zhangduo 2466032fdd HBASE-21375 Revisit the lock and queue implementation in MasterProcedureScheduler 2018-10-29 20:18:10 +08:00
Michael Stack 066082dff4
HBASE-21397 Set version to 2.1.1 on branch-2.1 in prep for first RC 2018-10-26 12:56:24 -07:00
Duo Zhang 24f5f7afa8 HBASE-21391 RefreshPeerProcedure should also wait master initialized before executing 2018-10-26 21:45:08 +08:00
Michael Stack 940326d8f5
Revert "HBASE-21376 Add some verbose log to MasterProcedureScheduler"
This reverts commit 71224ee530.
2018-10-26 06:10:47 -07:00
Allan Yang 71224ee530
HBASE-21376 Add some verbose log to MasterProcedureScheduler 2018-10-26 05:54:50 -07:00
Mike Drob 127de9e637
HBASE-21380 Filter finished SCP at start 2018-10-25 20:28:51 -07:00
Michael Stack 7de5f1d60d
Revert "HBASE-21380 Completed SCPs shouldn't add to dead servers in processing"
This reverts commit 1add6e9ca4.
2018-10-25 20:26:43 -07:00
Mike Drob 1add6e9ca4
HBASE-21380 Completed SCPs shouldn't add to dead servers in processing
Signed-off-by: Michael Stack <stack@apache.org>
2018-10-25 19:18:15 -07:00
Guanghao Zhang 7c3033d704 HBASE-21385 HTable.delete request use rpc call directly instead of AsyncProcess
Signed-off-by: Michael Stack <stack@apache.org>
2018-10-25 16:21:58 -07:00
Michael Stack 2e9381a839 HBASE-21372 Set hbase.assignment.maximum.attempts to Long.MAX
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Allan Yang <allan163@apache.org>
2018-10-24 09:07:01 -07:00
Allan Yang 6c9e3d0670 HBASE-21364 Procedure holds the lock should put to front of the queue after restart 2018-10-24 10:52:52 +08:00
mazhenlin d35f65f396 HBASE-21342 FileSystem in use may get closed by other bulk load call in secure bulkLoad
Signed-off-by: Mike Drob <mdrob@apache.org>
Signed-off-by: Ted Yu <tyu@apache.org>
2018-10-23 16:46:28 -05:00
xcang ae13a5c6ea
HBASE-21349 Do not run CatalogJanitor or Normalizer when cluster is shutting down
Signed-off-by: Michael Stack <stack@apache.org>
2018-10-23 14:44:22 -07:00
xcang 3979aebebf
HBASE-21338 Warn if balancer is an ill-fit for cluster size
Signed-off-by: Michael Stack <stack@apache.org>
2018-10-23 13:22:41 -07:00
zhangduo 7c04a95f4a
HBASE-21321 Backport HBASE-21278 to branch-2.1 and branch-2.0 ("Do not rollback successful sub procedures when rolling back a procedure")
Signed-off-by: Michael Stack <stack@apache.org>
2018-10-22 21:10:11 -07:00
huzheng 2173770305 HBASE-21356 bulkLoadHFile API should ensure that rs has the source hfile's write permission 2018-10-23 10:22:30 +08:00
Mike Drob 65d698439f HBASE-21073 Redo concept of maintenance mode
Instead of being an ephemeral state set by hbck, maintenance mode is now
an explicit toggle set by either configuration property or environment
variable. In maintenance mode, master will host system tables and not
assign any user-space tables to RSs. This gives operators the ability to
effect repairs to meta table with fewer moving parts.
2018-10-22 20:12:38 -05:00
tedyu b232746d4c HBASE-21281 Update bouncycastle dependency - addendum adds dependency for hbase-server module 2018-10-22 17:12:11 -04:00
Josh Elser fedaedd6a1 HBASE-21281 Upgrade bouncycastle to latest
BC 1.47 introduced some incompatible API changes which came in via
a new Maven artifact. We don't use any changed API in HBase. This
also removes some unnecessary dependencies on bcprov in other
modules (presumably, they are vestiges)

Signed-off-by: Mike Drob <mdrob@apache.org>
Signed-off-by: Ted Yu <tedyu@apache.org>
2018-10-22 17:12:11 -04:00
zhangduo afa7d6ed43 HBASE-21336 Addendum remove unused code in HBTU 2018-10-22 20:27:24 +08:00
huzheng fc1ef790ac HBASE-21355 (addendum) replace the expensive reload storefiles with reading the merge result of compacted storefiles and current storefiles 2018-10-22 19:31:02 +08:00
zhangduo 4ded75357b HBASE-21336 Simplify the implementation of WALProcedureMap 2018-10-22 18:36:39 +08:00
zhangduo 6e5d1a4896 HBASE-21334 TestMergeTableRegionsProcedure is flakey 2018-10-22 14:19:08 +08:00
huzheng 492172505a HBASE-21355 HStore's storeSize is calculated repeatedly which causing the confusing region split 2018-10-22 10:12:52 +08:00
Michael Stack b3a11b78f7
HBASE-21348 Fix failing TestRegionBypass, broke by HBASE-21291 2018-10-19 21:27:54 -07:00
Michael Stack 4ad63d77be
HBASE-21345 [hbck2] Allow version check to proceed even though master is 'initializing'.
Just remove the check state from the getClusterStatus call.
2018-10-19 17:40:03 -07:00
Toshihiro Suzuki a08c2c269d HBASE-21200 Memstore flush doesn't finish because of seekToPreviousRow() in memstore scanner. 2018-10-20 08:36:41 +09:00
Allan Yang b3c3393c19 HBASE-21288 HostingServer in UnassignProcedure is not accurate
Signed-off-by: Allan Yang <allan163@apache.org>
2018-10-18 21:10:53 +08:00
haxiaolin 34a88fca76 HBASE-21055 NullPointerException when balanceOverall() but server balance info is null
Signed-off-by: huzheng <openinx@gmail.com>
2018-10-18 14:08:04 +08:00
Duo Zhang 46227c2275
HBASE-21310 & HBASE-21311 Addendum fix failed UTs, some UTs are not present on branch-2.1 and some are a bit different in the implementation 2018-10-17 10:53:13 -07:00
Michael Stack 47364d4db6
HBASE-21327 Fix minor logging issue where we don't report servername if no associated SCP
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2018-10-17 09:34:58 -07:00
Michael Stack 999a3c67d4
HBASE-21320 [canary] Cleanup of usage and add commentary
Signed-off-by: Peter Somogyi <psomogyi@cloudera.com>
2018-10-16 22:12:13 -07:00
zhangduo b0846fb762 HBASE-21311 Split TestRestoreSnapshotFromClient 2018-10-17 11:19:10 +08:00
subrat.mishra dd836aae12
HBASE-21263 Mention compression algorithm along with other storefile details
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Amending-Author: Andrew Purtell <apurtell@apache.org>
2018-10-16 12:47:18 -07:00
zhangduo cfe875d3d2 HBASE-21310 Split TestCloneSnapshotFromClient 2018-10-16 15:34:50 +08:00
Andrew Purtell 467323396a
HBASE-21266 Not running balancer because processing dead regionservers, but empty dead rs list 2018-10-15 22:27:52 -07:00
Guanghao Zhang a81d9be876 HBASE-21290 No need to instantiate BlockCache for master which not carry table 2018-10-15 17:30:29 +08:00
haxiaolin 31dec21538 HBASE-21260 The whole balancer plans might be aborted if there are more than one plans to move a same region
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2018-10-15 15:54:34 +08:00
zhangduo 3d0b253248 HBASE-21309 Increase the waiting timeout for TestProcedurePriority 2018-10-15 15:27:11 +08:00
Michael Stack ac31ebf53a
HBASE-21271 [amv2] Don't throw UnsupportedOperationException when rollback called on Assign/Unassign; spiral of death 2018-10-12 22:25:15 -07:00
Michael Stack 72af27b8c9
HBASE-21259 [amv2] Revived deadservers; recreated serverstatenode
Remove a bunch of places where we create ServerStateNode. We were
creating a SSN even though the server was long dead and processed.
The revived SSN was messing up the little dance we do unassigning
procedures. In particular, in UnassignProcedure, the check for a
dead server inside in isLogSplittingDone returns true -- we can
proceed because server is dead -- fails if an SSN exists.

We were creating SSN when we didn't need it as well as inadvertently.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
 Print serverstatenode when reporting expiration. Helps debugging.
 Make moveFromOnlineToDeadServers return if server online or not.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 Make do w/ serverName in place of serverNode in a few places.
 In waitServerReportEvent, create a ServerStateNode if none though we
 should not have to at this point; to figure out later: TODO.
 addRegionToServer no longer automatically calls create SSN
 so do explicit create processing load meta and the region
 is OPEN so we can associate OPEN regions with the SSN.
 Do not schedule an SCP if server is not online, not in fs, and not in
 dead servers. No point (and there may be cases where server is long
 gone but hbase:meta still refers to it though it has not carried
 regions in a long time; running an assign/unassign against such a
 server will fail because it is not there but SCP won't clean up
 the outstanding hung RPC because our region is not on the long-gone
 server).

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStates.java
 Just cleanup. Make it so addRegionToServer and remove can deal if no SSN.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterWalManager.java
 Add isWALDirectoryNameWithWALS utility.
2018-10-12 17:40:11 -07:00
Michael Stack 4d50f6db5a
HBASE-21242 [amv2] Miscellaneous minor log and assign procedure create improvements; ADDENDUM Fix TestHRegionInfo AND TestRegionInfoDisplay 2018-10-12 16:22:10 -07:00
Michael Stack 266544370d
Revert "HBASE-21242 [amv2] Miscellaneous minor log and assign procedure create improvements; ADDENDUM Fix TestHRegionInfo AND TestRegionInfoDisplay"
This reverts commit 7f3ca4643d.

Bad commit.
2018-10-12 16:15:57 -07:00
Michael Stack 7f3ca4643d
HBASE-21242 [amv2] Miscellaneous minor log and assign procedure create improvements; ADDENDUM Fix TestHRegionInfo AND TestRegionInfoDisplay 2018-10-12 16:09:54 -07:00
Michael Stack 5762f879d2
Revert "HBASE-21242 [amv2] Miscellaneous minor log and assign procedure create improvements; ADDENDUM Fix TestHRegionInfo"
This reverts commit a9d3ac23d84dcd728ee08f4262e3d9b31df26b7e.

Let me do a better fix, one that does TestHRegionInfo and
TestHRegionInfoDisplay
2018-10-12 16:09:53 -07:00
Michael Stack 19cb105a7e HBASE-21242 [amv2] Miscellaneous minor log and assign procedure create improvements; ADDENDUM Fix TestHRegionInfo 2018-10-12 12:42:05 -07:00
Michael Stack 714127b4a5 HBASE-21299 List counts of actual region states in master UI tables section 2018-10-12 10:59:44 -07:00
Guanghao Zhang 9b38da685c HBASE-21289 Remove the log "'hbase.regionserver.maxlogs' was deprecated." in AbstractFSWAL 2018-10-12 21:22:31 +08:00
Duo Zhang c3401d4327 HBASE-21254 Need to find a way to limit the number of proc wal files 2018-10-12 11:47:48 +08:00
Mike Drob e726a89f5f HBASE-21287 Allow configuring test master initialization wait time. 2018-10-11 09:50:57 -05:00
Guanghao Zhang e283963533 HBASE-21251 Refactor RegionMover 2018-10-10 15:28:33 +08:00
Michael Stack 976c7ea2ef Revert "HBASE-21271 [amv2] Don't throw UnsupportedOperationException when rollback called on Assign/Unassign; spiral of death"
This reverts commit c96ecbde67.
2018-10-09 22:46:26 -07:00
Michael Stack b51aae9432 HBASE-21280 Add anchors for each heading in UI
Signed-off-by: Ted Yu <tedyu@apache.org>
2018-10-09 22:44:57 -07:00
Michael Stack c96ecbde67
HBASE-21271 [amv2] Don't throw UnsupportedOperationException when rollback called on Assign/Unassign; spiral of death 2018-10-09 00:55:02 +09:00
Duo Zhang 9a3b7f16f9 HBASE-21250 Addendum remove unused modification in hbase-server module 2018-10-08 14:56:30 +08:00
zhangduo 5a300f3fc9 HBASE-21250 Refactor WALProcedureStore and add more comments for better understanding the implementation 2018-10-07 17:16:09 +08:00
Michael Stack 9d34b4581c
HBASE-21242 [amv2] Miscellaneous minor log and assign procedure create improvements
For RIT Duration, do better than print ms/seconds. Remove redundant UI
column dedicated to duration when we log it in the status field too.

Make bypass log at INFO level.

Make it so on complete of subprocedure, we note count of outstanding
siblings so we have a clue how much further the parent has to go before
it is done (Helpful when hundreds of servers doing SCP).

Have the SCP run the AP preflight check before creating an AP; saves
creation of thousands of APs during fixup.

Don't log tablename three times when reporting remote call failed.

If lock is held already, note who has it. Also log after we get lock
or if we have to wait rather than log on entrance though we may
later have to wait (or we may have just picked up the lock).

Signed-off-by: Mike Drob <mdrob@apache.org>
2018-10-04 17:18:13 -07:00
Michael Stack 8fc90a23ae
HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign
Adds override to assigns and unassigns. Changes bypass 'force'
to align calling the param 'override' instead.

Adds recursive to 'bypass', a means of calling bypass on
parent and its subprocedures (usually bypass works on
leaf nodes rippling the bypass up to parent -- recursive
has us work in the opposite direction): EXPERIMENTAL.

bypass on an assign/unassign leaves region in RIT and the
RegionStateNode loaded with the bypassed procedure. First
implementation had assign/unassign cleanup leftover state.
Second implementation, on feedback, keeps the state in place
as a fence against other Procedures assuming the region entity,
and instead adds an 'override' function that hbck2 can set on
assigns/unassigns to override the fencing.

Note that the below also converts ProcedureExceptions that
come out of the Pv2 system into DoNotRetryIOEs. It is a
little awkward because DNRIOE is in client-module, not
in procedure module. Previous, we'd just keep retrying
the bypass, etc.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
 Have bypass take an environment like all other methods so subclasses.
 Fix javadoc issues.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
 Javadoc issues. Pass environment when we invoke bypass.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 Rename waitUntilNamespace... etc. to align with how these method types
 are named elsewhere, i.e. waitFor rather than waitUntil.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 Cleanup message we emit when we find an existing procedure working
 against this entity.
 Add support for a force function which allows Assigns/Unassigns force
 ownership of the Region entity.

A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestRegionBypass.java
 Test bypass and force.

M hbase-shell/src/main/ruby/shell/commands/list_procedures.rb
 Minor cleanup of the json output... do iso8601 timestamps.
2018-10-04 16:37:37 -07:00
Wellington Chevreuil b0ac1c6aba HBASE-21185 - WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes
Signed-off-by: Allan Yang <allan163@apache.org>
2018-10-04 03:28:21 -07:00
Xu Cang 3df8b6f7bb
HBASE-18549 Add metrics for failed replication queue recovery
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-10-01 18:39:07 -07:00
Xu Cang 76a487c062
HBASE-19275 TestSnapshotFileCache never worked properly
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-10-01 17:12:21 -07:00
Michael Stack 259d12f739 Revert "Revert "Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"""
This reverts commit 2174461cf7.

Revert because not ready to port to other branches.
2018-09-29 04:06:46 -07:00
Michael Stack 2174461cf7 Revert "Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign""
This reverts commit b96905d1df.

i.e. a revert of a revert so a reapplication!

Revert so I can add signed-off-by....

Signed-off-by: Allan Yang <allan163@apache.org>
2018-09-29 03:34:36 -07:00
Michael Stack b96905d1df Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"
This reverts commit b42d7978cb.
2018-09-29 03:34:10 -07:00
Michael Stack b42d7978cb HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign
bypass on an assign/unassign leaves region in RIT and the
RegionStateNode loaded with the bypassed procedure. First
implementation had assign/unassign cleanup leftover state.
Second implementation, on feedback, keeps the state in place
as a fence against other Procedures assuming the region entity,
and instead adds an 'override' function that hbck2 can set on
assigns/unassigns to override the fencing.

Note that the below also converts ProcedureExceptions that
come out of the Pv2 system into DoNotRetryIOEs. It is a
little awkward because DNRIOE is in client-module, not
in procedure module. Previously, we'd just keep retrying
the bypass, etc.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
 Have bypass take an environment like all other methods so subclasses.
 Fix javadoc issues.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
 Javadoc issues. Pass environment when we invoke bypass.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 Rename waitUntilNamespace... etc. to align with how these method types
 are named elsewhere, i.e. waitFor rather than waitUntil.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 Cleanup message we emit when we find an existing procedure working
 against this entity.
 Add support for a force function which allows Assigns/Unassigns force
 ownership of the Region entity.

A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestRegionBypass.java
 Test bypass and force.

M hbase-shell/src/main/ruby/shell/commands/list_procedures.rb
 Minor cleanup of the json output... do iso8601 timestamps.
2018-09-29 03:33:07 -07:00
zhangduo 1f90d00614 HBASE-21248 Implement exponential backoff when retrying for ModifyPeerProcedure 2018-09-29 13:26:28 +08:00
Nihal Jain c41003f5e6
HBASE-21196 HTableMultiplexer clears the meta cache after every put operation
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-09-28 16:35:57 -07:00
Kiran Kumar Maturi b7c2b953bc
HBASE-20857 balancer status tag in jmx metrics
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-09-28 16:12:11 -07:00
Archana Katiyar 209d0a8a16
HBASE-21207 Add client side sorting functionality in master web UI for table and region server details
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-09-28 15:40:43 -07:00
ramie-raufdeen e44ed1b1ef
HBASE-19418 configurable range of delay in PeriodicMemstoreFlusher
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-09-28 14:39:52 -07:00
xcang e26a6e0e10
HBASE-18451 PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request, fix logging
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-09-28 11:50:24 -07:00
Allan Yang f6c05faccf Revert "HBASE-21237 Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS" 2018-09-28 14:07:40 +08:00
Allan Yang 0290f57c3a HBASE-21237 Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS 2018-09-28 09:41:31 +08:00
Allan Yang eb27251265 HBASE-21228 Memory leak since AbstractFSWAL caches Thread object and never clean later 2018-09-27 15:07:07 +08:00
Michael Stack 5169cfc8c3 HBASE-21232 Show table state in Tables view on Master home page 2018-09-26 10:57:23 -07:00
Zach York 504286d55c HBASE-20734 Colocate recovered edits directory with hbase.wal.dir
Amending-Author: Reid Chan <reidchan@apache.org>
Signed-off-by: Reid Chan <reidchan@apache.org>
2018-09-26 19:37:53 +08:00
Allan Yang ba8a252167 HBASE-21212 Wrong flush time when update flush metric 2018-09-26 19:11:23 +08:00
Mingliang Liu fea75742b4
HBASE-21164 reportForDuty should do backoff rather than retry
Remove unused methods from Sleeper (it's OK, it's @Private).
Remove notion of startTime from Sleeper handling (it is unused).
Allow passing in how long to sleep so can maintain externally.
In HRS, use a RetryCounter to calculate backoff sleep time for when
reportForDuty is failing against a struggling Master.
2018-09-25 11:31:39 -07:00
Andrew Purtell 101205345b
Amend HBASE-20704 Sometimes some compacted storefiles are not archived on region close
Forward port small logging improvements from branch-1 version of this change.
2018-09-21 16:12:51 -07:00
Michael Stack a22aec1dad
HBASE-21214 [hbck2] setTableState just sets hbase:meta state, not in-memory state 2018-09-21 16:03:58 -07:00
openinx 5a73a1ab25 HBASE-21206 Scan with batch size may return incomplete cells 2018-09-20 22:20:02 +08:00
tianjingyun c5af7b654b HBASE-21204 NPE when scan raw DELETE_FAMILY_VERSION and codec is not set
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-09-20 06:59:43 -07:00
Umesh Agashe e6c7ed34e0
HBASE-21023 Added bypassProcedure() API to HbckService 2018-09-19 15:01:29 -07:00
Michael Stack 37cc07a772
HBASE-21156 [hbck2] Queue an assign of hbase:meta and bulk assign/unassign
Adds 'raw' assigns and unassigns methods to Hbck Service.

Fixes HbckService so it works when cluster is Kerberized.
2018-09-19 09:02:43 -07:00
Vasudevan 27b772ddc6 HBASE-21102 ServerCrashProcedure should select target server where no
other replicas exist for the current region (Ram)
2018-09-17 22:36:50 +05:30
Michael Stack 39e0b8515f HBASE-21191 Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared).
Add a check for hbase:meta being online before we go to read it.
If not online, move into a holding-pattern until rectified, probably
by external operator.

Incorporates bulk of patch made by Allan Yang over on HBASE-21035.

M hbase-common/src/main/java/org/apache/hadoop/hbase/util/RetryCounterFactory.java

 Add a Constructor for case where retries are for ever.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 Move stuff around so that the first hbase:meta read is the AM#loadMeta.
 Previously, checking table state and/or favored nodes could end up
 trying to read a meta that was not onlined holding up master startup.
 Do similar for the namespace table. Adds new methods isMeta and
 isNamespace which check that the regions/tables are online.. if not,
 we wait logging with a back-off that assigns need to be run.

Signed-off-by: Allan Yang <allan163@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2018-09-16 21:12:59 -07:00
Francis Liu a925a4ce16 HBASE-20704 Sometimes some compacted storefiles are not archived on region close 2018-09-16 18:38:03 -07:00
Ted Yu 842e0c974d HBASE-21097 Flush pressure assertion may fail in testFlushThroughputTuning
Amending-Author: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2018-09-15 18:39:42 +08:00
Umesh Agashe 589c1e4078
HBASE-20941 Created and implemented HbckService in master
Added API setTableStateInMeta() to update table state only in Meta. This will be used by hbck2 tool.
2018-09-12 21:31:13 -07:00
Mike Drob d81e806718 HBASE-21168 Insecure Randomness in BloomFilterUtil
Flagged by Fortify static analysis

Signed-off-by: Andrew Purtell <apurtell@apache.org>
Signed-off-by: Mingliang Liu <liuml07@apache.org>
2018-09-12 09:52:41 -05:00
Duo Zhang 2da6dbe563 HBASE-21172 Reimplement the retry backoff logic for ReopenTableRegionsProcedure 2018-09-12 16:01:55 +08:00
David Manning 75a7643b11 Backport "HBASE-21126 Add ability for HBase Canary to ignore a configurable number of ZooKeeper down nodes" to branch-2.1
Signed-off-by: Josh Elser <elserj@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2018-09-12 10:01:28 +08:00
krish.dey 63ef89bff7 HBASE-21125 Backport 'HBASE-20942 Improve RpcServer TRACE logging' to branch-2.1
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2018-09-12 09:59:28 +08:00
Duo Zhang b9d74f89ff Revert "HBASE-20942 Fix ArrayIndexOutOfBoundsException for RpcServer TRACE logging"
This reverts commit 69756da503.
2018-09-12 09:55:46 +08:00
krish.dey 69756da503 HBASE-20942 Fix ArrayIndexOutOfBoundsException for RpcServer TRACE logging
Also makes the trace log message length configurable.

Signed-off-by: Josh Elser <elserj@apache.org>
2018-09-12 09:44:22 +08:00
Guangxu Cheng 1c8c7e10f8 HBASE-21158 Empty qualifier cell is always returned when using QualifierFilter 2018-09-10 21:40:57 +08:00
Duo Zhang 6ab9997d1f HBASE-21144 AssignmentManager.waitForAssignment is not stable 2018-09-10 17:28:57 +08:00
Guangxu Cheng 12ffa086c7 HBASE-21001 ReplicationObserver fails to load in HBase 2.0.0 2018-09-07 23:43:10 +08:00
Michael Stack 5324911cd8
HBASE-21155 Save on a few log strings and some churn in wal splitter by skipping out early if no logs in dir; ADDENDUM
Address review comments.

Signed-off-by: Mike Drob <mdrob@apache.org>
2018-09-06 17:24:03 -07:00
Michael Stack 205783419c
HBASE-21155 Save on a few log strings and some churn in wal splitter by skipping out early if no logs in dir 2018-09-06 16:36:59 -07:00
Vasudevan 2051b0982d HBASE-20741 Split of a region with replicas creates all daughter regions
and its replica in same server (Ram)
2018-09-06 16:44:59 +05:30
Guangxu Cheng c64814ec96 HBASE-20892 [UI] Start / End keys are empty on table.jsp 2018-09-05 09:37:25 +08:00
Allan Yang e33591515c
HBASE-21083 Introduce a mechanism to bypass the execution of a stuck procedure 2018-08-28 20:18:47 -07:00
Michael Stack 4340930c71
HBASE-20649 Validate HFiles do not have PREFIX_TREE DataBlockEncoding; ADDENDUM ADD MISSING FILE 2018-08-28 07:45:27 -07:00
Balazs Meszaros 147694bb08
HBASE-20649 Validate HFiles do not have PREFIX_TREE DataBlockEncoding 2018-08-28 07:09:47 -07:00
Ted Yu c1cd6d5a89
HBASE-21088 HStoreFile should be closed in HStore#hasReferences 2018-08-27 20:31:50 -07:00
Michael Stack e826e3f2b8 HBASE-21120 MoveRegionProcedure makes no progress; goes to STUCK 2018-08-27 14:55:52 -07:00
zhangduo 625be5137e HBASE-21072 Addendum do not write lock file when running TestHBaseFsckReplication 2018-08-27 21:05:16 +08:00
Allan Yang 33fa32d711 HBASE-21113 Apply the branch-2 version of HBASE-21095, The timeout retry logic for several procedures are broken after master restarts(addendum) 2018-08-26 22:15:49 +08:00
Michael Stack d954031d50 HBASE-21078 [amv2] CODE-BUG NPE in RTP doing Unassign 2018-08-24 13:22:16 -07:00
Michael Stack e26ca63f88 Revert "Revert "HBASE-21095 The timeout retry logic for several procedures are broken after master restarts""
HBASE-21113 Apply the branch-2 version of HBASE-21095, The timeout retry
logic for several procedures are broken after master restarts

I applied the patch HBASE-21095 and then reverted it so could apply the
patch as HBASE-21113 (by reverting the HBASE-21095 revert but pushing
with this message!).

This reverts commit 4978db8102.
2018-08-24 12:35:29 -07:00
Michael Stack 4978db8102 Revert "HBASE-21095 The timeout retry logic for several procedures are broken after master restarts"
This reverts commit b82cd670c3.
2018-08-24 12:24:32 -07:00
Allan Yang b82cd670c3 HBASE-21095 The timeout retry logic for several procedures are broken after master restarts 2018-08-24 12:20:43 -07:00
Michael Stack 66add55234 HBASE-21072 Block out HBCK1 in hbase2
Write the hbase-1.x hbck1 lock file to block out hbck1 instances writing
state to an hbase-2.x cluster (could do damage).
Set hbase.write.hbck1.lock.file to false to disable this writing.
2018-08-24 09:22:53 -07:00
Duo Zhang 8a9acd4d2a HBASE-21101 Remove the waitUntilAllRegionsAssigned call after split in TestTruncateTableProcedure 2018-08-24 10:35:10 +08:00
zhangduo bf21a9dc33 HBASE-20193 Move TestCreateTableProcedure.testMRegions to a separated file 2018-08-24 10:09:31 +08:00
Duo Zhang 239d12dae8 HBASE-20194 Remove the explicit timeout config for TestTruncateTableProcedure 2018-08-23 06:27:41 +08:00
Allan Yang 16ab716134 HBASE-21041 Memstore's heap size will be decreased to minus zero after flush 2018-08-22 22:54:14 +08:00
Allan Yang c07afa8875 HBASE-21031 Memory leak if replay edits failed during region opening 2018-08-22 22:13:26 +08:00
Andrey Elenskiy 5f03be4675 HBASE-21032 ScanResponses contain only one cell each
Amending-Author: Duo Zhang <zhangduo@apache.org>
2018-08-21 13:31:18 -07:00
Andrew Purtell 798cb1d793 HBASE-20940 HStore.cansplit should not allow split to happen if it has references (Vishal Khandelwal) 2018-08-17 15:02:26 -07:00
Josh Elser 67ad0e6013 HBASE-21062 Correctly use the defaultProvider value on the Providers enum when constructing a WALProvider 2018-08-17 14:55:42 -04:00
Sakthi 48dee7e44d HBASE-20705 Having RPC quota on a table now no longer prevents Space Quota from being recreated/removed
Just added 2 test cases as the subtasks of this jira solves the issue

Signed-off-by: Josh Elser <elserj@apache.org>
2018-08-17 14:09:26 -04:00
Andrew Purtell b49941012a HBASE-21047 Object creation of StoreFileScanner thru constructor and close may leave refCount to -1 (Vishal Khandelwal) 2018-08-16 11:42:54 -07:00
Nihal Jain 145c92f3d6 HBASE-20469 Directory used for sidelining old recovered edits files should be made configurable
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-08-15 18:08:15 -07:00
Michael Stack 2e5efa690a HBASE-20772 Controlled shutdown fills Master log with the disturbing message 'No matching procedure found for rit=OPEN, location=ZZZZ, table=YYYYY, region=XXXX transition to CLOSED'
Look for the particular case where RS does the close of region w/o
involving Master and log special message in this case. Dodgy. But
until we have Master run shutdown of all regions, better than
the message we currently show.
2018-08-13 15:59:39 -07:00
Allan Yang 161c018927 HBASE-21029 Miscount of memstore's heap/offheap size if same cell was put 2018-08-13 20:30:23 +08:00
jingyuntian 95e3dec510 HBASE-20985 add two attributes when we do normalization
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2018-08-13 16:55:19 +08:00
Duo Zhang 846078f9b0 HBASE-21025 Addendum missed a 'succ = true'
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2018-08-13 10:55:18 +08:00
brandboat 873d9f5082 HBASE-21012 Revert the change of serializing TimeRangeTracker
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: zhangduo <zhangduo@apache.org>
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2018-08-11 22:28:49 +08:00
Wei-Chiu Chuang 5e12d6a98e HBASE-21018 RS crashed because AsyncFS was unable to update HDFS data encryption key 2018-08-10 19:53:22 -07:00
zhangduo ee164fcbc5 HBASE-21025 Add cache for TableStateManager 2018-08-10 21:11:53 +08:00
brandboat 8a9ba0c65b HBASE-18201 add UT and docs for DataBlockEncodingTool
Signed-off-by: Reid Chan <reidchan@apache.org>
2018-08-10 11:19:36 +08:00
meiyi e222686294 HBASE-20965 Separate region server report requests to new handlers
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2018-08-09 18:27:38 +08:00
Sakthi b2fc0f48f6 HBASE-20813 Removed RPC quotas when the associated table/Namespace is dropped off
Signed-off-by: Josh Elser <elserj@apache.org>
2018-08-08 13:46:25 -04:00
jingyuntian 9d594ac86a HBASE-20986 Separate the config of block size when we do log splitting and write Hlog
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2018-08-07 14:03:03 +08:00
Sakthi 7e9f8c60e2 HBASE-20885 Removed entry for RPC quota from hbase:quota when RPC quota is removed
Signed-off-by: Josh Elser <elserj@apache.org>
Signed-off-by: Mike Drob <mdrob@apache.org>
2018-08-03 11:07:01 -04:00
TAK LON WU 2e1c12ca1b HBASE-20856 PITA having to set WAL provider in two places
With this change, if hbase.wal.meta_provider is not explicitly set,
it uses whatever is set with hbase.wal.provider. This change avoids the
case of unexpectedly using two different providers when only
hbase.wal.provider is set to a non-default value but hbase.wal.meta_provider is not.

This change also includes a documentation (architecture.adoc) update

Also, this is a port from master to branch-2

Signed-off-by: Zach York <zyork@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Sean Busbey <busbey@apache.org>
Signed-off-by: Duo Zhang <Apache9@apache.org>
2018-08-01 14:45:11 -07:00
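
The fallback described in HBASE-20856 above can be sketched roughly as follows. This is a minimal illustration using plain `java.util.Properties`, not the HBase `Configuration` class or the actual patch. The class name, the method name, and the assumed default value `asyncfs` are hypothetical; only the two property keys come from the commit message.

```java
import java.util.Properties;

public class WalProviderFallback {
    static final String WAL_PROVIDER = "hbase.wal.provider";
    static final String META_WAL_PROVIDER = "hbase.wal.meta_provider";
    // Assumed default for illustration only; check the HBase docs for the real one.
    static final String ASSUMED_DEFAULT = "asyncfs";

    /** Returns the provider for the meta WAL, falling back to the general WAL provider. */
    static String resolveMetaProvider(Properties conf) {
        // The general provider, or the assumed default if it is not set.
        String general = conf.getProperty(WAL_PROVIDER, ASSUMED_DEFAULT);
        // Only an explicit meta setting overrides the general provider.
        return conf.getProperty(META_WAL_PROVIDER, general);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(WAL_PROVIDER, "filesystem");
        // The meta provider is unset, so it follows hbase.wal.provider.
        System.out.println(resolveMetaProvider(conf));
    }
}
```

With only `hbase.wal.provider` set, both WALs resolve to the same provider, which is the surprise the commit says it removes.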
Michael Stack 88f3148810 HBASE-20989 Minor, miscellaneous logging fixes
Signed-off-by: Zach York <zyork@amazon.com>
Signed-off-by: Mingliang Liu <liuml07@apache.org>
2018-08-01 11:20:01 -07:00
Xu Cang 9338eaee65 HBASE-20794 add INFO level log to createTable operation 2018-08-01 11:04:00 -07:00
Michael Stack 0f4e857c7a HBASE-20893 Data loss if splitting region while ServerCrashProcedure executing ADDENDUM: Rather than rollback, just do region reopens.
In split, reopen the parent if recovered.edits and in merge, reopen the
parent region or regions that happened to have recovered.edits on close.
2018-08-01 00:33:12 -07:00
Andrew Purtell daeec8657e HBASE-20935 HStore.removeCompactedFiles should log in case it is unable to delete a file (Vishal Khandelwal)
Conflicts:
	hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
2018-07-31 16:06:15 -07:00
zhangduo 1b66839bc4 HBASE-20538 Upgrade our hadoop versions to 2.7.7 and 3.0.3 2018-07-29 20:20:28 +08:00
zhangduo a9346982bf Revert "HBASE-20538 Upgrade our hadoop-two.version to 2.7.7 and 3.0.3"
This reverts commit 3dd83adb51.
2018-07-29 20:20:20 +08:00
zhangduo 3dd83adb51 HBASE-20538 Upgrade our hadoop-two.version to 2.7.7 and 3.0.3 2018-07-29 20:04:48 +08:00
Alex Leblang 31cbd7ab8f
HBASE-19369 Switch to Builder Pattern In WAL
This patch switches to the builder pattern by adding a helper method.
It also checks to ensure that the pattern is available (i.e. that
HBase is running on a hadoop version that supports it).

Amending-Author: Mike Drob <mdrob@apache.org>
Signed-off-by: tedyu <yuzhihong@gmail.com>
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-07-27 23:43:08 -05:00
zhangduo 8bfdb19e85 HBASE-20939 There will be race when we call suspendIfNotReady and then throw ProcedureSuspendedException 2018-07-27 21:30:23 +08:00
Allan Yang 35c598db93 HBASE-20921 Possible NPE in ReopenTableRegionsProcedure 2018-07-27 09:31:12 +08:00
zhangduo 1777ea3aae HBASE-20938 Set version to 2.1.1-SNAPSHOT for branch-2.1 2018-07-25 21:45:09 +08:00
Allan Yang 3251554737 HBASE-20867 RS may get killed while master restarts 2018-07-25 18:11:28 +08:00
zhangduo 833657c46d HBASE-20846 Restore procedure locks when master restarts 2018-07-25 14:37:36 +08:00
huzheng 1dbfe92dbf HBASE-20565 ColumnRangeFilter combined with ColumnPaginationFilter can produce incorrect result 2018-07-24 10:39:36 +08:00
Allan Yang 44bf7076b7 HBASE-20878 Data loss if merging regions while ServerCrashProcedure executing 2018-07-24 09:51:46 +08:00
Allan Yang af2742fcf2 HBASE-20893 Data loss if splitting region while ServerCrashProcedure executing 2018-07-23 14:35:27 +08:00
Reid Chan 9d481f1faa HBASE-20401 Make MAX_WAIT and waitIfNotFinished in CleanerContext configurable (Contributed by Stephen Wu) 2018-07-23 10:33:26 +08:00
Michael Stack 46e5baf670 HBASE-20914 Trim Master memory usage
Add (weak reference) interning of ServerNames.

Correct Balancer regions x racks matrix.

Make smaller defaults when creating ArrayDeques.
2018-07-20 10:08:13 -07:00
Allan Yang 679698a7f2 HBASE-20870 Wrong HBase root dir in ITBLL's Search Tool 2018-07-20 11:31:21 +08:00
Yu Li 9ac26b80b2 HBASE-20907 Fix Intermittent failure on TestProcedurePriority 2018-07-19 12:01:29 +08:00
Michael Stack cecce16fad HBASE-20875 MemStoreLABImp::copyIntoCell uses 7% CPU when writing
Make the #copyCellInto method smaller so it inlines; we do it by
checking for the common type early and then taking a code path
that presumes ByteBufferExtendedCell -- avoids checks.
2018-07-18 20:40:47 -07:00
Toshihiro Suzuki e14b49080b HBASE-20865 CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT state
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-07-13 08:31:59 -07:00
Toshihiro Suzuki 881409bd0b HBASE-19572 RegionMover should use the configured default port number and not the one from HConstants
Signed-off-by: Reid Chan <reidchan@apache.org>
2018-07-13 10:46:51 +08:00
Allan Yang 368b1b1060 HBASE-20860 Merged region's RIT state may not be cleaned after master restart 2018-07-12 12:16:49 +08:00
zhangduo 8eab6d7a45 HBASE-20847 Addendum use addFront instead of addBack to add sub procedure 2018-07-12 08:31:40 +08:00
zhangduo 113652eb88 HBASE-20847 The parent procedure of RegionTransitionProcedure may not have the table lock 2018-07-11 17:37:27 +08:00
zhaoyuan 8de69db143 HBASE-20697 Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2018-07-11 11:17:41 +08:00
zhangduo 5e25bc92cf HBASE-20784 Will lose the SNAPSHOT suffix if we get the version of RS from ServerManager 2018-07-10 10:00:15 +08:00
Abhishek Singh Chouhan dddf15ae6b HBASE-20806 Split style journal for flushes and compactions 2018-07-09 12:42:20 -07:00
Balazs Meszaros da7fef6bf5
HBASE-20833 Modify pre-upgrade coprocessor validator to support table level coprocessors
- -jar parameter now accepts multiple jar files and directories of jar files.
- observer classes can be verified by -class option.
- -table parameter was added to check table level coprocessors.
- -config parameter was added to obtain the coprocessor classes from
  HBase configuration.
- -scan option was removed.

Signed-off-by: Mike Drob <mdrob@apache.org>
2018-07-09 14:19:12 -05:00
zhangduo 5a40606422 HBASE-20822 TestAsyncNonMetaRegionLocator is flakey 2018-07-09 14:56:45 +08:00
Nihal Jain 927ac8228f HBASE-20808 (Addendum) Remove duplicate calls for cancelling of chores
Signed-off-by: Reid Chan <reidchan@apache.org>
2018-07-07 00:21:08 +08:00
Nihal Jain 3ed9350233 HBASE-20808 Wrong shutdown order between Chores and ChoreService
Signed-off-by: Reid Chan <reidchan@apache.org>
2018-07-07 00:20:08 +08:00
zhangduo a2db3d27ff HBASE-20849 Set version as 2.1.0 in branch-2.1 in prep for first RC 2018-07-06 15:32:23 +08:00
zhangduo 159f1b4686 Revert "HBASE-20808 Wrong shutdown order between Chores and ChoreService"
For cutting 2.1.0RC0

This reverts commit ae2c858c5e.
2018-07-06 15:29:58 +08:00
Nihal Jain ae2c858c5e HBASE-20808 Wrong shutdown order between Chores and ChoreService
Signed-off-by: Reid Chan <reidchan@apache.org>
2018-07-06 11:38:17 +08:00
Yu Li d61bb64e93 HBASE-20691 Change the default WAL storage policy back to "NONE"
This reverts commit 564c193d61 and added more doc
about why we choose "NONE" as the default.
2018-07-04 13:45:54 +08:00
Guangxu Cheng 60ebdd9fd8 HBASE-20474 Show non-RPC tasks on master/regionserver Web UI by default 2018-07-04 10:54:21 +08:00
zhangduo 5dacfe9427 HBASE-20839 Fallback to FSHLog if we can not instantiated AsyncFSWAL when user does not specify AsyncFSWAL explicitly 2018-07-04 10:29:36 +08:00
zhangduo fedbd00ef1 HBASE-20829 Remove the addFront assertion in MasterProcedureScheduler.doAdd 2018-07-04 09:41:02 +08:00
Ted Yu 927a957390 HBASE-20244 NoSuchMethodException when retrieving private method decryptEncryptedDataEncryptionKey from DFSClient
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-07-03 22:16:04 +08:00
huzheng ce99588530 HBASE-20789 TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky 2018-07-03 18:05:17 +08:00
jingyuntian 6b8cd00ec0 HBASE-20193 Basic Replication Web UI - Regionserver
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-07-03 16:10:31 +08:00
Josh Elser 8f9c322cda HBASE-20826 Truncate really long RpcServer warnings unless TRACE is on
Signed-off-by: zhangduo <zhangduo@apache.org>
Signed-off-by: Ted Yu <tyu@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
2018-07-03 10:14:57 +08:00
Ankit Singhal d22c6de648 HBASE-20825 Fix pre and post hooks of CloneSnapshot and RestoreSnapshot for Access checks
Signed-off-by: Josh Elser <elserj@apache.org>
Signed-off-by: Ted Yu <tyu@apache.org>
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-07-03 10:06:13 +08:00
Ankit Singhal 4a80b19d7f HBASE-20817 Infinite loop when executing ReopenTableRegionsProcedure
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-07-02 21:29:09 +08:00
Josh Elser 44573b54c1 HBASE-20792 info:servername and info:sn inconsistent for OPEN region
Signed-off-by: zhangduo <zhangduo@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
2018-06-29 11:11:22 +08:00
Michael Stack becb638370 HBASE-20781 Save recalculating families in a WALEdit batch of Cells
Pass the Set of families through to the WAL rather than recalculate
a Set already known.

Signed-off-by: zhangduo <zhangduo@apache.org>
2018-06-27 22:04:22 -07:00
Reid Chan 43c0df51ea HBASE-20732 Shutdown scan pool when master is stopped
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2018-06-28 12:54:54 +08:00
Sahil Aggarwal 4ba2abf43b
HBASE-19164: Remove UUID.randomUUID in tests.
Signed-off-by: Mike Drob <mdrob@apache.org>
2018-06-27 10:37:15 -05:00
jingyuntian bd40cba8dd HBASE-20194 Basic Replication WebUI - Master
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-06-26 18:29:01 +08:00
zhangduo 07afb7e32f HBASE-20777 RpcConnection could still remain opened after we shutdown the NettyRpcServer 2018-06-26 09:08:05 +08:00
Michael Stack 3f319bef8d HBASE-20780 ServerRpcConnection logging cleanup
Get rid of one of the logging lines in ServerRpcConnection by amalgamating all into one new-style log line.
2018-06-25 16:44:07 -07:00
Todd Lipcon 3673bfc241 HBASE-20403. Fix race between prefetch task and non-pread HFile reads
With prefetch-on-open enabled, the task doing the prefetching was using
non-positional (i.e. streaming) reads. If the main (non-prefetch) thread
was also using non-positional reads, these two would conflict, because
inputstreams are not thread-safe for non-positional reads.

In the case of an encrypted filesystem, this could cause JVM crashes,
etc, as underlying cipher buffers were freed underneath the racing
threads. In the case of a non-encrypted filesystem, less severe errors
would be thrown. The included unit test reproduces the latter case.

(cherry picked from commit 025ddce868)
Signed-off-by: Todd Lipcon <todd@cloudera.com>
2018-06-25 12:12:30 -07:00
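
The distinction in HBASE-20403 above between positional and non-positional (streaming) reads can be illustrated with the JDK's `FileChannel`. This is only a sketch of the general idea, not the HBase or HDFS code from the patch; the class and method names are hypothetical. A streaming read advances a cursor shared by everyone using the stream, while a positional read takes an explicit offset and leaves that cursor alone.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class PositionalReadSketch {
    /** Positional read: reads up to len bytes at an absolute offset; the channel cursor is untouched. */
    static byte[] pread(FileChannel ch, long offset, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining()) {
            if (ch.read(buf, offset + buf.position()) < 0) {
                break; // reached end of file
            }
        }
        return Arrays.copyOf(buf.array(), buf.position());
    }

    /** Streaming read: reads from the channel's shared cursor and advances it. */
    static byte[] streamRead(FileChannel ch, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining()) {
            if (ch.read(buf) < 0) {
                break; // reached end of file
            }
        }
        return Arrays.copyOf(buf.array(), buf.position());
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("pread-sketch", ".bin");
        try {
            Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                String head = new String(streamRead(ch, 3), StandardCharsets.US_ASCII);
                // A "prefetch"-style read elsewhere in the file does not disturb the cursor.
                String tail = new String(pread(ch, 7, 3), StandardCharsets.US_ASCII);
                String next = new String(streamRead(ch, 3), StandardCharsets.US_ASCII);
                System.out.println(head + " " + tail + " " + next + " cursor=" + ch.position());
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Because the positional read never moves the cursor, a background reader that uses it cannot pull the position out from under a streaming reader on the same handle.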
Michael Stack d6cea08efe
HBASE-20770 WAL cleaner logs way too much; gets clogged when lots of work to do
General log cleanup; setting stuff that can flood the log to TRACE.
2018-06-25 12:12:03 -07:00
Michael Stack 7c45f02110 HBASE-20778 Make it so WALPE runs on DFS 2018-06-23 23:34:40 -07:00
zhangduo eb67404cef HBASE-20775 TestMultiParallel is flakey 2018-06-24 08:42:53 +08:00
zhangduo f3061a67fc HBASE-18569 Add prefetch support for async region locator 2018-06-22 18:20:21 +08:00
zhangduo a86141b625 HBASE-20752 Make sure the regions are truly reopened after ReopenTableRegionsProcedure 2018-06-22 14:06:29 +08:00
zhangduo 6cebe06225 HBASE-20767 Always close hbaseAdmin along with connection in HBTU 2018-06-22 10:20:06 +08:00
Ankit Singhal 28d0d8c5cc HBASE-20642 Clients should re-use the same nonce across DDL operations
Also changes modify table operations to help the case where an MTP spans
two masters, avoiding the sanity-checks propagating back to the client
unnecessarily.

Signed-off-by: Josh Elser <elserj@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
2018-06-20 15:10:52 -07:00
Josh Elser c8b76eb3f1 HBASE-20706 Prevent MTP from trying to reopen non-OPEN regions
ModifyTableProcedure is using MoveRegionProcedure in a way
that was unintended from the original implementation. As such,
we have to guard against certain usages of it. We know we can
re-open OPEN regions, but regions in OPENING will similarly
soon be OPEN (thus, we want to reopen those regions too).

Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-06-20 14:30:36 -07:00
zhangduo 5c2cb15e0b HBASE-20739 Add priority for SCP 2018-06-20 15:18:47 +08:00
zhangduo 83969b0da1 HBASE-20742 Always create WAL directory for region server 2018-06-20 14:21:28 +08:00
Michael Stack 9eeb501825 HBASE-20745 Log when master proc wal rolls 2018-06-19 19:53:29 -07:00
zhangduo 3e33aecea2 HBASE-20708 Remove the usage of RecoverMetaProcedure in master startup 2018-06-19 15:09:11 +08:00
Sean Busbey ee84a8f243 HBASE-20332 shaded mapreduce module shouldn't include hadoop
* modify the jar checking script to take args; make hadoop stuff optional
* separate out checking the artifacts that have hadoop vs those that don't.
* * Unfortunately means we need two modules for checking things
* * put in a safety check that the support script for checking jar contents is maintained in both modules
* * have to carve out an exception for o.a.hadoop.metrics2. :(
* fix duplicated class warning
* clean up dependencies in hbase-server and some modules that depend on it.
* allow Hadoop to have its own htrace where it needs it
* add a precommit check to make sure we're not using old htrace imports

 Conflicts:
	hbase-backup/pom.xml
	hbase-checkstyle/src/main/resources/hbase/checkstyle-suppressions.xml

Signed-off-by: Mike Drob <mdrob@apache.org>
2018-06-18 14:02:48 -07:00
Mike Drob b04c976fe6 HBASE-20478 Update checkstyle to v8.2
Cannot go to latest (8.9) yet due to
  https://github.com/checkstyle/checkstyle/issues/5279

* move hbaseanti import checks to checkstyle
* implement a few missing equals checks, and ignore one
* fix lots of javadoc errors

Signed-off-by: Sean Busbey <busbey@apache.org>
2018-06-18 14:02:40 -07:00
taiynlee 8edd5d948a HBASE-20737 put collection into ArrayList instead of addAll function
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2018-06-17 11:16:16 +08:00
tedyu b2afba580b HBASE-20723 Custom hbase.wal.dir results in data loss because we write recovered edits into a different place than where the recovering region server looks for them 2018-06-16 01:34:53 -07:00
Xu Cang b68746c0b2 HBASE-20695 Implement table level RegionServer replication metrics
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2018-06-15 10:45:13 +08:00
jingyuntian bde9f08a83 HBASE-20625 refactor some WALCellCodec related code
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2018-06-14 19:46:33 +08:00
zhangduo 161dc7c7f3 HBASE-20722 Make RegionServerTracker only depend on children changed event 2018-06-14 08:38:53 +08:00
Guanghao Zhang 075523dd1e HBASE-20561 The way we stop a ReplicationSource may cause the RS down 2018-06-13 18:05:27 +08:00
Balazs Meszaros d44e8a7aff HBASE-20656 Validate pre-2.0 coprocessors against HBase 2.0+
Signed-off-by: Mike Drob <mdrob@apache.org>
2018-06-11 10:32:40 -05:00
Mike Drob 4b0bbd839e HBASE-20707 Move MissingSwitchDefault case check
Perform this check using error-prone instead of checkstyle because the
former can handle enum switches somewhat more intelligently.
2018-06-11 10:13:29 -05:00
zhangduo 6befdc43ba HBASE-20700 Move meta region when server crash can cause the procedure to be stuck 2018-06-11 15:28:21 +08:00
Guanghao Zhang 4d971d0f48 HBASE-20698 (addendum) Master don't record right server version until new started region server call regionServerReport method 2018-06-10 08:32:01 +08:00
Guanghao Zhang 9d15e16946 HBASE-20698 Master don't record right server version until new started region server call regionServerReport method 2018-06-09 14:47:07 +08:00
Nihal Jain 4a5fe54d94 HBASE-20699 QuotaCache should cancel the QuotaRefresherChore service inside its stop()
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-06-08 07:22:52 -07:00
Michael Stack 858eee20ec HBASE-20702 Processing crash, skip ONLINE'ing empty rows
Signed-off-by: Josh Elser <elserj@apache.org>
2018-06-07 09:54:32 -07:00
eric-maynard 271d93dc73 HBASE-20665: Changed log level of HBASE-8547 warning to debug
Closes #77

Signed-off-by: Josh Elser <elserj@apache.org>
Signed-off-by: Sean Busbey <busbey@apache.org>
2018-06-07 11:34:52 -04:00
Peter Somogyi 00289b8ffa HBASE-20683 Incorrect return value for PreUpgradeValidator
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2018-06-06 20:04:22 +02:00
Andrew Purtell d7b09de854 HBASE-20670 NPE in HMaster#isInMaintenanceMode 2018-06-04 15:19:45 -07:00
Michael Stack 063eefe3b0 HBASE-20634 Reopen region while server crash can cause the procedure to be stuck; ADDENDUM 2018-06-04 12:38:56 -07:00
Michael Stack 27e2c8c86b HBASE-20628 SegmentScanner does over-comparing when one flushing
Signed-off-by: eshcar <eshcar@oath.com>
Signed-off-by: anoopsjohn <anoopsamjohn@gmail.com>
2018-06-04 09:50:13 -07:00
zhangduo d834859404 HBASE-20634 Reopen region while server crash can cause the procedure to be stuck
A reattempt at fixing HBASE-20173 [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock

The scenario is a SCP after processing WALs, goes to assign regions that
were on the crashed server but a concurrent Procedure gets in there
first and tries to unassign a region that was on the crashed server
(could be part of a move procedure or a disable table, etc.). The
unassign happens to run AFTER SCP has released all RPCs that
were going against the crashed server. The unassign fails because the
server is crashed. The unassign used to suspend itself, but it would
never be woken up because the server it was going against had already
been processed. Worse, the SCP could not make progress because the
suspended unassign still held the lock on a region that the SCP
wanted to assign.

In here, we add to the unassign recognition of the state where it is
running post SCP cleanup of RPCs. If present, unassign moves to finish
instead of suspending itself.

Includes a nice unit test made by Duo Zhang that reproduces nicely the
hung scenario.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/FailedRemoteDispatchException.java
 Moved this class back to hbase-procedure where it belongs.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/NoNodeDispatchException.java
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/NoServerDispatchException.java
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/NullTargetServerDispatchException.java
 Specializations on FRDE so we can be more particular when we say there
 was a problem.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/RemoteProcedureDispatcher.java
 Change addOperationToNode so we throw exceptions that give more detail
 on issue rather than a mysterious true/false

M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
 Undo SERVER_CRASH_HANDLE_RIT2. Bad idea (from HBASE-20173)

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
 Have expireServer return true if it actually queued an expiration. Used
 later in this patch.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 Hide methods that shouldn't be public. Add a particular check used out
 in unassign procedure failure processing.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
 Check that server we're to move from is actually online (might
 catch a few silly move requests early).

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStates.java
 Add doc on ServerState. Wasn't being used really. Now we actually stamp
 a Server OFFLINE after its WAL has been split. Means it's safe to assign
 since all WALs have been processed. Add methods to update SPLITTING
 and to set it to OFFLINE after splitting is done.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 Change logging to be new-style and less repetitive of info.
 Cater to new way in which .addOperationToNode returns info (exceptions
 rather than true/false).

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
 Add looking for the case where we failed assign AND we should not
 suspend because we will never be woken up because SCP is beyond
 doing this for all stuck RPCs.

 Some cleanup of the failure processing grouping where we can proceed.

 TODOs have been handled in this refactor including the TODO that
 wonders if it is possible that there are concurrent fails coming in
 (Yes).

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
 Doc and removing the old HBASE-20173 'fix'.
 Also updating ServerStateNode post WAL splitting so it gets marked
 OFFLINE.

A hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestServerCrashProcedureStuck.java
 Nice test by Duo Zhang.

Signed-off-by: Umesh Agashe <uagashe@cloudera.com>
Signed-off-by: Duo Zhang <palomino219@gmail.com>
Signed-off-by: Mike Drob <mdrob@apache.org>
2018-06-04 09:26:36 -07:00
maoling 4c95b82b61 HBASE-19761:Fix Checkstyle errors in hbase-zookeeper
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
2018-06-02 10:17:27 +02:00
Andrew Purtell f46569a742 HBASE-20667 Rename TestGlobalThrottler to TestReplicationGlobalThrottler 2018-06-01 17:01:14 -07:00
Xu Cang d3e2248f12 HBASE-18116 Replication source in-memory accounting should not include bulk transfer hfiles
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-06-01 11:16:16 -07:00
Peter Somogyi 53d29d53c4 HBASE-20592 Create a tool to verify tables do not have prefix tree encoding
Signed-off-by: Mike Drob <mdrob@apache.org>
2018-06-01 19:22:49 +02:00
Andrew Purtell b22409d51d Revert "HBASE-18116 fix replication source in-memory calculation by excluding bulk load file"
This reverts commit 050fae501a.
2018-05-31 15:28:37 -07:00
Xu Cang 050fae501a HBASE-18116 fix replication source in-memory calculation by excluding bulk load file
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-05-31 14:22:12 -07:00
Sean Busbey fc9743c17a HBASE-20444 Addendum keep folks from looking at raw version component array.
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-05-31 14:17:41 -05:00
Andrew Purtell aaec02e0f5 HBASE-20646 TestWALProcedureStoreOnHDFS failing on branch-1 2018-05-30 14:44:54 -07:00
Andrew Purtell 15bb234d51 Revert "TestWALProcedureStoreOnHDFS failing on branch-1"
This reverts commit 694e79a67e.
2018-05-30 14:44:49 -07:00
Andrew Purtell 694e79a67e TestWALProcedureStoreOnHDFS failing on branch-1 2018-05-30 13:46:08 -07:00
zhangduo b785896cbd HBASE-20659 Implement a reopen table regions procedure 2018-05-30 20:03:35 +08:00
tedyu 856a3ac154 HBASE-20639 Implement permission checking through AccessController instead of RSGroupAdminEndpoint - revert due to pending discussion 2018-05-29 19:58:32 -07:00
Andrew Purtell 2dc51934f4 HBASE-20597 Serialize access to a shared reference to ZooKeeperWatcher in HBaseReplicationEndpoint 2018-05-29 11:29:12 -07:00
Andrew Purtell 7f154dc20e Revert "HBASE-20597 Use a lock to serialize access to a shared reference to ZooKeeperWatcher in HBaseReplicationEndpoint"
This reverts commit 60dcef289b.
2018-05-29 11:24:30 -07:00
Nihal Jain d36cce1574 HBASE-20633 Dropping a table containing a disable violation policy fails to remove the quota upon table delete
Signed-off-by: Josh Elser <elserj@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
2018-05-29 11:50:40 -04:00
eshcar aa00391140 HBASE-20390 ADDENDUM 2: fix TestHRegionWithInMemoryFlush OOME 2018-05-29 16:24:27 +03:00
eshcar cf1928aaca HBASE-20390-ADDENDUM: fix TestHRegionWithInMemoryFlush OOME 2018-05-29 13:01:07 +03:00