hbase

Commit Graph

Author	SHA1	Message	Date
zhangduo	7e4cb7d7ec	HBASE-21323 Should not skip force updating for a sub procedure even if it has been finished Reapplication after fixing failing test.	2018-10-19 15:25:15 -07:00
Duo Zhang	63f718974b	HBASE-21075 Confirm that we can (rolling) upgrade from 2.0.x and 2.1.x to 2.2.x after HBASE-20881 Signed-off-by: Michael Stack <stack@apache.org>	2018-10-19 12:34:36 -07:00
Michael Stack	0cd23c3dda	Revert "HBASE-21323 Should not skip force updating for a sub procedure even if it has been finished" This reverts commit `fffd9b9b6d`. Revert till we figure why behavior between 2.1 and 2.2 is different.	2018-10-18 20:04:24 -07:00
Michael Stack	8fd3fd0e9c	Revert "HBASE-21323 Should not skip force updating for a sub procedure even if" This reverts commit `30727764a3`. Revert till we figure why behavior between 2.1 and 2.2 is different.	2018-10-18 20:03:57 -07:00
tianjingyun	915e87ecf7	HBASE-21291 Add a test for bypassing stuck state-machine procedures Signed-off-by: Michael Stack <stack@apache.org>	2018-10-18 14:26:47 -07:00
Michael Stack	30727764a3	HBASE-21323 Should not skip force updating for a sub procedure even if it has been finished; ADDENDUM Fix broke unit test.	2018-10-18 13:48:02 -07:00
zhangduo	fffd9b9b6d	HBASE-21323 Should not skip force updating for a sub procedure even if it has been finished	2018-10-18 14:44:31 +08:00
Duo Zhang	85c3ec3fb4	HBASE-21315 The getActiveMinProcId and getActiveMaxProcId of BitSetNode are incorrect if there are no active procedure	2018-10-16 15:42:10 +08:00
Duo Zhang	c3401d4327	HBASE-21254 Need to find a way to limit the number of proc wal files	2018-10-12 11:47:48 +08:00
zhangduo	5a300f3fc9	HBASE-21250 Refactor WALProcedureStore and add more comments for better understanding the implementation	2018-10-07 17:16:09 +08:00
Michael Stack	9d34b4581c	HBASE-21242 [amv2] Miscellaneous minor log and assign procedure create improvements For RIT Duration, do better than print ms/seconds. Remove redundant UI column dedicated to duration when we log it in the status field too. Make bypass log at INFO level. Make it so on complete of subprocedure, we note count of outstanding siblings so we have a clue how much further the parent has to go before it is done (Helpful when hundreds of servers doing SCP). Have the SCP run the AP preflight check before creating an AP; saves creation of thousands of APs during fixup. Don't log tablename three times when reporting remote call failed. If lock is held already, note who has it. Also log after we get lock or if we have to wait rather than log on entrance though we may later have to wait (or we may have just picked up the lock). Signed-off-by: Mike Drob <mdrob@apache.org>	2018-10-04 17:18:13 -07:00
Michael Stack	8fc90a23ae	HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign Adds override to assigns and unassigns. Changes bypass 'force' to align calling the param 'override' instead. Adds recursive to 'bypass', a means of calling bypass on parent and its subprocedures (usually bypass works on leaf nodes rippling the bypass up to parent -- recursive has us work in the opposite direction): EXPERIMENTAL. bypass on an assign/unassign leaves region in RIT and the RegionStateNode loaded with the bypassed procedure. First implementation had assign/unassign cleanup leftover state. Second implementation, on feedback, keeps the state in place as a fence against other Procedures assuming the region entity, and instead adds an 'override' function that hbck2 can set on assigns/unassigns to override the fencing. Note that the below also converts ProcedureExceptions that come out of the Pv2 system into DoNotRetryIOEs. It is a little awkward because DNRIOE is in client-module, not in procedure module. Previous, we'd just keep retrying the bypass, etc. M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java Have bypass take an environment like all other methods so subclasses. Fix javadoc issues. M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java Javadoc issues. Pass environment when we invoke bypass. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java Rename waitUntilNamespace... etc. to align with how these method types are named elsehwere .. i.e. waitFor rather than waitUntil.. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java Cleanup message we emit when we find an exisitng procedure working against this entity. Add support for a force function which allows Assigns/Unassigns force ownership of the Region entity. A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestRegionBypass.java Test bypass and force. M hbase-shell/src/main/ruby/shell/commands/list_procedures.rb Minor cleanup of the json output... do iso8601 timestamps.	2018-10-04 16:37:37 -07:00
Michael Stack	259d12f739	Revert "Revert "Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign""" This reverts commit `2174461cf7`. Revert because not ready to port to other branches.	2018-09-29 04:06:46 -07:00
Michael Stack	2174461cf7	Revert "Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"" This reverts commit `b96905d1df`. i.e. a revert of a revert so a reapplication! Revert so I can add signed-off-by.... Signed-off-by: Allan Yang <allan163@apache.org>	2018-09-29 03:34:36 -07:00
Michael Stack	b96905d1df	Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign" This reverts commit `b42d7978cb`.	2018-09-29 03:34:10 -07:00
Michael Stack	b42d7978cb	HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign bypass on an assign/unassign leaves region in RIT and the RegionStateNode loaded with the bypassed procedure. First implementation had assign/unassign cleanup leftover state. Second implementation, on feedback, keeps the state in place as a fence against other Procedures assuming the region entity, and instead adds an 'override' function that hbck2 can set on assigns/unassigns to override the fencing. Note that the below also converts ProcedureExceptions that come out of the Pv2 system into DoNotRetryIOEs. It is a little awkward because DNRIOE is in client-module, not in procedure module. Previous, we'd just keep retrying the bypass, etc. M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java Have bypass take an environment like all other methods so subclasses. Fix javadoc issues. M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java Javadoc issues. Pass environment when we invoke bypass. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java Rename waitUntilNamespace... etc. to align with how these method types are named elsehwere .. i.e. waitFor rather than waitUntil.. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java Cleanup message we emit when we find an exisitng procedure working against this entity. Add support for a force function which allows Assigns/Unassigns force ownership of the Region entity. A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestRegionBypass.java Test bypass and force. M hbase-shell/src/main/ruby/shell/commands/list_procedures.rb Minor cleanup of the json output... do iso8601 timestamps.	2018-09-29 03:33:07 -07:00
meiyi	8dea600795	HBASE-21249 Add jitter for ProcedureUtil.getBackoffTimeMs Signed-off-by: zhangduo <zhangduo@apache.org>	2018-09-28 21:28:16 +08:00
zhangduo	4947e72f63	HBASE-21233 Allow the procedure implementation to skip persistence of the state after a execution	2018-09-28 11:14:49 +08:00
Umesh Agashe	e6c7ed34e0	HBASE-21023 Added bypassProcedure() API to HbckService	2018-09-19 15:01:29 -07:00
Michael Stack	487f713c63	HBASE-21190 Log files and count of entries in each as we load from the MasterProcWAL store	2018-09-12 10:19:46 -07:00
Duo Zhang	2da6dbe563	HBASE-21172 Reimplement the retry backoff logic for ReopenTableRegionsProcedure	2018-09-12 16:01:55 +08:00
TAK LON WU	2c19b04274	HBASE-21181 Use the same filesystem for wal archive directory and wal directory Signed-off-by: Andrew Purtell <apurtell@apache.org>	2018-09-11 15:50:41 -07:00
Michael Stack	f755ded2d2	HBASE-21171 [amv2] Tool to parse a directory of MasterProcWALs standalone Signed-off-by: Mike Drob <mdrob@apache.org>	2018-09-08 20:35:15 -07:00
Michael Stack	205783419c	HBASE-21155 Save on a few log strings and some churn in wal splitter by skipping out early if no logs in dir	2018-09-06 16:36:59 -07:00
Allan Yang	e33591515c	HBASE-21083 Introduce a mechanism to bypass the execution of a stuck procedure	2018-08-28 20:18:47 -07:00
Michael Stack	d954031d50	HBASE-21078 [amv2] CODE-BUG NPE in RTP doing Unassign	2018-08-24 13:22:16 -07:00
Allan Yang	ee3507d456	HBASE-21050 Exclusive lock may be held by a SUCCESS state procedure forever Signed-off-by: Michael Stack <stack@apache.org> Signed-off-by: zhangduo <zhangduo@apache.org>	2018-08-15 15:39:15 -07:00
Allan Yang	e1188d27f5	HBASE-20978 [amv2] Worker terminating UNNATURALLY during MoveRegionProcedure Signed-off-by: Michael Stack <stack@apache.org> Signed-off-by: Duo Zhang <zhangduo@apache.org>	2018-08-14 16:29:58 -07:00
Allan Yang	26828b1860	HBASE-20975 Lock may not be taken or released while rolling back procedure	2018-08-13 20:12:57 +08:00
jackbearden	2d12e1ecf0	HBASE-20981. Rollback stateCount accounting thrown-off when exception out of rollbackState Signed-off-by: Michael Stack <stack@apache.org>	2018-08-11 11:58:12 -07:00
Michael Stack	88f3148810	HBASE-20989 Minor, miscellaneous logging fixes Signed-off-by: Zach York <zyork@amazon.com> Signed-off-by: Mingliang Liu <liuml07@apache.org>	2018-08-01 11:20:01 -07:00
Alex Leblang	31cbd7ab8f	HBASE-19369 Switch to Builder Pattern In WAL This patch switches to the builder pattern by adding a helper method. It also checks to ensure that the pattern is available (i.e. that HBase is running on a hadoop version that supports it). Amending-Author: Mike Drob <mdrob@apache.org> Signed-off-by: tedyu <yuzhihong@gmail.com> Signed-off-by: zhangduo <zhangduo@apache.org>	2018-07-27 23:43:08 -05:00
zhangduo	8bfdb19e85	HBASE-20939 There will be race when we call suspendIfNotReady and then throw ProcedureSuspendedException	2018-07-27 21:30:23 +08:00
zhangduo	1777ea3aae	HBASE-20938 Set version to 2.1.1-SNAPSHOT for branch-2.1	2018-07-25 21:45:09 +08:00
zhangduo	833657c46d	HBASE-20846 Restore procedure locks when master restarts	2018-07-25 14:37:36 +08:00
Michael Stack	46e5baf670	HBASE-20914 Trim Master memory usage Add (weak reference) interning of ServerNames. Correct Balancer regions x racks matrix. Make smaller defaults when creating ArrayDeques.	2018-07-20 10:08:13 -07:00
zhangduo	113652eb88	HBASE-20847 The parent procedure of RegionTransitionProcedure may not have the table lock	2018-07-11 17:37:27 +08:00
zhangduo	a2db3d27ff	HBASE-20849 Set version as 2.1.0 in branch-2.1 in prep for first RC	2018-07-06 15:32:23 +08:00
Yu Li	d61bb64e93	HBASE-20691 Change the default WAL storage policy back to "NONE"" This reverts commit `564c193d61` and added more doc about why we choose "NONE" as the default.	2018-07-04 13:45:54 +08:00
Michael Stack	9eeb501825	HBASE-20745 Log when master proc wal rolls	2018-06-19 19:53:29 -07:00
zhangduo	3e33aecea2	HBASE-20708 Remove the usage of RecoverMetaProcedure in master startup	2018-06-19 15:09:11 +08:00
Josh Elser	1725094e6b	HBASE-19735 Create a client-tarball assembly Provides an extra client descriptor to build a second tarball with a reduced set of dependencies. Not of great impact now, but will build the way for better in the future. Signed-off-by: Sean Busbey <busbey@apache.org> Conflicts: hbase-assembly/pom.xml Conflicts: hbase-spark/pom.xml	2018-06-18 14:03:22 -07:00
zhangduo	6befdc43ba	HBASE-20700 Move meta region when server crash can cause the procedure to be stuck	2018-06-11 15:28:21 +08:00
zhangduo	d834859404	HBASE-20634 Reopen region while server crash can cause the procedure to be stuck A reattempt at fixing HBASE-20173 [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock The scenario is a SCP after processing WALs, goes to assign regions that were on the crashed server but a concurrent Procedure gets in there first and tries to unassign a region that was on the crashed server (could be part of a move procedure or a disable table, etc.). The unassign happens to run AFTER SCP has released all RPCs that were going against the crashed server. The unassign fails because the server is crashed. The unassign used to suspend itself only it would never be woken up because the server it was going against had already been processed. Worse, the SCP could not make progress because the unassign was suspended with the lock on a region that it wanted to assign held making it so it could make no progress. In here, we add to the unassign recognition of the state where it is running post SCP cleanup of RPCs. If present, unassign moves to finish instead of suspending itself. Includes a nice unit test made by Duo Zhang that reproduces nicely the hung scenario. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/FailedRemoteDispatchException.java Moved this class back to hbase-procedure where it belongs. M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/NoNodeDispatchException.java M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/NoServerDispatchException.java M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/NullTargetServerDispatchException.java Specializiations on FRDE so we can be more particular when we say there was a problem. M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/RemoteProcedureDispatcher.java Change addOperationToNode so we throw exceptions that give more detail on issue rather than a mysterious true/false M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto Undo SERVER_CRASH_HANDLE_RIT2. Bad idea (from HBASE-20173) M hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java Have expireServer return true if it actually queued an expiration. Used later in this patch. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java Hide methods that shouldn't be public. Add a particular check used out in unassign procedure failure processing. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java Check that server we're to move from is actually online (might catch a few silly move requests early). M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStates.java Add doc on ServerState. Wasn't being used really. Now we actually stamp a Server OFFLINE after its WAL has been split. Means its safe to assign since all WALs have been processed. Add methods to update SPLITTING and to set it to OFFLINE after splitting done. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java Change logging to be new-style and less repetitive of info. Cater to new way in which .addOperationToNode returns info (exceptions rather than true/false). M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java Add looking for the case where we failed assign AND we should not suspend because we will never be woken up because SCP is beyond doing this for all stuck RPCs. Some cleanup of the failure processing grouping where we can proceed. TODOs have been handled in this refactor including the TODO that wonders if it possible that there are concurrent fails coming in (Yes). M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java Doc and removing the old HBASE-20173 'fix'. Also updating ServerStateNode post WAL splitting so it gets marked OFFLINE. A hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestServerCrashProcedureStuck.java Nice test by Duo Zhang. Signed-off-by: Umesh Agashe <uagashe@cloudera.com> Signed-off-by: Duo Zhang <palomino219@gmail.com> Signed-off-by: Mike Drob <mdrob@apache.org>	2018-06-04 09:26:36 -07:00
zhangduo	b785896cbd	HBASE-20659 Implement a reopen table regions procedure	2018-05-30 20:03:35 +08:00
Sean Busbey	61f96b6ffa	HBASE-20544 Make HBTU default to random ports. Signed-off-by: Umesh Agashe <uagashe@cloudera.com> Signed-off-by: Josh Elser <elserj@apache.org> Conflicts: hbase-backup/src/test/resources/hbase-site.xml hbase-spark-it/src/test/resources/hbase-site.xml hbase-spark/src/test/resources/hbase-site.xml	2018-05-09 23:45:39 -07:00
Chia-Ping Tsai	984fb5bd05	HBASE-20169 NPE when calling HBTU.shutdownMiniCluster (TestAssignmentManagerMetrics is flakey); AMENDMENT	2018-05-02 16:14:38 -07:00
Michael Stack	da3e06afab	HBASE-20492 UnassignProcedure is stuck in retry loop on region stuck in OPENING state Add backoff when stuck in RegionTransitionProcedure, the subclass of AssignProcedure and UnassignProcedure. Can happen when we go to transition but the current Region state is not what we expect. M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java Add doc on being able to suspend and wait on a timeout. M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto Add 'attempt' counter so we can do backoff when we get stuck. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java Add persistence of new 'attempt' counter M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java Doc data members that are persisted by subclasses given this is 'odd'. Add a counter for 'attempts' used when 'stuck' to implement backoff. Add suspend with timeout when 'stuck'. Add callback when timeout is exhausted which does wakeup of this procedure. A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestUnexpectedStateException.java Test of backoff.	2018-04-30 17:58:27 -07:00
Wei-Chiu Chuang	b1901c9a15	HBASE-20338 WALProcedureStore#recoverLease() should have fixed sleeps for retrying rollWriter() Signed-off-by: Mike Drob <mdrob@apache.org> Signed-off-by: Umesh Agashe <uagashe@cloudera.com> Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>	2018-04-12 16:35:11 -05:00
Umesh Agashe	ec6295bed0	HBASE-20330 ProcedureExecutor.start() gets stuck in recover lease on store rollWriter() fails after creating the file and returns false. In next iteration of while loop in recoverLease() file list is refreshed. Signed-off-by: Appy <appy@cloudera.com>	2018-04-11 16:07:23 -07:00
Josh Elser	c3d82a283d	HBASE-20223 Update to hbase-thirdparty 2.1.0 Remove commons-cli and commons-collections4 use. Account for the newer internal protobuf version of 3.5.1. Signed-off-by: Michael Stack <stack@apache.org> Signed-off-by: Mike Drob <mdrob@apache.org>	2018-03-26 16:07:39 -04:00
Umesh Agashe	96d63fee11	HBASE-20224 Web UI is broken in standalone mode Changes for HBASE-20027 seem to cause UI not showing up on default port in standalone mode. For concurrent unit test execution, individual tests can set hbase.localcluster.assign.random.ports to true or modify test/resources/hbase-site.xml.	2018-03-22 20:28:08 -07:00
Michael Stack	79e4c9d925	Revert "HBASE-20224 Web UI is broken in standalone mode" Broke shell tests. This reverts commit `dd9fe813ec`.	2018-03-22 10:47:47 -07:00
Umesh Agashe	dd9fe813ec	HBASE-20224 Web UI is broken in standalone mode Changes for HBASE-20027 seem to cause UI not showing up on default port in standalone mode. For concurrent unit test execution, individual tests can set hbase.localcluster.assign.random.ports to true or modify test/resources/hbase-site.xml.	2018-03-22 06:52:51 -07:00
Chia-Ping Tsai	dd9e46bbf5	HBASE-20212 Make all Public classes have InterfaceAudience category Signed-off-by: tedyu <yuzhihong@gmail.com> Signed-off-by: Michael Stack <stack@apache.org>	2018-03-22 18:09:54 +08:00
Michael Stack	fabb1d97cc	HBASE-20169 NPE when calling HBTU.shutdownMiniCluster Adds a prepare step to RecoverMetaProcedure in which we test for cluster up and master being up. If not up, we fail the run. Modified hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileCleaner.java Modified hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChunkCreator.java Minor log cleanup. Modified hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RecoverMetaProcedure.java Add pepare step. Modified hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerMetrics.java Debug for the failing test.... Added hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestRecoverMetaProcedure.java Test the prepare step goes down if master or cluster are down.	2018-03-20 13:09:43 -07:00
Michael Stack	3f1c86786c	HBASE-20213 [LOGGING] Aligning formatting and logging less (compactions, in-memory compactions) Log less. Log using same format as used elsewhere in log. Align logs in HFileArchiver with how we format elsewhere. Removed redundant 'region' qualifiers, tried to tighten up the emissions so easier to read the long lines. M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChunkCreator.java Add a label for each of the chunkcreators we make (I was confused by two chunk creater stats emissions in log file -- didn't know that one was for data and the other index). M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java Formatting. Log less. M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreCompactionStrategy.java Make the emissions in here trace-level. When more than a few regions, log is filled with this stuff.	2018-03-16 13:07:34 -07:00
Umesh Agashe	55e3dda25d	HBASE-20024 Fixed flakyness of TestMergeTableRegionsProcedure We assumed that we can run for loop from 0 to lastStep sequentially. MergeTableRegionProcedure skips step 2. So, when i is 0 the procedure is already at step 3. Added a method StateMachineProcedure#getCurrentStateId that can be used from test code only.	2018-03-09 12:44:49 -08:00
zhangduo	5e410d8140	HBASE-19524 Master side changes for moving peer modification from zk watcher to procedure	2018-03-09 20:55:48 +08:00
zhangduo	95af14fea6	HBASE-19216 Implement a general framework to execute remote procedure on RS	2018-03-09 20:55:48 +08:00
Sean Busbey	71cc7869db	HBASE-20155 update branch-2 version to 2.1.0-SNAPSHOT Signed-off-by: Peter Somogyi <psomogyi@apache.org>	2018-03-08 08:44:30 -08:00
Sean Busbey	9927c2e14a	HBASE-20070 refactor website generation * rely on git plumbing commands when checking if we've built the site for a particular commit already * switch to forcing '-e' for bash * add command line switches for: path to hbase, working directory, and publishing * only export JAVA/MAVEN HOME if they aren't already set. * add some docs about assumptions * Update javadoc plugin to consistently be version 3.0.0 * avoid duplicative site invocations on reactor modules * update use of cp command so it works both on linux and mac * manually skip enforcer plugin during build * still doing install of all jars due to MJAVADOC-490, but then skip rebuilding during aggregate reports. * avoid the pager on git-diff by teeing to a log file, which also helps later reviewing in the case of big changesets. Signed-off-by: Michael Stack <stack@apache.org> Signed-off-by: Misty Stanley-Jones <misty@apache.org> Conflicts: hbase-backup/pom.xml hbase-spark-it/pom.xml	2018-03-02 09:51:43 -06:00
Michael Stack	a2de29560f	HBASE-20113 Move branch-2 version from 2.0.0-beta-2-SNAPSHOT to 2.0.0-beta-2	2018-03-01 15:46:38 -08:00
Michael Stack	44544c7db0	HBASE-20069 fix existing findbugs errors in hbase-server	2018-02-26 10:55:53 -08:00
Michael Stack	8b3ae58e18	HBASE-20043 ITBLL fails against hadoop3 Fix MoveRandomRegionOfTableAction. It depended on old AM behavior. Make it do explicit move as is required in AMv3; w/o it, it was just closing region causing test to fail. Fix pom so hadoop3 profile specifies a different netty3 version. Bunch of logging format change that came of trying trying to read the spew from this test.	2018-02-24 17:29:24 -08:00
Michael Stack	9be0360c5d	HBASE-20024 TestMergeTableRegionsProcedure is STILL flakey	2018-02-20 11:07:36 -08:00
zhangduo	c1fe9f441c	HBASE-19978 The keepalive logic is incomplete in ProcedureExecutor	2018-02-19 17:13:16 -08:00
Michael Stack	8f1e01b6e5	HBASE-19951 Cleanup the explicit timeout value for test method	2018-02-07 16:39:54 -08:00
Michael Stack	bac4687345	HBASE-19919 Tidying up logging	2018-02-02 22:42:30 -08:00
Michael Stack	90a75fb052	HBASE-19888 Move branch-2 version from 2.0.0-beta-1 to 2.0.0-beta-2-SNAPSHOT	2018-01-29 14:17:54 -08:00
Duo Zhang	bbf3bae72a	HBASE-19873 Add a CategoryBasedTimeout ClassRule for all UTs	2018-01-29 12:41:14 -08:00
Thiruvel Thirumoolan	c9950b5a79	HBASE-19756 Master NPE during completed failed proc eviction Signed-off-by: Andrew Purtell <apurtell@apache.org>	2018-01-24 16:43:08 -08:00
Michael Stack	86ecc963e4	HBASE-19828 Flakey TestRegionsOnMasterOptions.testRegionsOnAllServers Rename the PE Worker threads. Send an interrupt if worker taking a long time to go down (it may be RPC'ing out to a dead server, retrying so interrupt). Also join on the ProcedureExecutor shutting down. This will make problems shutting down more obvious. Disable TestRegionsOnMasterOptions. Master carrying Regions is broke.	2018-01-19 21:54:44 -08:00
Michael Stack	7225899e01	HBASE-19527 Make ExecutorService threads daemon=true Set the ProcedureExcecutor worker threads as daemon. Ditto for the timeout thread. Remove hack from TestRegionsOnMasterOptions that was put in place because the test would not go down.	2018-01-18 11:30:46 -08:00
Michael Stack	af2d890055	Revert "HBASE-19527 Make ExecutorService threads daemon=true" Applied prematurely. Revert. This reverts commit `5e4ed33fa2`.	2018-01-17 15:08:42 -08:00
Michael Stack	5e4ed33fa2	HBASE-19527 Make ExecutorService threads daemon=true Set the ProcedureExcecutor worker threads as daemon. Ditto for the timeout thread. Remove hack from TestRegionsOnMasterOptions that was put in place because the test would not go down.	2018-01-17 13:41:38 -08:00
Peter Somogyi	0561312bc4	HBASE-19809 Fix findbugs and error-prone warnings in hbase-procedure (branch-2)	2018-01-17 11:24:22 -08:00
Mike Drob	64cb777a8a	HBASE-19552 find-and-replace thirdparty offset	2017-12-28 12:01:25 -06:00
Michael Stack	d6d8369655	HBASE-19648 Move branch-2 version from 2.0.0-beta-1-SNAPSHOT to 2.0.0-beta-1	2017-12-27 14:41:19 -08:00
Chia-Ping Tsai	7dee1bcd31	HBASE-19644 add the checkstyle rule to reject the illegal imports	2017-12-28 04:17:45 +08:00
Peter Somogyi	bf998077b9	HBASE-19578 MasterProcWALs cleaning is incorrect Signed-off-by: tedyu <yuzhihong@gmail.com>	2017-12-21 09:39:31 -08:00
Balazs Meszaros	992b5d8630	HBASE-10092 Move up on to log4j2 Changes: - replaced commons-logging to slf4j everywhere - log.XXX(Throwable) calls were replaced with log.XXX(t.toString(), t) - log.XXX(Object) calls were replaced with log.XXX(Objects.toString(obj)) - log.fatal() calls were replaced with log.error(HBaseMarkers.FATAL, ...) - programmatic log4j configuration was removed from the unit test This commit does not affect the current logging configurations, because log4j is still on the classpath. slf4j-log4j12 binds log4j to slf4j. Signed-off-by: Michael Stack <stack@apache.org>	2017-12-20 22:58:12 -08:00
Michael Stack	13d9e8088c	HBASE-19218 Master stuck thinking hbase:namespace is assigned after restart preventing intialization Signed-off-by: Li Xiang <easyliangjob@gmail.com>	2017-12-20 21:47:50 -08:00
Guanghao Zhang	38d9125cc6	HBASE-19563 A few hbase-procedure classes missing @InterfaceAudience annotation	2017-12-20 09:33:33 -08:00
Mike Drob	23a9059cb2	HBASE-18838 Fix hadoop3 check-shaded-invariants	2017-12-15 13:20:54 -06:00
Michael Stack	672c440b9f	HBASE-18946 Stochastic load balancer assigns replica regions to the same RS Added new bulk assign createRoundRobinAssignProcedure to complement the existing createAssignProcedure. The former asks the balancer for target servers to set into the created AssignProcedures. The latter sets no target server into AssignProcedure. When no target server is specified, we make effort at assign-time at trying to deploy the region to its old location if there was one. The new round robin assign procedure creator does not do this. Use the new round robin method on table create or reenabling offline regions. Use the old assign in ServerCrashProcedure or in EnableTable so there is a chance we retain locality. Bulk preassigning passing all to-be-assigned to the balancer in one go is good for ensuring good distribution especially when read replicas in the mix. The old assign was single-assign scoped so region replicas could end up on the same server. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java Cleanup around forceNewPlan. Was confusing. Added a Comparator to sort AssignProcedures so meta and system tables come ahead of user-space tables. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java Remove the forceNewPlan argument on createAssignProcedure. Didn't make sense given we were creating a new AssignProcedure; the arg had no effect. (createRoundRobinAssignProcedures) Recast to feed all regions to the balancer in bulk and to sort the return so meta and system tables take precedence. Miscellaneous fixes including keeping the Master around until all RegionServers are down, documentation on how assignment retention works, etc.	2017-12-15 08:54:35 -08:00
Mike Drob	2952cc7dea	HBASE-19289 Add flag to disable stream capability enforcement Signed-off-by: Josh Elser <elserj@apache.org>	2017-12-14 12:19:59 -06:00
Apekshit Sharma	e8ba7b2320	HBASE-19457 Debugging flaky TestTruncateTableProcedure - Adds debug logging for future ease - Removes 60s timeout since testRecoveryAndDoubleExecutionPreserveSplits is only halfway after a minute. - Adds some comments - Logging change: Some places report "regionState=" while others just "state=". State machine procs also have "state=" in their logs. Let me change all region related logging to "regionState=" so that 1) it's consistent everywhere, 2) more filtered results when searching through logs.	2017-12-08 17:25:44 -08:00
Apekshit Sharma	4f4aac77e1	HBASE-19367 Refactoring in RegionStates, and RSProcedureDispatcher - Adding javadoc comments - Bug: ServerStateNode#regions is HashSet but there's no synchronization to prevent concurrent addRegion/removeRegion. Let's use concurrent set instead. - Use getRegionsInTransitionCount() directly to avoid instead of getRegionsInTransition().size() because the latter copies everything into a new array - what a waste for just the size. - There's mixed use of getRegionNode and getRegionStateNode for same return type - RegionStateNode. Changing everything to getRegionStateNode. Similarly rename other RegionNode() fns to RegionStateNode(). - RegionStateNode#transitionState() return value is useless since it always returns it's first param. - Other minor improvements	2017-11-29 22:42:39 -08:00
Apekshit Sharma	96e63ac7b8	HBASE-19319 Fix bug in synchronizing over ProcedureEvent Also moves event related functions (wake/wait/suspend) from ProcedureScheduler to ProcedureEvent class	2017-11-27 12:06:07 -08:00
Peter Somogyi	bcd367e293	HBASE-19315 Incorrect snapshot version is used for 2.0.0-beta-1 Signed-off-by: Michael Stack <stack@apache.org>	2017-11-21 10:41:50 -08:00
Tamas Penzes	7a69ebc73e	HBASE-18601: Update Htrace to 4.2 Updated HTrace version to 4.2 Created TraceUtil class to wrap htrace methods. Uses try with resources. Signed-off-by: Balazs Meszaros <balazs.meszaros@cloudera.com> Signed-off-by: Michael Stack <stack@apache.org>	2017-11-13 10:38:36 -08:00
Michael Stack	f13cf56f1c	HBASE-19197 Move version on branch-2 from 2.0.0-alpha4 to 2.0.0-beta-1.SNAPSHOT	2017-11-06 20:46:38 -08:00
Mike Drob	33ae6dce42	HBASE-18983 fixes from update error-prone to 2.1.1	2017-11-04 21:29:48 -05:00
Sean Busbey	a9f0c5d4e2	HBASE-18784 if available, query underlying outputstream capabilities where we need hflush/hsync. * pull things that don't rely on HDFS in hbase-server/FSUtils into hbase-common/CommonFSUtils * refactor setStoragePolicy so that it can move into hbase-common/CommonFSUtils, as a side effect update it for Hadoop 2.8,3.0+ * refactor WALProcedureStore so that it handles its own FS interactions * add a reflection-based lookup of stream capabilities * call said lookup in places where we make WALs to make sure hflush/hsync is available. * javadoc / checkstyle cleanup on changes as flagged by yetus Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>	2017-11-02 21:54:16 -05:00
Sean Busbey	35094bf4d5	HBASE-18933 set version number to 2.0.0-alpha4-SNAPSHOT following release of alpha3	2017-10-04 07:57:49 -05:00
Michael Stack	7660f9e86a	HBASE-18819 Set version number to 2.0.0-alpha3 from 2.0.0-alpha3-SNAPSHOT	2017-09-14 12:38:46 -07:00
Sean Busbey	d576e5a32d	HBASE-17823 Migrate to Apache Yetus Audience Annotations Includes partial backport of hbase-build-configuration module Signed-off-by: Michael Stack <stack@apache.org> Signed-off-by: Misty Stanley-Jones <misty@apache.org>	2017-09-12 23:15:50 -05:00
Balazs Meszaros	c48dc02b76	HBASE-18106 Redo ProcedureInfo and LockInfo Main changes: - ProcedureInfo and LockInfo were removed, we use JSON instead of them - Procedure and LockedResource are their server side equivalent - Procedure protobuf state_data became obsolate, it is only kept for reading previously written WAL - Procedure protobuf contains a state_message field, which stores the internal state messages (Any type instead of bytes) - Procedure.serializeStateData and deserializeStateData were changed slightly - Procedures internal states are available on client side - Procedures are displayed on web UI and in shell in the following jruby format: { ID => '1', PARENT_ID = '-1', PARAMETERS => [ ..extra state information.. ] } Signed-off-by: Michael Stack <stack@apache.org>	2017-09-08 11:56:28 -07:00
Peter Somogyi	33711fd481	HBASE-18704 Upgrade hbase to commons-collections 4 Upgrade commons-collections:3.2.2 to commons-collections4:4.1 Add missing dependency for hbase-procedure, hbase-thrift Replace CircularFifoBuffer with CircularFifoQueue in WALProcedureStore and TaskMonitor Signed-off-by: Sean Busbey <busbey@apache.org> Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com> (cherry picked from commit `137b105c67`)	2017-09-07 10:39:13 -05:00

293 Commits