Commit Graph

358 Commits

Author SHA1 Message Date
Nick Dimiduk 714a6f53d8 HBASE-24662 Update DumpClusterStatusAction to notice changes in region server count
Sometimes running chaos monkey, I've found that we lose accounting of
region servers. I've taken to a manual process of checking the
reported list against a known reference. It occurs to me that
ChaosMonkey has a known reference, and it can do this accounting for
me.

Signed-off-by: Viraj Jasani <vjasani@apache.org>
2020-08-06 09:18:51 -07:00
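The accounting described in HBASE-24662 amounts to comparing the reported server list against a known reference. A minimal sketch of that comparison follows; the class and method names are hypothetical and are not taken from `DumpClusterStatusAction`.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not the actual HBase code: compares a known
// reference set of region servers against the currently reported set.
class ServerAccountingSketch {
  // Servers in the reference that the cluster no longer reports.
  static Set<String> missing(Set<String> reference, Set<String> reported) {
    Set<String> diff = new TreeSet<>(reference);
    diff.removeAll(reported);
    return diff;
  }

  // Servers the cluster reports that the reference does not know about.
  static Set<String> unexpected(Set<String> reference, Set<String> reported) {
    Set<String> diff = new TreeSet<>(reported);
    diff.removeAll(reference);
    return diff;
  }
}
```

An action could log both sets on every invocation, so a shrinking server count shows up in the log instead of requiring a manual check.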
Nick Dimiduk 1f0abf8279 HBASE-24295 [Chaos Monkey] abstract logging through the class hierarchy ; ADDENDUM
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
2020-08-03 12:57:11 -07:00
Nick Dimiduk 8d1228ece7 HBASE-24295 [Chaos Monkey] abstract logging through the class hierarchy
Adds `protected abstract Logger getLogger()` to `Action` so that
implementations' names are logged when actions are performed.

Signed-off-by: stack <stack@apache.org>
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
2020-08-03 12:57:11 -07:00
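The pattern named in HBASE-24295 (`protected abstract Logger getLogger()` on `Action`) can be sketched as below. This is an illustration only: it uses `java.util.logging` to stay self-contained, which is an assumption about nothing more than this sketch, and the class names `SketchAction` and `RestartSketchAction` are hypothetical.

```java
import java.util.logging.Logger;

// Hypothetical stand-in for the chaos monkey base class.
abstract class SketchAction {
  // Each subclass supplies its own logger, so log lines carry the
  // concrete class name rather than the base class name.
  protected abstract Logger getLogger();

  public void perform() {
    getLogger().info("Performing action");
  }
}

// Hypothetical concrete action.
class RestartSketchAction extends SketchAction {
  private static final Logger LOG =
      Logger.getLogger(RestartSketchAction.class.getName());

  @Override
  protected Logger getLogger() {
    return LOG;
  }
}
```

Because the base class never holds a logger of its own, shared code such as `perform()` logs under the name of whichever subclass is running.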
Nick Dimiduk 46f6d46b64 HBASE-24658 Update PolicyBasedChaosMonkey to handle uncaught exceptions
Running `ServerKillingChaosMonkey` via `RESTApiClusterManager` for any
duration of time slowly leaks region servers. I see failures on the
RESTApi side go unreported on the ChaosMonkey side. It seems like
`RuntimeException`s are being thrown and lost.

`PolicyBasedChaosMonkey` uses a primitive means of thread management
anyway. Update to use a thread pool, thread groups, and an
uncaughtExceptionHandler.

Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2020-07-21 09:59:39 -07:00
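A minimal sketch of the approach HBASE-24658 describes (a thread pool, a thread group, and an uncaught exception handler), using only the JDK. The class name `MonkeyPoolSketch` and its members are hypothetical and do not reflect the actual `PolicyBasedChaosMonkey` code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: runs policies on pooled threads and records any
// exception that escapes a policy instead of losing it silently.
class MonkeyPoolSketch {
  final AtomicReference<Throwable> lastFailure = new AtomicReference<>();
  private final CountDownLatch failureSeen = new CountDownLatch(1);
  private final ExecutorService pool;

  MonkeyPoolSketch(int threads) {
    ThreadGroup group = new ThreadGroup("ChaosMonkey");
    AtomicInteger counter = new AtomicInteger();
    ThreadFactory factory = runnable -> {
      Thread t = new Thread(group, runnable, "ChaosMonkey-" + counter.incrementAndGet());
      t.setDaemon(true);
      t.setUncaughtExceptionHandler((thread, error) -> {
        lastFailure.set(error);
        failureSeen.countDown();
      });
      return t;
    };
    pool = Executors.newFixedThreadPool(threads, factory);
  }

  // execute(), not submit(): submit() stores the exception in the returned
  // Future, so the uncaught exception handler would never be invoked.
  void run(Runnable policy) {
    pool.execute(policy);
  }

  // Waits up to timeoutMillis for a failure; returns true if one was seen.
  boolean awaitFailure(long timeoutMillis) {
    try {
      return failureSeen.await(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  void shutdown() {
    pool.shutdown();
  }
}
```

The choice of `execute()` matters here: a `RuntimeException` thrown inside a task passed to `submit()` only surfaces when someone calls `Future.get()`, which is one way exceptions end up "thrown and lost".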
Nick Dimiduk 0224dccdb8 HBASE-24360 RollingBatchRestartRsAction loses track of dead servers
`RollingBatchRestartRsAction` doesn't handle failure cases when
tracking its list of dead servers. The original author believed that a
failure to restart would result in a retry. However, by removing the
dead server from the failed list, that state is lost, and retry never
occurs. Because this action doesn't ever look back to the current
state of the cluster, relying only on its local state for the current
action invocation, it never realizes the abandoned server is still
dead. Instead, be more careful to only remove the dead server from the
list when the `startRs` invocation claims to have been successful.

Signed-off-by: stack <stack@apache.org>
(cherry picked from commit 0dae377f53)
2020-06-22 10:29:09 -07:00
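The fix described in HBASE-24360 comes down to one rule: drop a server from the dead list only after the start call reports success. A minimal sketch of that rule follows; the class name, the method, and the `startRs` predicate are hypothetical stand-ins, not the actual `RollingBatchRestartRsAction` code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of dead-server tracking.
class RestartTrackerSketch {
  // Tries to start every dead server, making up to maxAttempts passes.
  // Returns the servers that are still dead afterwards.
  static List<String> restartAll(List<String> deadServers,
                                 Predicate<String> startRs,
                                 int maxAttempts) {
    List<String> dead = new ArrayList<>(deadServers);
    for (int attempt = 0; attempt < maxAttempts && !dead.isEmpty(); attempt++) {
      Iterator<String> it = dead.iterator();
      while (it.hasNext()) {
        String server = it.next();
        boolean started;
        try {
          started = startRs.test(server);
        } catch (RuntimeException e) {
          started = false; // a thrown failure counts as "still dead"
        }
        if (started) {
          it.remove(); // only a confirmed start clears the entry
        }
      }
    }
    return dead;
  }
}
```

A server whose start fails stays in the list, so a later pass (or a later invocation given the returned list) retries it rather than abandoning it.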
BukrosSzabolcs 186373bea4 HBASE-22982: region server suspend/resume (#592)
* Add chaos monkey action for suspend/resume region servers
* Add these to relevant chaos monkeys

branch-1-backport-note: Graceful regionserver restart action wasn't
backported due to a dependency of "RegionMover" script. Can be done
later if needed.

Signed-off-by: Balazs Meszaros <meszibalu@apache.org>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
2020-06-22 10:29:09 -07:00
Bharath Vissapragada 4e4d11cb19 HBASE-24260 Add a ClusterManager that issues commands via coprocessor (#1853)
Implements `ClusterManager` that relies on the new
`ShellExecEndpointCoprocessor` for remote shell command execution.

Signed-off-by: Bharath Vissapragada <bharathv@apache.org>

Co-authored-by: Nick Dimiduk <ndimiduk@apache.org>
2020-06-04 22:08:16 -07:00
Lokesh Khurana f80c4c24b6 HBASE-24193 : BackPort HBASE-18651 to branch-1 (#1521)
Signed-off-by: Reid Chan <reidchan@apache.org>
2020-04-16 10:13:20 +08:00
Andrew Purtell b9c676cdc0 Set version on branch-1 to 1.7.0-SNAPSHOT 2020-02-14 11:31:32 -08:00
Andrew Purtell 5ec5a5b115 Update POMs and CHANGES.txt for 1.6.0RC0 2020-02-14 11:30:22 -08:00
Peter Somogyi 907184dfa0 HBASE-23675 Move to Apache parent POM version 22 (#1023)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Sean Busbey <busbey@apache.org>
2020-01-15 10:10:23 +01:00
Nick Dimiduk 80c3581dbd HBASE-23552 Format Javadocs on ITBLL
We have this nice description in the java doc on ITBLL but it's
unformatted and thus illegible. Add some formatting so that it can be
read by humans.

Signed-off-by: Jan Hentschel <janh@apache.org>
Signed-off-by: Josh Elser <elserj@apache.org>
2019-12-10 13:13:10 -08:00
ravowlga123 abf6ec0d73 HBASE-18439 Subclasses of o.a.h.h.chaos.actions.Action all use the same logger
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
Signed-off-by: Guangxu Cheng <gxcheng@apache.org>
2019-11-08 20:53:55 +01:00
Sean Busbey 4bcc397f3e HBASE-23229 Update branch-1 to 1.6.0-SNAPSHOT (#772)
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
2019-10-30 09:22:39 -05:00
Andrew Purtell 214d33e0f4 Renumber branch to 1.5.1-SNAPSHOT 2019-10-12 13:21:18 -07:00
Andrew Purtell 3966d0fee6 Update POMs and CHANGES.txt for 1.5.0 RC4 2019-10-07 11:46:53 -07:00
Viraj Jasani 4b34d24f7a HBASE-22728 Jackson dependency cleanup and moving to Jackson2
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2019-08-15 18:32:36 -07:00
Andrew Purtell 98dc440462 HBASE-22449 https everywhere in Maven metadata (#247) 2019-05-21 12:44:02 -07:00
Andrew Purtell e2d48f41c5 Set version on branch back to 1.5.0-SNAPSHOT 2019-05-20 13:02:40 -07:00
Andrew Purtell ce6a6014da Update POMs and CHANGES.txt for 1.5.0 RC0 2019-02-01 12:36:10 -08:00
Allan Yang 20759a3c7e HBASE-20870 Wrong HBase root dir in ITBLL's Search Tool
Conflicts:
	hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestingUtility.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java

Amending-Author: Andrew Purtell <apurtell@apache.org>
2019-01-23 16:12:09 -08:00
Andrew Purtell ebad3ab8ed HBASE-21256 Improve IntegrationTestBigLinkedList for testing huge data
Backport to branch-1

Signed-off-by: Duo Zhang <zhangduo@apache.org>
2018-10-15 10:53:34 +08:00
Monani Mihir 0298c06b4f HBASE-19036 Add action in Chaos Monkey to restart Active Namenode
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-08-02 05:04:27 -07:00
Andrew Purtell 9d9d5aec0e HBASE-20505 PE should support multi column family read and write cases 2018-05-07 18:39:18 -07:00
Guangxu Cheng 6f1dd258b1 HBASE-19483 Add proper privilege check for rsgroup commands
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-01-10 02:25:47 -08:00
Andrew Purtell 1fe75f98d3 HBASE-19421 branch-1 does not compile against Hadoop 3.0.0 2017-12-04 15:48:38 -08:00
libisthanks 41b4877950 HBASE-18090 Improve TableSnapshotInputFormat to allow more multiple mappers per region
Signed-off-by: Michael Stack <stack@apache.org>
2017-11-28 14:56:38 -08:00
Mike Drob dbcda15ae7 HBASE-19240 more error-prone results
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2017-11-13 18:23:10 -08:00
Andrew Purtell 6961526573 HBASE-12350 Backport error-prone build support to branch-1 and branch-2 2017-11-09 15:34:09 -08:00
Andrew Purtell 51b65707b3 HBASE-19173 Configure IntegrationTestRSGroup automatically for minicluster mode 2017-11-03 23:27:52 -07:00
Andrew Purtell 64328caef0 HBASE-15631 Backport Regionserver Groups (HBASE-6721) to branch-1 (Francis Liu and Andrew Purtell) 2017-10-23 17:10:33 -07:00
Sean Mackrory aec4bf6bae HBASE-15947 Classes used only for tests included in main code base
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2017-08-30 10:53:31 +08:00
Josh Elser fd43879985 HBASE-18679 Add a null check around the result of getCounters() in ITBLL 2017-08-25 18:54:01 -04:00
Josh Elser 00a4e0697a HBASE-18631 Allow ChaosMonkey properties to be specified in hbase-site 2017-08-20 15:07:35 -04:00
Balazs Meszaros 0053cb967b HBASE-18185 IntegrationTestTimeBoundedRequestsWithRegionReplicas unbalanced tests fails with AssertionError
unbalance.kill.meta.rs property was added which controls the monkey to
kill that region server which holds hbase:meta.

Change-Id: I2c871789645b6c1986104f5a16cc6b9badfbc172
Signed-off-by: Apekshit Sharma <appy@apache.org>
2017-07-27 15:02:51 -07:00
Andrew Purtell 3dd55fa0c0 Set versions on branch-1 to 1.5.0-SNAPSHOT 2017-07-03 18:01:15 -07:00
Nemo Chen 4030facc99 HBASE-16469 Several log refactoring/improvement suggestions
Signed-off-by: Sean Busbey <busbey@apache.org>
2017-04-11 14:28:47 -05:00
zhangduo 094e9a311b HBASE-16584 Backport the new ipc implementation in HBASE-16432 to branch-1 2017-03-16 23:00:30 +08:00
Andrew Purtell eb889f6a4b HBASE-17637 Update progress more frequently in IntegrationTestBigLinkedList.Generator.persist 2017-02-13 15:04:27 -08:00
Abhishek Singh Chouhan 807fcfd22f HBASE-17616 Incorrect actions performed by CM
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-02-09 12:35:22 -08:00
Josh Elser 2f35956eb8 HBASE-17171 Proactively catch the case when no time remains for random reads
The framework sets a configuration property to control how long reads
should be executed. When writes take too long, no time remains for reads
and the user sees an error about a property they must set. We should
prevent this case and log an appropriate message.

Also fixes a rogue character in the class-level javadoc.

Signed-off-by: Michael Stack <stack@apache.org>
2016-11-23 11:41:04 -08:00
Enis Soztutar bf0483c37c HBASE-17091 IntegrationTestZKAndFSPermissions failed with 'KeeperException' 2016-11-15 13:09:32 -08:00
Apekshit Sharma 9bc9f9b597 HBASE-17004 IntegrationTestManyRegions verifies that many regions get assigned within given time. To do so, it spawns a new thread and uses CountDownLatch.await() to timeout.
Replacing this mechanism with junit @ClassRule to timeout the test.
Also adds missing kdc deps in hbase-it/pom.xml

Change-Id: I00930c2f974b4215e3f82a0ec007d9ef3ebd7cdd
2016-11-04 11:48:01 -07:00
Apekshit Sharma 51ba7cfde3 HBASE-17006 Give name to existing threads.
Having thread names in logs and thread dumps greatly improves debuggability. This patch simply adds names to the threads we spawn.

Change-Id: I6ff22cc3804bb81147dde3a8e9ab671633c6f6ce
2016-11-03 18:31:03 -07:00
Sean Busbey a1536c2876 Revert "HBASE-16562 ITBLL should fail to start if misconfigured"
This reverts commit 38b946c276.

See discussion on JIRA.
2016-10-24 09:16:53 -05:00
Sean Busbey 65c2dd489f Revert "HBASE-16562 ITBLL should fail to start if misconfigured, addendum"
This reverts commit 6f73ef2dff.

See discussion on JIRA.
2016-10-24 09:16:40 -05:00
Stephen Yuan Jiang 42e7a4acd7 HBASE-16889 Proc-V2: verifyTables in the IntegrationTestDDLMasterFailover test after each table DDL is incorrect (Stephen Yuan Jiang) 2016-10-20 18:25:29 -07:00
Jerry He 92b1b5ac80 HBASE-16667 Building with JDK 8: ignoring option MaxPermSize=256m (Niels Basjes) 2016-09-24 16:29:41 -07:00
Jonathan M Hsieh 13d6acbc7f HBASE-12088 Remove unused hadoop-1.0, hadoop-1.1 profiles from non-root poms 2016-09-21 20:52:19 -07:00
chenheng 6f73ef2dff HBASE-16562 ITBLL should fail to start if misconfigured, addendum 2016-09-07 15:45:09 +08:00