516 Commits

Author SHA1 Message Date
Nick Dimiduk cf9e337c0f
HBASE-24361 Make `RESTApiClusterManager` more resilient (#1701)
* sometimes API calls return with null/empty response bodies. thus,
  wrap all API calls in a retry loop.
* calls that submit work in the form of "commands" now retrieve the
  commandId from successful command submission, and track completion
  of that command before returning control to calling context.
* model CM's process state and use that model to guide state
  transitions more intelligently. this guards against, for example,
  the start command failing with an error message like "Role must be
  stopped".
* improvements to logging levels, avoid spamming logs with the
  side-effects of retries at this and higher contexts.
* include references to API documentation, such as it is.

Signed-off-by: stack <stack@apache.org>
2020-05-19 09:43:55 -07:00
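The first bullet of HBASE-24361 above (retrying calls whose response body comes back null or empty) could be sketched roughly as follows. This is a minimal illustration, not the actual `RESTApiClusterManager` code: the class `RetrySketch`, the method `callWithRetry`, and its parameters are hypothetical names chosen for this example, and the real change may bound or log its retries differently.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch: names and signatures are illustrative, not HBase code.
final class RetrySketch {

  /**
   * Invokes {@code call} up to {@code maxAttempts} times, sleeping between
   * attempts, and returns the first body that is neither null nor empty.
   */
  static Optional<String> callWithRetry(Supplier<String> call, int maxAttempts, long sleepMillis) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      String body = call.get();
      if (body != null && !body.isEmpty()) {
        return Optional.of(body);
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(sleepMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt status
          break;
        }
      }
    }
    return Optional.empty(); // every attempt returned a null or empty body
  }

  public static void main(String[] args) {
    int[] calls = {0};
    // Fake API call (an assumption for the demo): empty twice, then a body.
    Optional<String> body = callWithRetry(() -> ++calls[0] < 3 ? "" : "{\"id\":42}", 5, 10L);
    System.out.println(body.orElse("<no response>") + " after " + calls[0] + " calls");
  }
}
```

Returning an `Optional` rather than throwing leaves the caller to decide how loudly to report an exhausted retry budget, which fits the commit's stated aim of not spamming logs with retry side effects.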
Nick Dimiduk ce9051ea91
HBASE-24360 RollingBatchRestartRsAction loses track of dead servers
`RollingBatchRestartRsAction` doesn't handle failure cases when
tracking its list of dead servers. The original author believed that a
failure to restart would result in a retry. However, by removing the
dead server from the failed list, that state is lost, and retry never
occurs. Because this action doesn't ever look back to the current
state of the cluster, relying only on its local state for the current
action invocation, it never realizes the abandoned server is still
dead. Instead, be more careful to only remove the dead server from the
list when the `startRs` invocation claims to have been successful.

Signed-off-by: stack <stack@apache.org>
2020-05-18 12:08:52 -07:00
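The fix described in HBASE-24360 above amounts to removing a server from the dead list only after a start attempt reports success. A minimal sketch of that idea follows. `DeadServerTrackingSketch`, `restartPass`, and the `startRs` predicate are hypothetical stand-ins for this example and do not reproduce the real `RollingBatchRestartRsAction` code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: not the actual RollingBatchRestartRsAction code.
final class DeadServerTrackingSketch {

  /**
   * Attempts to start each server in {@code deadServers}. A server leaves the
   * list only when {@code startRs} reports success; a false result or an
   * exception keeps it there so a later pass can retry it.
   */
  static void restartPass(List<String> deadServers, Predicate<String> startRs) {
    Iterator<String> it = deadServers.iterator();
    while (it.hasNext()) {
      String server = it.next();
      boolean started;
      try {
        started = startRs.test(server);
      } catch (RuntimeException e) {
        started = false; // a failed start means the server is still dead
      }
      if (started) {
        it.remove();
      }
    }
  }

  public static void main(String[] args) {
    List<String> dead = new ArrayList<>(Arrays.asList("rs1", "rs2", "rs3"));
    restartPass(dead, server -> !server.equals("rs2")); // rs2 fails to start
    System.out.println("still dead: " + dead);
  }
}
```

Because the list is the action's only record of which servers are down, a server removed before its restart is confirmed is lost for good. Removing it only on success keeps it eligible for a retry.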
meiyi 9b64ab029c HBASE-24364 [Chaos Monkey] Invalid data block encoding in ChangeEncodingAction (#1707)
Signed-off-by: Jan Hentschel <janh@apache.org>
2020-05-15 12:01:21 +08:00
Duo Zhang a5aa8d208e HBASE-24309 Avoid introducing log4j and slf4j-log4j dependencies for … (#1697)
Signed-off-by: stack <stack@apache.org>
2020-05-13 18:41:27 +08:00
Nick Dimiduk 6cf140850c HBASE-24295 [Chaos Monkey] abstract logging through the class hierarchy ; ADDENDUM
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
2020-05-07 13:27:29 -07:00
Michael Stack bee66bd22a HBASE-24284 [h3/jdk11] REST server won't start Exclude transitive includes of jax-rs 1.x and then explicitly include jax-rs 2.x glassfish impl for REST context when hadoop3. (#1625) 2020-05-05 15:28:31 -07:00
Nick Dimiduk f6ee3163bd HBASE-24295 [Chaos Monkey] abstract logging through the class hierarchy
Adds `protected abstract Logger getLogger()` to `Action` so that
implementations' names are logged when actions are performed.

Signed-off-by: stack <stack@apache.org>
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
2020-05-04 11:36:46 -07:00
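The pattern named in HBASE-24295 above, an abstract `getLogger()` on the base class, can be illustrated as below. This sketch swaps in `java.util.logging.Logger` so that it runs without extra dependencies; the logger type used in HBase itself is not shown in this log. The class names `ActionSketch`, `RestartSketchAction`, and `MoveSketchAction` are hypothetical.

```java
import java.util.logging.Logger;

// Hypothetical sketch; java.util.logging is an assumption made so the
// example is self-contained.
abstract class ActionSketch {

  /** Each concrete action supplies its own logger. */
  protected abstract Logger getLogger();

  /** Shared base-class code logs through the subclass's logger. */
  void perform() {
    getLogger().info("Performing action");
  }

  public static void main(String[] args) {
    new RestartSketchAction().perform();
    new MoveSketchAction().perform();
  }
}

final class RestartSketchAction extends ActionSketch {
  private static final Logger LOG = Logger.getLogger(RestartSketchAction.class.getName());

  @Override
  protected Logger getLogger() {
    return LOG;
  }
}

final class MoveSketchAction extends ActionSketch {
  private static final Logger LOG = Logger.getLogger(MoveSketchAction.class.getName());

  @Override
  protected Logger getLogger() {
    return LOG;
  }
}
```

With a single logger declared on the base class, every message would carry the base class's name. Routing through `getLogger()` lets shared code such as `perform()` log under the concrete action's name instead.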
Nick Dimiduk fa1890c880 HBASE-24260 Add a ClusterManager that issues commands via coprocessor
Implements `ClusterManager` that relies on the new
`ShellExecEndpointCoprocessor` for remote shell command execution.

Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
2020-05-04 10:50:42 -07:00
Nick Dimiduk bd27542a45 HBASE-24274 `RESTApiClusterManager` attempts to deserialize response using serialization API
Use the correct GSON API for deserializing service responses. Add
simple unit test covering a very limited selection of the overall API
surface area, just enough to ensure deserialization works.

Signed-off-by: stack <stack@apache.org>
2020-04-29 13:09:03 -07:00
Duo Zhang bf2350e25f HBASE-24249 Move code in FSHDFSUtils to FSUtils and mark related clas… (#1586)
Signed-off-by: stack <stack@apache.org>
2020-04-29 11:32:44 +08:00
BukrosSzabolcs f951913e24
HBASE-23891: Add an option to Actions to filter out meta RS
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
2020-03-17 15:02:33 +01:00
Nick Dimiduk 4f76e24755 Revert "HBASE-23891: Add an option to Actions to filter out meta RS (#1217)"
This reverts commit 7d8fa5c818.
2020-03-10 11:48:12 -07:00
BukrosSzabolcs 7d8fa5c818 HBASE-23891: Add an option to Actions to filter out meta RS (#1217)
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
(cherry picked from commit 4cb60327be)
2020-03-06 11:10:00 +00:00
BukrosSzabolcs f9abaee50c HBASE-23566: Fix package/packet terminology problem in chaos monkeys (#933)
s/package/packet/g

Signed-off-by: Sean Busbey <busbey@apache.org>
(cherry picked from commit 413d4b2d0f)
2019-12-12 16:34:31 -06:00
Nick Dimiduk 391be59835 HBASE-23552 Format Javadocs on ITBLL
We have this nice description in the Javadoc on ITBLL but it's
unformatted and thus illegible. Add some formatting so that it can be
read by humans.

Signed-off-by: Jan Hentschel <janh@apache.org>
Signed-off-by: Josh Elser <elserj@apache.org>
2019-12-10 13:12:54 -08:00
BukrosSzabolcs 014a40b678 HBASE-23352: Allow chaos monkeys to access cmd line params, and improve FillDiskCommandAction (#885)
Instead of using the default properties when checking for monkey
properties, now we use the ones already extended with command line
params.
Change FillDiskCommandAction to try to stop the remote process if the
command failed with an exception.

Signed-off-by: stack <stack@apache.org>
2019-12-02 10:33:56 +08:00
Peter Somogyi ce34d895e5 HBASE-23085 Network and Data related Actions; ADDENDUM (#871)
Fix percentage in String.format

Signed-off-by: Sean Busbey <busbey@apache.org>
2019-11-22 22:15:27 -06:00
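The addendum above does not show the change itself. One common `String.format` pitfall involving percentages is that a literal percent sign has to be written as `%%`, and the sketch below shows that escape. Whether this is the exact issue the addendum fixed is an assumption. The class name, method name, and message text are hypothetical.

```java
// Hypothetical sketch: the message text is invented for illustration.
final class PercentFormatSketch {

  /** Formats a percentage, escaping the literal percent sign as "%%". */
  static String describe(int percent) {
    // "%d" consumes the argument; "%%" emits a single literal '%'.
    return String.format("fill disk to %d%%", percent);
  }

  public static void main(String[] args) {
    System.out.println(describe(80));
  }
}
```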
BukrosSzabolcs 54be3d1d86 HBASE-23085 Network and Data related Actions
Add monkey actions:
- manipulate network packets with tc (reorder, lose, ...)
- add CPU load
- fill the disk
- corrupt or delete regionserver data files

Extend HBaseClusterManager to allow sudo calls.

Signed-off-by: Josh Elser <elserj@apache.org>
Signed-off-by: Balazs Meszaros <meszibalu@apache.org>
2019-11-19 10:15:35 +01:00
ravowlga123 5dfa58b017
HBASE-18439 Subclasses of o.a.h.h.chaos.actions.Action all use the same logger
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
Signed-off-by: Guangxu Cheng <gxcheng@apache.org>
2019-11-08 20:26:43 +01:00
meiyi d841245115 HBASE-23170 Admin#getRegionServers use ClusterMetrics.Option.SERVERS_NAME (#721) 2019-10-18 10:09:42 +08:00
BukrosSzabolcs cd9367512a HBASE-22982: region server suspend/resume and graceful rolling restart actions (#592)
* Add chaos monkey action for suspend/resume region servers
* Add chaos monkey action for graceful rolling restart
* Add these to relevant chaos monkeys

Signed-off-by: Balazs Meszaros <meszibalu@apache.org>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
2019-09-26 11:56:17 +02:00
Balazs Meszaros 14634411da HBASE-15666 shaded dependencies for hbase-testing-util
Added new artifact hbase-shaded-testing-util. It wraps a whole hbase-server
with its testing dependencies. Users should use only the following dependency
in pom:

<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-shaded-testing-util</artifactId>
  <version>${hbase.version}</version>
  <scope>test</scope>
</dependency>

Added hbase-shaded-testing-util-tester maven module which ensures
that hbase-shaded-testing-util works with a shaded client.

Signed-off-by: Josh Elser <elserj@apache.org>
2019-08-07 07:43:44 -07:00
Guanghao 78f704796e HBASE-22624 Should sanity check table configuration when clone snapshot to a new table 2019-07-03 18:28:48 +08:00
Andrew Purtell 2c55bd9344
HBASE-22449 https everywhere in Maven metadata (#247) 2019-05-21 12:38:42 -07:00
Sean Busbey 4862a596ef HBASE-22083 move eclipse settings into a profile.
Signed-off-by: stack <stack@apache.org>

 Conflicts:
	hbase-backup/pom.xml
	hbase-hadoop-compat/pom.xml
	hbase-protocol/pom.xml
2019-04-25 14:38:38 -05:00
李小保 1bd5b5cf7b HBASE-22250 The same constants used in many places should be placed in constant classes
Signed-off-by: stack <stack@apache.org>
2019-04-23 21:24:46 -07:00
Jan Hentschel c40e6e2339 HBASE-22231 Removed unused and '*' import 2019-04-23 12:53:52 +02:00
zhanggangxue 9a0daa8cbd HBASE-21257 misspelled words.[occured -> occurred] 2019-04-14 21:36:24 +08:00
zhangduo b04b1ecc74 HBASE-22108 Avoid passing null in Admin methods
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2019-04-07 21:08:55 +08:00
Vladimir Rodionov ae4bfabeaa HBASE-21688 Address WAL filesystem issues
Amending-Author: Josh Elser <elserj@apache.org>
Signed-off-by: Josh Elser <elserj@apache.org>
2019-04-03 13:55:48 -04:00
stack 939a29b41e HBASE-22052 pom cleaning; filter out jersey-core in hadoop2 to match hadoop3 and remove redundant version specifications
This is a reapply of a reverted commit. This commit includes the
HBASE-22059 amendment and subsequent amendments to HBASE-22052.
See HBASE-22052 for the full story.

jersey-core is problematic. It was transitively included from hadoop
and polluting our CLASSPATH with an implementation of a 1.x version
of the javax.ws.rs.core.Response Interface from jsr311-api when we
want the javax.ws.rs-api 2.x version.

    M hbase-endpoint/pom.xml
    M hbase-http/pom.xml
    M hbase-mapreduce/pom.xml
    M hbase-rest/pom.xml
    M hbase-server/pom.xml
    M hbase-zookeeper/pom.xml
     Remove redundant version specification (and the odd property define
     done already up in parent pom).
    M hbase-it/pom.xml
    M hbase-rest/pom.xml
     Exclude jersey-core explicitly.

    M hbase-procedure/pom.xml
     Remove redundant version and classifier.

    M pom.xml
     Add jersey-core exclusions to all dependencies that pull it in
     except hadoop-minicluster. mr tests fail w/o the jersey-core
     so let it in for minicluster and then in modules, exclude it
     where it causes damage as in hbase-it.
2019-03-25 09:28:39 -04:00
Guanghao Zhang 607ac735c4 HBASE-21922 BloomContext#sanityCheck may fail when using the ROWPREFIX_DELIMITED bloom filter 2019-02-23 23:29:53 +08:00
Duo Zhang 761aef6d9d HBASE-20587 Replace Jackson with shaded thirdparty gson
Signed-off-by: Michael Stack <stack@apache.org>
2019-02-22 16:40:45 +08:00
Sean Busbey e8767ea495 HBASE-21808 Ensure we can build with JDK11 targeting JDK8
Signed-off-by: Josh Elser <elserj@apache.org>
(cherry picked from commit 5784a09fff)
2019-02-01 16:28:39 -06:00
Guanghao Zhang 16665b6e93 HBASE-21799 Update branch-2 version to 2.3.0-SNAPSHOT 2019-01-29 21:53:21 +08:00
Duo Zhang 4e792414f6 HBASE-21731 Do not need to use ClusterConnection in IntegrationTestBigLinkedListWithVisibility
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
2019-01-16 20:59:37 +08:00
Duo Zhang 9ec84c235f HBASE-21704 The implementation of DistributedHBaseCluster.getServerHoldingRegion is incorrect 2019-01-11 21:20:50 +08:00
Zephyr Guo 2b1716fd8e HBASE-21256 Improve IntegrationTestBigLinkedList for testing huge data
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-10-12 11:00:03 +08:00
Guangxu Cheng fd68e7593e
HBASE-20636 Introduce two bloom filter types: ROWPREFIX and ROWPREFIX_DELIMITED
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Amending-Author: Andrew Purtell <apurtell@apache.org>
2018-09-21 16:06:34 -07:00
Monani Mihir 06a92a3d20 HBASE-19036 Add action in Chaos Monkey to restart Active Namenode
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-08-02 05:00:16 -07:00
Allan Yang 1a6fae74b5 HBASE-20870 Wrong HBase root dir in ITBLL's Search Tool 2018-07-20 11:22:03 +08:00
Sahil Aggarwal e61507b9a0
HBASE-19164: Remove UUID.randomUUID in tests.
Signed-off-by: Mike Drob <mdrob@apache.org>
2018-06-27 10:36:48 -05:00
zhangduo dde042cc93 HBASE-20776 Update branch-2 version to 2.2.0-SNAPSHOT 2018-06-22 22:15:18 +08:00
Sean Busbey ee84a8f243 HBASE-20332 shaded mapreduce module shouldn't include hadoop
* modify the jar checking script to take args; make hadoop stuff optional
* separate out checking the artifacts that have hadoop vs those that don't.
* * Unfortunately means we need two modules for checking things
* * put in a safety check that the support script for checking jar contents is maintained in both modules
* * have to carve out an exception for o.a.hadoop.metrics2. :(
* fix duplicated class warning
* clean up dependencies in hbase-server and some modules that depend on it.
* allow Hadoop to have its own htrace where it needs it
* add a precommit check to make sure we're not using old htrace imports

 Conflicts:
	hbase-backup/pom.xml
	hbase-checkstyle/src/main/resources/hbase/checkstyle-suppressions.xml

Signed-off-by: Mike Drob <mdrob@apache.org>
2018-06-18 14:02:48 -07:00
Mike Drob b04c976fe6 HBASE-20478 Update checkstyle to v8.2
Cannot go to latest (8.9) yet due to
  https://github.com/checkstyle/checkstyle/issues/5279

* move hbaseanti import checks to checkstyle
* implement a few missing equals checks, and ignore one
* fix lots of javadoc errors

Signed-off-by: Sean Busbey <busbey@apache.org>
2018-06-18 14:02:40 -07:00
maoling 4c95b82b61 HBASE-19761:Fix Checkstyle errors in hbase-zookeeper
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
2018-06-02 10:17:27 +02:00
Josh Elser c3d82a283d HBASE-20223 Update to hbase-thirdparty 2.1.0
Remove commons-cli and commons-collections4 use. Account
for the newer internal protobuf version of 3.5.1.

Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Mike Drob <mdrob@apache.org>
2018-03-26 16:07:39 -04:00
Chia-Ping Tsai dd9e46bbf5 HBASE-20212 Make all Public classes have InterfaceAudience category
Signed-off-by: tedyu <yuzhihong@gmail.com>
Signed-off-by: Michael Stack <stack@apache.org>
2018-03-22 18:09:54 +08:00
Chia-Ping Tsai 95596e8ba7 HBASE-20119 Introduce a pojo class to carry coprocessor information in order to make TableDescriptorBuilder accept multiple cp at once
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Michael Stack <stack@apache.org>
2018-03-16 01:26:08 +08:00
Michael Stack 260ee0da60 HBASE-20173 [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
Allow that DisableTableProcedure can grab a region lock before
ServerCrashProcedure can. Cater to this circumstance, where SCP
was unable to make progress, by running the search for RIT
against the crashed server a second time, post creation of all
crashed-server assignments. The second run will uncover the likes of
the above DisableTableProcedure unassign and will interrupt its
suspend, allowing both procedures to make progress.

M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
 Add new procedure step post-assigns that reruns the RIT finder method.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 Make this important log more specific as to what is going on.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
 Better explanation as to what is going on.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
 Add extra step and run handleRIT a second time after we've queued up
 all SCP assigns. Also fix a bug. SCP was adding an assign of a RIT
 that was actually trying to unassign (made the deadlock more likely).
2018-03-13 05:44:43 -07:00