Note: branch-1 has a design difference from other branches in the replication subsystem. HMaster does not coordinate replication actions in branch-1, so each RS is responsible for initializing peers and updating ZK state. As part of this change we update the ZK state of peers after reading from configuration, so if configuration diverges across RSes the result can be non-deterministic and the last RS RPC will win.
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Looped through the test 100 times and it passes. Without the patch it fails
every ~10 runs or so.
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
Sometimes when running ChaosMonkey, I've found that we lose track of
region servers. I've taken to manually checking the reported list
against a known reference. It occurs to me that ChaosMonkey has a
known reference, and it can do this accounting for me.
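
The accounting described above amounts to a set difference between the reference list and the reported list. A minimal sketch, assuming servers are identified by plain strings; the class and method names are hypothetical and are not taken from the patch:

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not the actual ChaosMonkey code.
class ServerAccountingSketch {
  // Returns the servers in the reference set that are absent from the reported set.
  static Set<String> missingServers(Set<String> reference, Set<String> reported) {
    Set<String> missing = new TreeSet<>(reference); // copy so the inputs are not modified
    missing.removeAll(reported);
    return missing;
  }
}
```

An empty result means every known region server is still accounted for.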
Signed-off-by: Viraj Jasani <vjasani@apache.org>
* refactor how we use connection to rely on the access method
* refactor initialization and cleanup of the shared connection
* incompatibly change HCTU's Configuration member variable to be final so it can be safely accessed from multiple threads.
Closes #2188
adapted for jdk7
Signed-off-by: Viraj Jasani <vjasani@apache.org>
(cherry picked from commit 86ebbdd8a2df89de37c2c3bd50e64292eaf28b11)
(cherry picked from commit 0806349adab338330428c900588234d7f6fcfcc2)
Adds `protected abstract Logger getLogger()` to `Action` so that
implementations' names are logged when actions are performed.
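
A rough sketch of the pattern, using `java.util.logging.Logger` as a stand-in for whichever logging API the branch uses. `ActionSketch`, `RestartActionSketch`, `perform`, and `doPerform` are hypothetical names; only the `getLogger()` signature comes from this change:

```java
import java.util.logging.Logger;

// Hypothetical stand-in for the Action base class.
abstract class ActionSketch {
  // Each subclass returns its own logger, so log lines carry the subclass name.
  protected abstract Logger getLogger();

  // Logs through the subclass's logger, then runs the subclass's work.
  final void perform() {
    getLogger().info("Performing action " + getClass().getSimpleName());
    doPerform();
  }

  protected abstract void doPerform();
}

// Hypothetical concrete action.
class RestartActionSketch extends ActionSketch {
  private static final Logger LOG =
      Logger.getLogger(RestartActionSketch.class.getName());

  @Override
  protected Logger getLogger() {
    return LOG;
  }

  @Override
  protected void doPerform() {
    // No-op in this sketch.
  }
}
```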
Signed-off-by: stack <stack@apache.org>
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
If hbase.rowlock.wait.duration is <= 0, log a message and treat it as a value of 1 ms.
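
The rule can be sketched as below. This is an illustration only: the class, method, and constant names are hypothetical, `java.util.logging` stands in for the real logger, and the log level and message text are assumptions.

```java
import java.util.logging.Logger;

// Hypothetical helper, not the actual region server code.
class RowLockWaitSketch {
  private static final Logger LOG =
      Logger.getLogger(RowLockWaitSketch.class.getName());
  static final int MIN_WAIT_MS = 1;

  // Returns the effective wait: non-positive configured values become 1 ms.
  static int effectiveRowLockWaitMs(int configuredMs) {
    if (configuredMs <= 0) {
      LOG.warning("hbase.rowlock.wait.duration is " + configuredMs
          + " ms, which is <= 0; using " + MIN_WAIT_MS + " ms instead");
      return MIN_WAIT_MS;
    }
    return configuredMs;
  }
}
```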
amended for branches-1
Signed-off-by: Viraj Jasani <vjasani@apache.org>
(cherry picked from commit 840a55761b4b3db6bfcf8ca5b7ae67509fc21566)
Rewrote the patch for branch-1 since master has significantly diverged.
(cherry picked from commit dc5ef7af1f8b9e386495a73924c9442203f65a77)
Co-authored-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Sandeep Pal <50725353+sandeepvinayak@users.noreply.github.com>
Co-authored-by: Sandeep Pal <50725353+sandeepvinayak@users.noreply.github.com>
We observed this delete call to be a bottleneck for tables with many
regions. This patch attempts to parallelize the deletes.
Signed-off-by: Andrew Purtell <apurtell@apache.org>
(cherry picked from commit f07f30ae24e6d0e9151bb76aebf78928ec9572e3)
Running `ServerKillingChaosMonkey` via `RESTApiClusterManager` for any
duration of time slowly leaks region servers. I see failures on the
RESTApi side go unreported on the ChaosMonkey side. It seems like
`RuntimeException`s are being thrown and lost.
`PolicyBasedChaosMonkey` uses a primitive means of thread management
anyway. Update to use a thread pool, thread groups, and an
uncaughtExceptionHandler.
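
For illustration, a pool whose worker threads share a thread group and report uncaught exceptions might look roughly like this. `MonkeyPoolSketch` and `newPool` are hypothetical names, and the fixed-size pool is an assumption rather than what the patch necessarily does:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not the actual PolicyBasedChaosMonkey code.
class MonkeyPoolSketch {
  // Builds a fixed pool whose threads live in one ThreadGroup and hand any
  // uncaught exception to the supplied handler instead of losing it.
  static ExecutorService newPool(String groupName, int threads,
      Thread.UncaughtExceptionHandler handler) {
    ThreadGroup group = new ThreadGroup(groupName);
    AtomicInteger seq = new AtomicInteger();
    ThreadFactory factory = r -> {
      Thread t = new Thread(group, r, groupName + "-" + seq.incrementAndGet());
      t.setDaemon(true);
      t.setUncaughtExceptionHandler(handler);
      return t;
    };
    return Executors.newFixedThreadPool(threads, factory);
  }
}
```

Note that the handler only sees exceptions from tasks started with `execute()`. With `submit()`, the exception is captured in the returned `Future` and surfaces only when `get()` is called.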
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
* Specify the version
* by using apt-get instead of pip install
* Remove comment blocks
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Lars Hofhansl <larsh@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>