Adds region state check on hbck2 assigns/unassigns. Returns pid of -1
if in inappropriate state with logging explaination which suggests
passing override if operator wants to assign/unassign anyways. Here
is an example of what happens now if hbck2 tries an unassign and
Region already unassigned:
2020-08-19 11:22:06,926 INFO [RpcServer.default.FPBQ.Fifo.handler=1,queue=0,port=50086] assignment.AssignmentManager(820): Failed {ENCODED => d1112e553991e938b6852f87774c91ee, NAME => 'TestHbck,zzzzz,1597861310769.d1112e553991e938b6852f87774c91ee.', STARTKEY => 'zzzzz', ENDKEY => ''} unassign, override=false; set override to by-pass state checks.
org.apache.hadoop.hbase.client.DoNotRetryRegionException: Unexpected state for state=CLOSED, location=null, table=TestHbck, region=d1112e553991e938b6852f87774c91ee
at org.apache.hadoop.hbase.master.assignment.AssignmentManager.preTransitCheck(AssignmentManager.java:583)
at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createOneUnassignProcedure(AssignmentManager.java:812)
at org.apache.hadoop.hbase.master.MasterRpcServices.unassigns(MasterRpcServices.java:2616)
at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$HbckService$2.callBlockingMethod(MasterProtos.java)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:397)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
Previous it would just create the unassign anyways. Now must pass override
to queue the procedure regardless. Safer.
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
javadoc on assigns/unassigns. Minor refactor in assigns/unassigns to cater to
case where procedure may come back null (if override not set and fails state checks).
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
checkstyle cleanups.
Clarifying javadoc on how there is no state checking when bulk assigns creating/enabling
tables.
createOneAssignProcedure and createOneUnassignProcedure now handle exceptions which now
can be thrown if no override and region state is not appropriate.
Aggregation of createAssignProcedure and createUnassignProcedure instances adding in
region state check invoked if override is NOT set.
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateNode.java
Change to setProcedure so it returns passed proc as result instead of void
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Guanghao Zhang <zghao@apache.org>
Signed-off-by: Anoop Sam John <anoopsamjohn@apache.org>
Introduce an additional method to our Admin interface that allow an
operator to selectivly run the normalizer. The IPC protocol supports
general table name select via compound filter.
Signed-off-by: Sean Busbey <busbey@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
when neighbor is larger than average size
* add `testMergeEmptyRegions` to explicitly cover different
interleaving of 0-sized regions.
* fix bug where merging a 0-size region is skipped due to large
neighbor.
* remove unused `splitPoint` from `SplitNormalizationPlan`.
* generate `toString`, `hashCode`, and `equals` methods from Apache
Commons Lang3 template on `SplitNormalizationPlan` and
`MergeNormalizationPlan`.
* simplify test to use equality matching over `*NormalizationPlan`
instances as plain pojos.
* test make use of this handy `TableNameTestRule`.
* fix line-length issues in `TestSimpleRegionNormalizer`
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: huaxiangsun <huaxiangsun@apache.org>
Signed-off-by: Aman Poonia <aman.poonia.29@gmail.com>
Looped through the test 100 times and it passes. Without the patch it fails
every ~10 runs or so.
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
Looped through the test 100 times and it passes. Without the patch it fails
every ~10 runs or so.
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
Allow specifying base WALEntry filter on construction of
ReplicationSource. Add means of being able to filter WALs by name.
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
Add constructor that allows passing a predicate for filtering *in* WALs
and a list of filters for filtering *out* WALEntries. The latter was
hardcoded to filter out system-table WALEntries. The former did not
exist but we'll need it if Replication takes in more than just the
default Provider.
Signed-off-by Anoop Sam John <anoopsamjohn@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
writer null check if not initialized yet during syncrunner run (#2201)
Signed-off-by: Ramkrishna <ramkrishna@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: AnoopSamJohn<anoopsamjohn@apache.org>
(cherry picked from commit b0863c5832)
* RegionMover to ignore move failures for split/merged regions with ack mode
* Refactor MoveWithAck and MoveWithoutAck as high level classes
* UT for RegionMover gracefully handling split/merged regions while loading regions and throwing failure while loading offline regions
Closes#2172
Signed-off-by: Sean Busbey <busbey@apache.org>
Signed-off-by: Ted Yu <tyu@apache.org>
Signed-off-by: Anoop Sam John <anoopsamjohn@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Sean Busbey <busbey@apache.org>
* refactor how we use connection to rely on the access method
* refactor initialization and cleanup of the shared connection
* incompatibly change HCTU's Configuration member variable to be final so it can be safely accessed from multiple threads.
Closes#2180
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
(cherry picked from commit 86ebbdd8a2)
if hbase.rowlock.wait.duration is <=0 then log a message and treat it as a value of 1ms.
Signed-off-by: Viraj Jasani <vjasani@apache.org>
(cherry picked from commit 840a55761b)
We observed this delete call to be a bottleneck for table with lots of
regions. Patch attempts to parallelize them.
Signed-off-by: Andrew Purtell <apurtell@apache.org>
(cherry picked from commit f07f30ae24)
Also fix three bugs:
* We were trying to delete non-empty directory; weren't doing
accounting for meta WALs where meta had moved off the server
(successfully)
* We were deleting split WALs rather than archiving them.
* We were not handling corrupt files.
Deprecations and removal of tests of old system.