Commit Graph

8902 Commits

Author SHA1 Message Date
Viraj Jasani 84f9900c99
HBASE-22923 min version of RegionServer to move system table regions (#3439) (#3438)
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
2021-07-01 16:02:37 +05:30
Viraj Jasani 4c7da496ad
HBASE-25902 Add missing CFs in meta during HBase 1 to 2 Upgrade (#3441) (#3417)
Signed-off-by: Michael Stack <stack@apache.org>
2021-07-01 15:13:25 +05:30
bsglz 147b030c1f
HBASE-26028 The view as json page shows exception when using TinyLfuBlockCache (#3420)
2021-06-30 11:36:00 +08:00
Duo Zhang 51893b9ba3
HBASE-26029 It is not reliable to use nodeDeleted event to track region server's death (#3430)
Signed-off-by: Xin Sun <ddupgs@gmail.com>
2021-06-30 08:44:19 +08:00
Duo Zhang 64d4915ca8
HBASE-26039 TestReplicationKillRS is useless after HBASE-23956 (#3440)
Signed-off-by: Michael Stack <stack@apache.org>
2021-06-30 08:00:17 +08:00
GeorryHuang 22ec681ad9
HBASE-25980 Master table.jsp pointed at meta throws 500 when no all r… (#3374)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-06-27 22:30:28 +08:00
GeorryHuang fb4af2a8bf
HBASE-25914 Provide slow/large logs on RegionServer UI (#3319)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Pankaj Kumar <pankajkumar@apache.org>
2021-06-27 22:26:35 +08:00
GeorryHuang e6eb65733a
HBASE-26015 Should implement getRegionServers(boolean) method in Asyn… (#3406)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-06-27 21:58:18 +08:00
Duo Zhang 39d143f290
HBASE-26020 Split TestWALEntryStream.testDifferentCounts out (#3409)
Signed-off-by: Xiaolin Ha <haxiaolin@apache.org>
2021-06-23 22:46:07 +08:00
litao fa2d127b7f
HBASE-25934 Add username for RegionScannerHolder (#3325)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-06-23 12:34:43 +05:30
lujiefsi d9bd29603b
HBASE-25877 Add access check for compactionSwitch (#3253)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-06-22 23:37:36 +08:00
belugabehr d44292ac1a
HBASE-25937: Clarify UnknownRegionException (#3330)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-06-22 22:36:30 +08:00
YutSean f640eef924
HBASE-26013 Get operations readRows metrics becomes zero after HBASE-25677 (#3404)
Signed-off-by: Reid Chan <reidchan@apache.org>
2021-06-22 11:38:42 +08:00
Duo Zhang c5461aaa5b
HBASE-25992 Addendum add missing catch WALEntryFilterRetryableException back
2021-06-21 23:58:08 +08:00
Viraj Jasani 9f4177f7b4
HBASE-25698 Fixing IllegalReferenceCountException when using TinyLfuBlockCache (#3215)
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Signed-off-by: Anoop Sam John <anoopsamjohn@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
2021-06-21 11:50:33 +05:30
Duo Zhang d2923755ec
HBASE-25992 Polish the ReplicationSourceWALReader code for 2.x after HBASE-25596 (#3376)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-06-20 16:32:42 +08:00
YutSean 53f61ef8d5
HBASE-26001 When turn on access control, the cell level TTL of Increment and Append operations is invalid (#3385)
Signed-off-by: Reid Chan <reidchan@apache.org>
2021-06-18 11:15:45 +08:00
Bharath Vissapragada 336d8464cc
HBASE-25998: Redo synchronization in SyncFuture (#3382)
SyncFuture currently uses a coarse-grained synchronized approach that seems to
create a lot of contention. This patch:

- Uses a reentrant lock instead of a synchronized monitor
- Switches to condition-variable-based waiting rather than a busy wait
- Removes synchronization for fields that do not need it

Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-06-17 12:21:05 -07:00
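The locking change described in HBASE-25998 above can be illustrated with a minimal sketch. It is not the actual SyncFuture code: the class name `SimpleSyncFuture`, its fields, and its method signatures are assumptions made for illustration. It only shows the general technique of a `ReentrantLock` paired with a `Condition`, so waiters park until signalled instead of polling.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for illustration only; not HBase's SyncFuture.
class SimpleSyncFuture {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition doneCondition = lock.newCondition();
  private boolean done = false;   // guarded by lock
  private long doneTxid = -1L;    // guarded by lock

  /** Marks the future complete and wakes every thread parked in get(). */
  void done(long txid) {
    lock.lock();
    try {
      done = true;
      doneTxid = txid;
      doneCondition.signalAll();
    } finally {
      lock.unlock();
    }
  }

  /** Blocks until done(...) is called or the timeout elapses. */
  long get(long timeoutNanos) throws InterruptedException, TimeoutException {
    lock.lock();
    try {
      long remaining = timeoutNanos;
      // Loop to tolerate spurious wakeups; awaitNanos returns the time left.
      while (!done) {
        if (remaining <= 0) {
          throw new TimeoutException("sync not completed in time");
        }
        remaining = doneCondition.awaitNanos(remaining);
      }
      return doneTxid;
    } finally {
      lock.unlock();
    }
  }
}
```

A waiter calls `get(...)` and sleeps on the condition; the completing thread calls `done(txid)` once, which releases all waiters without any of them spinning.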
Duo Zhang eb242be674
HBASE-25976 Implement a master based ReplicationTracker (#3390)
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
2021-06-17 18:24:49 +08:00
Bharath Vissapragada 5a19bcfa98
HBASE-25984: Avoid premature reuse of sync futures in FSHLog (#3371)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-06-16 14:30:15 -07:00
binlijin 8f618a0846
HBASE-25997 NettyRpcFrameDecoder decode request header wrong when han… (#3380)
* HBASE-25997 NettyRpcFrameDecoder decode request header wrong when handleTooBigRequest
2021-06-15 14:25:18 +08:00
Toshihiro Suzuki 4262887432
HBASE-26002 MultiRowMutationEndpoint should return the result of the conditional update (addendum)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-06-15 09:42:49 +09:00
Andrew Purtell 97f90e0be2
HBASE-25994 Active WAL tailing fails when WAL value compression is enabled (#3377)
Depending on which compression codec is used, a short read of the
compressed bytes can cause catastrophic errors that confuse the WAL reader.
This problem can manifest when the reader is actively tailing the WAL for
replication. To avoid these issues when WAL value compression is enabled,
BoundedDelegatingInputStream should assume enough bytes are available to
supply a reader up to its bound. This behavior is valid per the contract
of available(), which provides an _estimate_ of available bytes, and
equivalent to IOUtils.readFully but without requiring an intermediate
buffer.

Added TestReplicationCompressedWAL and TestReplicationValueCompressedWAL.
Without the WALCellCodec change TestReplicationValueCompressedWAL will
fail.

Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
2021-06-14 17:16:31 -07:00
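The `available()` behavior described in HBASE-25994 above can be sketched roughly as follows. This is a simplified, hypothetical class (`BoundedInputStreamSketch`), not the real BoundedDelegatingInputStream; it only illustrates the idea that a bounded wrapper reports the bytes remaining up to its bound rather than the delegate's own estimate.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical, simplified illustration; not HBase's BoundedDelegatingInputStream.
class BoundedInputStreamSketch extends InputStream {
  private final InputStream delegate;
  private final long limit; // maximum number of bytes this wrapper will supply
  private long pos = 0;     // bytes supplied so far

  BoundedInputStreamSketch(InputStream delegate, long limit) {
    this.delegate = delegate;
    this.limit = limit;
  }

  @Override
  public int read() throws IOException {
    if (pos >= limit) {
      return -1;
    }
    int b = delegate.read();
    if (b >= 0) {
      pos++;
    }
    return b;
  }

  @Override
  public int read(byte[] buf, int off, int len) throws IOException {
    if (pos >= limit) {
      return -1;
    }
    int n = delegate.read(buf, off, (int) Math.min(len, limit - pos));
    if (n > 0) {
      pos += n;
    }
    return n;
  }

  /** Reports the bytes remaining up to the bound, not the delegate's estimate. */
  @Override
  public int available() {
    return (int) Math.min(Integer.MAX_VALUE, limit - pos);
  }
}
```

With this shape, a consumer that sizes its reads from `available()` asks for the whole bounded region and blocks in `read` if necessary, instead of acting on a short estimate from the underlying stream.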
Toshihiro Suzuki a35ec994b9
HBASE-26002 MultiRowMutationEndpoint should return the result of the conditional update (addendum)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-06-15 08:03:30 +09:00
Toshihiro Suzuki ec31818574
HBASE-26002 MultiRowMutationEndpoint should return the result of the conditional update (#3384)
2021-06-15 07:43:27 +09:00
bsglz 329f0baa98
HBASE-25967 The readRequestsCount does not calculate when the outResu… (#3351)
* HBASE-25967 The readRequestsCount does not calculate when the outResults is empty

Co-authored-by: Zheng Wang <wangzheng@apache.org>
2021-06-10 09:37:31 +08:00
Xiaolin Ha 471e8159f0
HBASE-25981 JVM crash when displaying RegionServer UI (#3364)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-06-09 18:11:18 +08:00
GeorryHuang 40a3d57628
HBASE-22708 Remove the deprecated methods in Hbck interface (#3362)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-06-08 09:33:08 +08:00
Duo Zhang eddf4cc3a1
HBASE-25963 HBaseCluster should be marked as IA.Public (#3348)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-06-06 21:20:59 +08:00
Peter Somogyi de06e20e0a
HBASE-25970 MOB data loss - incorrect concatenation of MOB_FILE_REFS (#3355)
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Signed-off-by: Pankaj Kumar <pankajkumar@apache.org>
2021-06-05 08:57:26 +02:00
meiyi 4671cb1801
HBASE-25929 RegionServer JVM crash when compaction (#3318)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-06-03 17:17:17 +08:00
xijiawen 426c3c16f3
HBASE-25799 add clusterReadRequests and clusterWriteRequests jmx (#3188)
Co-authored-by: stevenxi <stevenxi@tencent.com>
2021-06-03 15:48:03 +08:00
Andrew Purtell 335305e0cf
HBASE-25911 Replace calls to System.currentTimeMillis with EnvironmentEdgeManager.currentTime (#3302)
We introduced EnvironmentEdgeManager as a way to inject alternate clocks
for unit tests. In order for this to be effective, all callers that would
otherwise use System.currentTimeMillis() must call
EnvironmentEdgeManager.currentTime() instead, except the implementers of
EnvironmentEdge.

Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-06-01 09:57:48 -07:00
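The clock-injection pattern that HBASE-25911 above relies on can be shown in miniature. The names below (`TimeSource`, `Clocks`, `Lease`) are hypothetical and are not HBase's EnvironmentEdge API; the sketch only illustrates why every caller has to go through the indirection for an injected test clock to take effect.

```java
// Hypothetical miniature of the pattern; names are illustrative, not HBase's API.
interface TimeSource {
  long currentTime();
}

final class Clocks {
  private static final TimeSource SYSTEM = System::currentTimeMillis;
  private static volatile TimeSource current = SYSTEM;

  private Clocks() {
  }

  /** Production code calls this instead of System.currentTimeMillis(). */
  static long currentTime() {
    return current.currentTime();
  }

  /** Tests swap in an alternate clock; null restores the system clock. */
  static void inject(TimeSource source) {
    current = (source == null) ? SYSTEM : source;
  }

  static void reset() {
    current = SYSTEM;
  }
}

// Example caller: a lease whose expiry can be tested without sleeping.
class Lease {
  private final long expiresAtMs;

  Lease(long ttlMs) {
    this.expiresAtMs = Clocks.currentTime() + ttlMs;
  }

  boolean isExpired() {
    return Clocks.currentTime() >= expiresAtMs;
  }
}
```

If `Lease` called `System.currentTimeMillis()` directly, injecting a manual clock in a test would have no effect on it, which is the gap the commit closes across the code base.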
Bharath Vissapragada 4fb0861214
HBASE-25932 addendum: Add test comments. (#3344)
Signed-off-by: Anoop Sam John <anoopsamjohn@apache.org>
2021-06-01 08:03:25 -07:00
Bharath Vissapragada b04c3c7786
HBASE-25932: Ensure replication reads the trailer bytes from WAL. (#3332)
This bug was exposed by the test from HBASE-25924. Since these WAL
implementations close the WAL asynchronously, replication can potentially
miss the trailer bytes (see the JIRA comment for a detailed analysis).

While this is not a correctness problem (the trailer does not contain any entry data),
it erroneously bumps a metric that tracks skipped bytes in the WAL, resulting
in false alarms, which we should avoid.

Reviewed-by: Rushabh Shah <rushabh.shah@salesforce.com>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Anoop Sam John <anoopsamjohn@apache.org>
2021-05-31 22:12:47 -07:00
Duo Zhang 06c6e06803
HBASE-25916 Move FavoredNodeLoadBalancer to hbase-balancer module (#3327)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-05-31 22:55:04 +08:00
Duo Zhang f2ff816532
HBASE-25939 Move more tests code for StochasticLoadBalancer to hbase-balancer module (#3331)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-05-30 22:00:18 +08:00
Michael Stack f119a865cf
HBASE-25940 Update Compression/TestCompressionTest: LZ4, SNAPPY, LZO (#3334)
Undo asserts that LZ4 and SNAPPY fail if their native libs are NOT
loaded; as of hadoop 3.3.1, LZ4 and SNAPPY can work w/o native libs.

Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-05-29 09:02:52 -07:00
Sandeep Pal 9a2027bf71
HBASE-25927: Fix the log messages by not stringifying the exceptions in log (#3338)
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
2021-05-28 21:28:19 -07:00
Duo Zhang c1d299fc1d
HBASE-25938 The SnapshotOfRegionAssignmentFromMeta.initialize call in FavoredNodeLoadBalancer is just a dummy one (#3329)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-05-29 11:04:41 +08:00
Victor 3f7d2897a1
HBASE-25910 - Fix port assignment test (#3308)
Signed-off-by: David Manning <david.manning@salesforce.com>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-05-28 20:52:07 +05:30
Duo Zhang 7218c83f81
HBASE-25931 Move FavoredNodeManager to hbase-balancer module (#3324)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
2021-05-28 15:57:59 +08:00
Duo Zhang ed8df5eded
HBASE-25758 Move MetaTableAccessor out of hbase-balancer module (#3309)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-05-28 09:19:07 +08:00
Wellington Ramos Chevreuil feb89d988b
HBASE-25933 Log trace raw exception, instead of cause message in NettyRpcServerRequestDecoder (#3323)
Signed-off-by: Rushabh Shah <shahrs87@gmail.com>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
Signed-off-by: Josh Elser <elserj@apache.org>
2021-05-27 19:54:25 +01:00
Duo Zhang 63141bf576
HBASE-25926 Cleanup MetaTableAccessor references in FavoredNodeBalancer related code (#3313)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-05-27 16:05:14 +08:00
Rushabh Shah a22e418cf6
HBASE-25924 Re-compute size of WAL file while removing from WALEntryStream (#3314)
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
2021-05-26 10:40:44 -07:00
Duo Zhang 76fbb8b965
HBASE-25818 Move StochasticLoadBalancer to hbase-balancer module (#3206)
Signed-off-by: Yi Mei <myimeiyi@gmail.com>
2021-05-25 23:24:35 +08:00
Duo Zhang 6a77872879
HBASE-25894 Improve the performance for region load and region count related cost functions (#3276)
Signed-off-by: Yi Mei <myimeiyi@gmail.com>
2021-05-25 18:04:06 +08:00
GeorryHuang 36affdaa8e
HBASE-25906 UI of master-status to show recent history of balancer decision (#3296)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-05-25 11:08:48 +05:30
Baiqiang Zhao 21aa553bc1
HBASE-25745 Deprecate/Rename config `hbase.normalizer.min.region.count` to `hbase.normalizer.merge.min.region.count`
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
2021-05-24 13:03:27 -07:00
Anoop Sam John f53ceeecb0
HBASE-25898 RS getting aborted due to NPE in Replication WALEntryStream (#3292)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Rushabh Shah <shahrs87@gmail.com>
2021-05-24 23:41:45 +05:30
Xiaolin Ha b02c8102b7
HBASE-25899 Improve efficiency of SnapshotHFileCleaner (#3280)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-05-24 22:15:47 +08:00
Duo Zhang f94f4e29fe
HBASE-25873 Refactor and cleanup the code for CostFunction (#3274)
Signed-off-by: Yi Mei <myimeiyi@gmail.com>
2021-05-24 18:14:55 +08:00
Xiaolin Ha 7f6b778c14
HBASE-25773 TestSnapshotScannerHDFSAclController.setupBeforeClass is flaky (#3160)
2021-05-22 21:56:17 +08:00
caoliqing edde01c605
HBASE-25892: 'False' should be 'True' in auditlog of listLabels (#3273)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-05-22 11:22:51 +08:00
Andrew Purtell 8ec6fd9459
HBASE-25869 WAL value compression (#3244)
WAL storage can be expensive, especially if the cell values
represented in the edits are large, consisting of blobs or
significant lengths of text. Such WALs might need to be kept around
for a fairly long time to satisfy replication constraints on a space
limited (or space-contended) filesystem.

We have a custom dictionary compression scheme for cell metadata that
is engaged when WAL compression is enabled in site configuration.
This is fine for that application, where we can expect the universe
of values and their lengths in the custom dictionaries to be
constrained. For arbitrary cell values it is better to use one of the
available compression codecs, which are suitable for arbitrary albeit
compressible data.

Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
2021-05-21 11:05:52 -07:00
Rushabh Shah dfa88e1ffe
HBASE-25827 Per Cell TTL tags get duplicated with increments causing tags length overflow (#3210)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Aman Poonia <apoonia@salesforce.com>
2021-05-21 22:56:11 +05:30
Baiqiang Zhao a1177b3e91
HBASE-25682 Add a new command to update the configuration of all RSs in a RSGroup (#3080)
Signed-off-by: Pankaj Kumar <pankajkumar@apache.org>
2021-05-21 22:49:25 +05:30
Sandeep Pal 15e861169f
HBASE-25848: Add flexibility to backup replication in case replication filter throws an exception (#3283)
2021-05-20 13:31:44 -07:00
Duo Zhang 7c24ed4f45
HBASE-25897 TestRetainAssignmentOnRestart is flaky after HBASE-25032 (#3281)
Signed-off-by: Xiaolin Ha <haxiaolin@apache.org>
2021-05-20 20:58:53 +08:00
GeorryHuang 5b9940907e
HBASE-25791 UI of master-status to show a recent history of that why balancer was rejected to run (#3275)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-05-19 12:05:29 +08:00
Duo Zhang 741b4b4674
HBASE-25032 Do not assign regions to region server which has not called regionServerReport yet (#3268)
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
2021-05-18 08:08:03 +08:00
Pankaj 2126ec94f0
HBASE-25875 RegionServer failed to start with IllegalThreadStateException due to race condition in AuthenticationTokenSecretManager (#3250)
* HBASE-25875 RegionServer failed to start with IllegalThreadStateException due to race condition in AuthenticationTokenSecretManager's start & retrievePassword method

Signed-off-by: stack <stack@apache.com>
2021-05-17 12:17:24 +05:30
Baiqiang Zhao d69d5c24b1
HBASE-25861 Correct the usage of Configuration#addDeprecation (#3249)
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
2021-05-14 09:31:06 -07:00
Viraj Jasani 0955a7a22e
HBASE-25884 Return empty records for disabled balancer in-memory queue (#3263)
Signed-off-by: stack <stack@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Aman Poonia <apoonia@salesforce.com>
2021-05-14 12:54:07 +05:30
Michael Stack 630c73fda4
HBASE-25867 Extra doc around ITBLL (#3242)
Minor edits to a few log messages.
Explain how the '-c' option works when passed to ChaosMonkeyRunner.
Some added notes on ITBLL.
Fix whacky 'R' and 'Not r' thing in Master (shows when you run ITBLL).
In HRS, report hostname and port when it checks in (was debugging issue
where Master and HRS had different notions of its hostname).
Spare a dirty FNFException on startup if base dir not yet in place.

* Address Review by Sean

Signed-off-by: Sean Busbey <busbey@apache.org>
2021-05-11 19:26:57 +01:00
Duo Zhang 29bd3dd586
HBASE-25852 Move all the initialization work of LoadBalancer implementation to initialize method (#3248)
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-05-11 22:03:33 +08:00
Catalin Luca 2b6a91a1da
HBASE-25859 Reference class incorrectly parses the protobuf magic marker (#3236)
Co-authored-by: Catalin Luca <luca@adobe.com>
Signed-off-by: stack <stack@apache.org>
2021-05-10 12:45:23 -07:00
Rushabh Shah 8c2332d465
HBASE-25860 Add metric for successful wal roll requests. (#3238)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-05-08 12:58:29 +05:30
Duo Zhang c2a1d31270
HBASE-25774 Addendum fix compile error
2021-05-08 13:56:57 +08:00
Andrew Purtell 02b018cf1a
HBASE-25774 ServerManager.getOnlineServer may miss some region servers when refreshing state in some procedure implementations
Revert "HBASE-25032 Wait for region server to become online before adding it to online servers in Master (#2769)"

This reverts commit 1e4639d2eb.

Conflicts:

	hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
2021-05-07 18:21:46 -07:00
niuyulin 6cfff27465
HBASE-25837 TestRollingRestart is flaky (#3220)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-05-07 18:58:45 +08:00
Andrew Purtell 6309c090b5
HBASE-25854 Remove redundant AM in-memory state changes in CatalogJanitor (#3234)
In CatalogJanitor we schedule GCRegionProcedure to clean up both
filesystem and in-memory state after a split, and
GCMultipleMergedRegionsProcedure to do the same for merges. Both of these
procedures clean up in-memory state, but CatalogJanitor also does this
redundantly just after scheduling the procedures. The cleanup should be
done in only one place. Presumably we are using the procedures to do it in
a principled way. Remove the redundancy in CatalogJanitor and fix any
follow on issues, like test failures.

Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-05-06 09:13:33 -07:00
Duo Zhang ba4cb91211
HBASE-25851 Make LoadBalancer not extend Configurable interface (#3233)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-05-06 16:11:46 +08:00
Andrew Purtell cc88cf0ecf
HBASE-25847 More DEBUG and TRACE level logging in CatalogJanitor and HbckChore (#3230)
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
2021-05-05 17:01:00 -07:00
Nick Dimiduk eb9b54304e
HBASE-25843 move master http-related code into o.a.h.h.master.http
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-05-05 08:34:37 -07:00
Duo Zhang 90f986497b
HBASE-25834 Remove balanceTable method from LoadBalancer interface (#3217)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-05-05 15:48:01 +08:00
Nick Dimiduk 17193dae58
HBASE-25842 move regionserver http-related code into o.a.h.h.regionserver.http
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-05-04 15:40:17 -07:00
Andrew Purtell 432d141474
HBASE-25835 Ignore duplicate split requests from regionserver reports (#3218)
Processing of the RS report happens asynchronously from other activities
which can mutate region state. For example, a split procedure may already
be running. A split procedure cannot succeed if the parent region is no
longer open, so we can ignore it in that case.

Note that submitting more than one split procedure for a given region is
harmless -- the split is fenced in the procedure handling -- but it would
be noisy in the logs. Only one procedure can succeed. The other
procedure(s) would abort during initialization and report failure with
WARN level logging.

Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Pankaj <pankajkumar@apache.org>
2021-05-04 10:05:29 -07:00
Andrew Purtell fda324b116
HBASE-25836 RegionStates#getAssignmentsForBalancer should only care about OPEN or OPENING regions (#3219)
RegionStates#getAssignmentsForBalancer is used by the HMaster to
collect all regions of interest to the balancer for the next chore
iteration. We check if a table is in disabled state to exclude
regions that will not be of interest (because disabled regions are
or will be offline) or are in a state where they shouldn't be
mutated (like SPLITTING). The current checks are not actually
comprehensive.

Filter out regions not in OPEN or OPENING state when building the
set of interesting regions for the balancer to consider. Only
regions open (or opening) on the cluster are of interest to
balancing calculations for the current iteration. Regions in all
other states can be expected to not be of interest – either offline
(OFFLINE, or FAILED_*), not subject to balancer decisions now
(SPLITTING, SPLITTING_NEW, MERGING, MERGING_NEW), or will be
offline shortly (CLOSING) – until at least the next chore
iteration.

Add TRACE level logging.

Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-05-03 18:23:07 -07:00
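The state filter described in HBASE-25836 above amounts to keeping only regions whose state is OPEN or OPENING. A minimal sketch follows, with a simplified stand-in enum (`RegionStateSketch`) and a hypothetical helper (`BalancerInputFilter`); neither is the real RegionStates code, and region names are plain strings here for brevity.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified stand-in for the region states named in the commit message.
enum RegionStateSketch {
  OFFLINE, OPENING, OPEN, CLOSING, SPLITTING, SPLITTING_NEW,
  MERGING, MERGING_NEW, FAILED_OPEN, FAILED_CLOSE
}

// Hypothetical helper; not the actual RegionStates#getAssignmentsForBalancer.
class BalancerInputFilter {
  private static final Set<RegionStateSketch> OF_INTEREST =
      EnumSet.of(RegionStateSketch.OPEN, RegionStateSketch.OPENING);

  /** Returns the names of regions the balancer should consider this iteration. */
  static List<String> regionsForBalancer(Map<String, RegionStateSketch> regionStates) {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, RegionStateSketch> entry : regionStates.entrySet()) {
      if (OF_INTEREST.contains(entry.getValue())) {
        result.add(entry.getKey());
      }
    }
    return result;
  }
}
```

An allow-list of interesting states, rather than a deny-list of excluded ones, means any state not explicitly listed is left out of the balancer input by default.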
Andrew Purtell e44592a37d
HBASE-25840 CatalogJanitor warns about skipping gc of regions during RIT, but does not actually skip (#3223)
We claim in a WARN level log line to be "Playing-it-safe skipping merge/
split gc'ing of regions from hbase:meta while regions-in-transition (RIT)"
but do not actually skip because of a missing return. Remove the warning.

Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-05-03 18:14:38 -07:00
Duo Zhang 762abe3bea
HBASE-25838 Use double instead of Double in StochasticLoadBalancer (#3221)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-05-04 09:04:47 +08:00
Duo Zhang 7640134e3e
HBASE-25774 Added more detailed logs about the restarting of region servers (#3213)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-05-03 20:33:33 +08:00
GeorryHuang 00fec24c90
HBASE-25790 NamedQueue 'BalancerRejection' for recent history of balancer skipping (#3182)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-05-02 21:30:48 +05:30
Duo Zhang 73a82bd7c6
HBASE-25825 RSGroupBasedLoadBalancer.onConfigurationChange should chain the request to internal balancer (#3209)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-04-30 22:45:33 +08:00
Duo Zhang 6c65314cdf
HBASE-25819 Fix style issues for StochasticLoadBalancer (#3207)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-04-29 11:03:55 +08:00
Nick Dimiduk b061b0c4ed
HBASE-25779 HRegionServer#compactSplitThread should be private
Minor refactor. Make the `compactSplitThread` member field of `HRegionServer` private, and gate
all access through the getter method.

Signed-off-by: Yulin Niu <niuyulin@apache.org>
Signed-off-by: Pankaj Kumar <pankajkumar@apache.org>
2021-04-28 16:46:36 -07:00
Michael Stack 2382f68b23
HBASE-25792 Filter out o.a.hadoop.thirdparty building shaded jars (#3184)
Need to add to allowed-licenses list too....

Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
Reviewed-by: Duo Zhang <zhangduo@apache.org>
Reviewed-by: Nick Dimiduk <ndimiduk@apache.org>
2021-04-27 08:37:25 -07:00
Duo Zhang 8856f61986
HBASE-25757 Addendum remove CandidateGenerator classes under hbase-server module
2021-04-27 23:25:51 +08:00
Duo Zhang a4d954e606
HBASE-25757 Move BaseLoadBalancer to hbase-balancer module (#3191)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-04-26 12:03:25 +08:00
Duo Zhang 7f90c2201f
HBASE-25723 Temporarily remove the trace support for RegionScanner.next (#3119)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-04-25 09:23:23 +08:00
Duo Zhang f6ff519dd0
HBASE-25591 Upgrade opentelemetry to 0.17.1 (#2971)
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2021-04-25 09:23:23 +08:00
Duo Zhang bb8c4967f8
HBASE-25535 Set span kind to CLIENT in AbstractRpcClient (#2907)
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2021-04-25 09:23:23 +08:00
Duo Zhang 2be2c63f0d
HBASE-25484 Add trace support for WAL sync (#2892)
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2021-04-25 09:23:23 +08:00
Duo Zhang 03e12bfa4a
HBASE-25455 Add trace support for HRegion read/write operation (#2861)
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2021-04-25 09:23:23 +08:00
Duo Zhang ae2c62ffaa
HBASE-25481 Add host and port attribute when tracing rpc call at client side (#2857)
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2021-04-25 09:23:23 +08:00
Duo Zhang 805b2ae2ad
HBASE-23898 Add trace support for simple apis in async client (#2813)
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2021-04-25 09:23:23 +08:00
Duo Zhang 2420286715
HBASE-25401 Add trace support for async call in rpc client (#2790)
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2021-04-25 09:23:23 +08:00
Duo Zhang 302d9ea8b8
HBASE-25373 Remove HTrace completely in code base and try to make use of OpenTelemetry
Signed-off-by: stack <stack@apache.org>
2021-04-25 09:23:23 +08:00
Andrew Purtell 9895b2dfdf
HBASE-25756 Support alternate compression for major and minor compactions (#3142)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-04-23 15:45:26 -07:00
Duo Zhang 96fefce9c3
HBASE-25802 Miscellaneous style improvements for load balancer related classes (#3192)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-04-23 15:20:27 +08:00
haxiaolin 996862c1cc
HBASE-25754 StripeCompactionPolicy should support compacting cold regions (#3152)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-04-23 14:58:53 +08:00
Toshihiro Suzuki 5f4e2e111b
HBASE-25766 Introduce RegionSplitRestriction that restricts the pattern of the split point (#3150)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
2021-04-22 13:53:36 +09:00
Duo Zhang 50920ee306
HBASE-25774 TestSyncReplicationStandbyKillRS#testStandbyKillRegionServer is flaky (#3189)
Wait for the restarter thread to finish before checking the state
Add more detailed logs

Signed-off-by: meiyi <myimeiyi@gmail.com>
2021-04-22 10:10:15 +08:00
Duo Zhang d5c5e48839
HBASE-25793 Move BaseLoadBalancer.Cluster to a separated file (#3185)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-04-22 09:59:49 +08:00
haxiaolin 0d257baf29
HBASE-25763 TestRSGroupsWithACL.setupBeforeClass is flaky (#3158)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-04-21 14:41:51 +08:00
Duo Zhang 781da1899a
HBASE-25290 Remove table on master related code in balancer implementation (#3162)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-04-20 21:31:09 +08:00
Nick Dimiduk b65890da1d
Revert "HBASE-25739 TableSkewCostFunction need to use aggregated deviation (#3067)"
This reverts commit 533c84d330.
2021-04-16 09:35:02 -07:00
Duo Zhang bf78246b4f
HBASE-25775 Use a special balancer to deal with maintenance mode (#3161)
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
2021-04-16 09:50:24 +08:00
clarax 533c84d330
HBASE-25739 TableSkewCostFunction need to use aggregated deviation (#3067)
Signed-off-by: Michael Stack <stack@apache.org>
Reviewed-by: David Manning <david.manning@salesforce.com>
2021-04-15 13:12:07 -07:00
xiaoyu 6cf4fdde61
HBASE-25776 Use Class.asSubclass to fix the warning in StochasticLoadBalancer.loadCustomCostFunctions (#3163)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-04-15 23:34:06 +05:30
Nick Dimiduk bc52bca741
HBASE-25770 Http InfoServers should honor gzip encoding when requested (#3159)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Josh Elser <elserj@apache.org>
2021-04-15 09:07:13 -07:00
Duo Zhang 5910e9e2d1
HBASE-25767 CandidateGenerator.getRandomIterationOrder is too slow on large cluster (#3149)
Signed-off-by: XinSun <ddupgs@gmail.com>
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-04-13 23:00:54 +08:00
Duo Zhang de012d7d1f
HBASE-25759 The master services field in LocalityBasedCostFunction is never used (#3144)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-04-12 22:27:01 +08:00
Duo Zhang f9e928e5a7
HBASE-25184 Move RegionLocationFinder to hbase-balancer (#2543)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
2021-04-10 21:10:53 +08:00
Nick Dimiduk 5f1f8be667
HBASE-25744 Change default of `hbase.normalizer.merge.min_region_size.mb` to `0`
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Reid Chan <reidchan@apache.org>
2021-04-09 15:00:38 -07:00
Geoffrey Jacoby 74e533d5ab
HBASE-25751 - Add writable TimeToPurgeDeletes to ScanOptions (#3137)
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-04-09 13:05:47 -07:00
meiyi ad06aa2082
HBASE-25747 Remove unused getWriteAvailable method in OperationQuota (#3133)
Signed-off-by: stack <stack@apache.org>
2021-04-09 10:23:34 +08:00
Pankaj 6444e94c38
HBASE-25717 RegionServer aborted with due to ClassCastException (#3108)
Signed-off-by: stack <stack@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-04-08 16:27:17 +05:30
stack d9f4f41f76
HBASE-25735 Add target Region to connection exceptions
Addendum to fix broke compile.
2021-04-07 07:56:25 -07:00
Jan Hentschel 048ca4e43f
HBASE-25174 Remove deprecated fields in HConstants (#2558)
Remove the deprecated fields, which can be removed in 3.0.0. Marked the
constant OLDEST_TIMESTAMP as InterfaceAudience.Private as it is only used
in classes that are also marked as InterfaceAudience.Private.

Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-04-03 23:12:16 +08:00
Jan Hentschel 5a63fe65aa
HBASE-25199 Remove deprecated HStore#getStoreHomedir methods (#2562)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-04-03 23:10:20 +08:00
d-c-manning 7a31557c51
HBASE-25726 MoveCostFunction is not included in the list of cost functions for StochasticLoadBalancer (#3116)
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2021-04-02 09:22:13 -07:00
lujiefsi e14ec57eee
HBASE-25558:Adding audit log for execMasterService (#3101)
Signed-off-by: stack <stack@apache.org>
2021-03-31 16:12:31 -07:00
Toshihiro Suzuki 46f7d9dd4b
HBASE-25703 Support conditional update in MultiRowMutationEndpoint (#3098)
Signed-off-by: Michael Stack <stack@apache.org>
2021-03-30 09:18:56 +09:00
Josh Elser 57a49f5ca7
HBASE-25692 Always try to close the WAL reader when we catch any exception (#3090)
There are code paths in which we throw non-IOExceptions when
initializing a WAL reader. However, we only close the InputStream to the
WAL filesystem when the exception is an IOException. Close it if it is
open in all cases.

Co-authored-by: Josh Elser <jelser@cloudera.com>
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2021-03-29 12:15:58 -07:00
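The cleanup rule described in HBASE-25692 above (close the stream on any initialization failure, not only on IOException) can be sketched as below. `ReaderInitSketch` and its `Init` callback are hypothetical names used for illustration; this is not the WAL reader code itself.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration; not the actual WAL reader initialization code.
class ReaderInitSketch {
  /** Stand-in for an initialization step that may also throw unchecked exceptions. */
  interface Init {
    void run(InputStream in) throws IOException;
  }

  /** Returns the initialized stream, or closes it before rethrowing any failure. */
  static InputStream openAndInit(InputStream in, Init init) throws IOException {
    try {
      init.run(in);
      return in;
    } catch (IOException | RuntimeException e) {
      // Close on every failure path so the underlying stream is not leaked.
      try {
        in.close();
      } catch (IOException closeFailure) {
        e.addSuppressed(closeFailure);
      }
      throw e;
    }
  }
}
```

Attaching a close failure as a suppressed exception keeps the original cause visible to the caller while still recording that cleanup went wrong.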
Michael Stack 6a8998b29c
HBASE-25695 Link to the filter on hbase:meta from user tables panel on master page (#3092)
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
2021-03-27 20:36:55 -07:00
Toshihiro Suzuki 93b1163a8b
HBASE-25702 Remove RowProcessor (#3097)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-03-28 07:38:42 +09:00
caroliney14 1e4639d2eb
HBASE-25032 Wait for region server to become online before adding it to online servers in Master (#2769)
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
2021-03-25 10:39:07 -07:00
Andrew Purtell f6bb4bb93e
HBASE-25693 NPE getting metrics from standby masters (MetricsMasterWrapperImpl.getMergePlanCount) (#3091)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
2021-03-24 19:09:10 -07:00
huaxiangsun 1e3fe3ceac
HBASE-25691 Test failure: TestVerifyBucketCacheFile.testRetrieveFromFile (#3081)
The issue is that the FileInputStream is created with try-with-resources, so its close() is called right after the try statement.
FileInputStream also defines a finalizer, so when the object is garbage collected, its close() is called again.
To avoid releasing the resources twice, add a guard against it.

Signed-off-by: stack <stack@apache.org>
2021-03-24 09:01:17 -07:00
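Editorial sketch for HBASE-25691 above: one common way to guard against a double close is to make close() idempotent with an atomic flag. The commit's actual guard is not shown in this log, so this is an assumption about the general technique, and `GuardedResource` is a hypothetical class.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicBoolean;

class CloseOnceSketch {
  /** Hypothetical resource whose underlying release must happen at most once. */
  static class GuardedResource implements Closeable {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    int releases = 0; // demo-only counter of how often the release actually ran

    @Override
    public void close() {
      // Only the first caller flips the flag; later calls return without side effects.
      if (!closed.compareAndSet(false, true)) {
        return;
      }
      releases++; // stands in for freeing the real resource
    }
  }
}
```

With this guard, a second close() from a finalizer or any other caller after try-with-resources has already run becomes a no-op.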
XinSun 3358091b7e
HBASE-25683 Simplify UTs using DummyServer (#3069)
Co-authored-by: sunxin <sunxin@apache.com>
Signed-off-by: stack <stack@apache.org>
2021-03-22 08:54:09 -07:00
Duo Zhang ba3610d097
HBASE-19577 Use log4j2 instead of log4j for logging (#1708)
Signed-off-by: stack <stack@apache.org>
2021-03-20 09:21:25 +08:00
Baiqiang Zhao a3938c8725
HBASE-25681 Add a switch for server/table queryMeter (#3070)
Signed-off-by: stack <stack@apache.org>
2021-03-19 16:23:41 -07:00
shahrs87 fea4bd12e2
HBASE-25679 Size of log queue metric is incorrect (#3071)
Co-authored-by: Rushabh <rushabh.shah@salesforce.com>
Signed-off-by: stack <stack@apache.org>
2021-03-19 16:16:48 -07:00
Toshihiro Suzuki f4059907e2
HBASE-25678 Support nonce operations for Increment/Append in RowMutations and CheckAndMutate (#3064)
Signed-off-by: stack <stack@apache.org>
2021-03-19 21:25:40 +09:00
haxiaolin 585aca1f05
HBASE-25518 Support separate child regions to different region servers (#3001)
Signed-off-by: stack <stack@apache.org>
2021-03-18 12:38:17 -07:00
bsglz d93035a131
HBASE-25643 The delayed FlushRegionEntry should be removed when we ne… (#3049)
Signed-off-by: AnoopSamJohn <anoopsamjohn@apache.org>
Signed-off-by: stack <stack@apache.org>
2021-03-18 12:13:06 -07:00
Michael Stack 7ac1c8bbf8
HBASE-25677 Server+table counters on each scan #nextRaw invocation becomes a bottleneck when heavy load (#3061)
Don't have every handler update regionserver metrics on each
scan#nextRaw; instead, do a batch update just before Scan
returns. Otherwise, all running handlers end up contending
on metrics update.

M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
 Update of regionserver metrics counters moved out to the caller, where
 it can be done as a batch update instead of per-next.

M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServer.java
 Class doc to encourage batch updating metrics.
 Remove the single-update method, as it is no longer used.

M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
 Count calls to nextRaw. Update regionserver count in finally block when
 scan is done rather than per nextRaw call. Move all metrics updates to
 finally.

Signed-off-by: Reid Chan <reidchan@apache.org>
Signed-off-by: Baiqiang Zhao <ZhaoBQ>
2021-03-18 11:46:37 -07:00
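Editorial sketch for HBASE-25677 above: the batching idea is to count nextRaw calls in a local variable and touch the shared metric once, in a finally block, when the scan is done. This is not the HBase code. The class, field, and method names below are hypothetical, and `LongAdder` is only an assumed stand-in for the shared counter.

```java
import java.util.concurrent.atomic.LongAdder;

class BatchedScanMetricsSketch {
  /** Hypothetical shared counter that every handler thread would contend on. */
  final LongAdder sharedReadCount = new LongAdder();
  /** Demo-only instrumentation (not thread-safe): how often the shared counter was touched. */
  int sharedUpdates = 0;

  /** Simulates one scan that makes `rows` nextRaw-style calls. */
  int scan(int rows) {
    int nextRawCalls = 0; // handler-local, so there is no contention inside the loop
    try {
      for (int i = 0; i < rows; i++) {
        nextRawCalls++; // stands in for one nextRaw invocation
      }
      return nextRawCalls;
    } finally {
      // Single batch update per scan instead of one update per nextRaw call.
      sharedReadCount.add(nextRawCalls);
      sharedUpdates++;
    }
  }
}
```

Doing the update in finally means the calls are still counted if the scan exits early with an exception.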
Sandeep Pal ff3821814a
HBASE-25627: HBase replication should have a metric to represent if the source is stuck getting initialized (#3018)
Introduces a new metric that tracks number of replication sources that are stuck in initialization.

Signed-off-by: Xu Cang <xucang@apache.org>
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
2021-03-17 09:10:44 -07:00
DivyeshChandra bcf503e6c2
HBASE-25653 Add units and round off region size to 2 digits after decimal (#3046)
Signed-off-by: stack <stack@duboce.net>
Reviewed-by: Viraj Jasani <vjasani@apache.org>
2021-03-16 21:32:12 -07:00
bitterfox ebb0adf500
HBASE-25665 Option to use hostname instead of canonical hostname for secure HBase cluster connection (#3051)
2021-03-16 21:04:25 -07:00
Baiqiang Zhao db2e6d8c63
HBASE-25597 Add row info in Exception when cell size exceeds maxCellSize (#2976)
Signed-off-by: stack <stack@apache.org>
2021-03-15 15:49:33 -07:00
haxiaolin 0ef892b68a
HBASE-25621 Balancer should check region plan source to avoid misplace region groups (#3002)
Signed-off-by: stack <stack@duboce.net>
2021-03-15 14:47:27 -07:00
haxiaolin 625bea3ecc
HBASE-25595 TestLruBlockCache.testBackgroundEvictionThread is flaky (#2974)
Signed-off-by: stack <stack@apache.org>
2021-03-15 14:25:38 -07:00
Michael Stack 630f47e4ec
HBASE-25660 Print split policy in use on Region open (as well as split policy vitals) (#3044)
Add a toString to all split policy implementations listing name and
vitals. Use this toString in the Region open message. Ditto for flush
policy for the Region.

Signed-off-by: Huaxiang Sun <huaxiangsun@apache.org>
2021-03-15 14:12:31 -07:00
haxiaolin aeec8ca64b
HBASE-25635 CandidateGenerator may miss some region balance actions (#3024)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-03-15 21:28:22 +08:00
Duo Zhang 876fec1648
HBASE-25657 Fix spotbugs warnings after upgrading spotbugs to 4.x (#3041)
Signed-off-by: meiyi <myimeiyi@gmail.com>
Signed-off-by: stack <stack@apache.org>
2021-03-12 14:34:10 +08:00
shahrs87 7386fb6e1f
HBASE-25622 Result#compareResults should compare tags. (#3026)
Signed-off-by: stack <stack@apache.org>
2021-03-11 21:51:07 -08:00
Michael Stack 1a69a52653
HBASE-25570 On largish cluster, "CleanerChore: Could not delete dir..." makes master log unreadable (#2949)
Turn down the amount we log. If you want to see the full exception,
enable TRACE-level logging.

Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: shahrs87
2021-03-11 21:35:24 -08:00