Commit Graph

267 Commits

Author SHA1 Message Date
Luca Kovács c04edf7835 HBASE-20904 Prometheus /metrics http endpoint for monitoring (#4691)
Co-authored-by: Luca Kovacs <kovacs.luca.agota@gmail.com>
Co-authored-by: Madhusoodan P <akshayapataki123@gmail.com>
Co-authored-by: Luca Kovacs <lkovacs@cloudera.com>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
(cherry picked from commit f9ea7ee0d6)

Conflicts:
	src/main/asciidoc/_chapters/ops_mgt.adoc
2022-08-24 11:31:16 +08:00
Bryan Beaudreault e99faad83e HBASE-27241 Add metrics for evaluating cost and effectiveness of bloom filters (#4669)
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
2022-08-09 15:25:47 -04:00
Andrew Purtell b17bc94b10 HBASE-27235 Clean up error-prone warnings in hbase-hadoop-compat (#4648)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2022-07-25 17:30:50 -07:00
Duo Zhang 99f2ab5aa8 HBASE-27220 Apply the spotless format change in HBASE-27208 to our code base
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2022-07-19 10:00:31 +08:00
SiCheng-Zheng 325f90bb40 HBASE-27144 Add special rpc handlers for bulkload operations (#4558)
Co-authored-by: SiCheng-Zheng <zhengsicheng@jd.com>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
(cherry picked from commit ff8eb59709)

Conflicts:
	hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
	hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcScheduler.java
	hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SimpleRpcScheduler.java
	hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/AnnotationReadingPriorityFunction.java
2022-07-17 21:41:27 +08:00
Bryan Beaudreault 98ec7586b4 HBASE-27188 Report maxStoreFileCount in jmx (#4609)
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2022-07-11 22:14:01 -04:00
Bryan Beaudreault 9ccb0f96b9 HBASE-27186 Report block cache size metrics separately for L1 and L2 (#4608)
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2022-07-11 22:01:06 -04:00
litao bc40edecfc HBASE-27047 Fix typo for metric drainingRegionServers (#4441)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
(cherry picked from commit 63fe279b45)
2022-05-22 15:02:34 +08:00
Duo Zhang 1a5b1b266c HBASE-26899 Run spotless:apply
2022-05-01 22:41:49 +08:00
liangxs 895e0f474a HBASE-26975 Add on heap and off heap memstore info in rs web UI (#4368)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
(cherry picked from commit cdf81ea5cc)
2022-04-28 23:15:10 +08:00
Bri Augenreich c2baa1fb7b HBASE-26581 Add metrics for failed replication edits (#4347)
Co-authored-by: Briana Augenreich <baugenreich@hubspot.com>
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org>
2022-04-26 17:44:01 -04:00
Duo Zhang e7eb628025 HBASE-26922 Fix LineLength warnings as much as possible if it can not be fixed by spotless (#4324)
Signed-off-by: Yulin Niu <niuyulin@apache.org>
(cherry picked from commit 3ae0d9012c)
2022-04-09 23:13:49 +08:00
Xiaolin Ha b7cfc1d0bd HBASE-26175 MetricsHBaseServer should record all kinds of Exceptions (#4248)
Signed-off-by: Pankaj Kumar <pankajkumar@apache.org>
2022-03-24 19:04:59 +08:00
Duo Zhang 340cc6c6f1 HBASE-26802 Backport the log4j2 changes to branch-2 (#4166)
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2022-03-11 11:17:43 -08:00
Andrew Purtell 300f9b9576 HBASE-26582 Prune use of Random and SecureRandom objects (#4118)
Avoid the pattern where a Random object is allocated, used once or twice, and
then left for GC. This pattern triggers warnings from some static analysis tools
because this pattern leads to poor effective randomness. In a few cases we were
legitimately suffering from this issue; in others a change is still good to
reduce noise in analysis results.

Use ThreadLocalRandom where there is no requirement to set the seed to gain
good reuse.

Where useful relax use of SecureRandom to simply Random or ThreadLocalRandom,
which are unlikely to block if the system entropy pool is low, if we don't need
cryptographically strong randomness for the use case. The exception to this is
normalization of use of Bytes#random to fill byte arrays with randomness.
Because Bytes#random may be used to generate key material it must be backed by
SecureRandom.
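
The pattern described above can be sketched as follows. This is a minimal
illustration, not code from the patch: the class and method names
(`RandomUsageSketch`, `jitterDiscouraged`, `jitter`, `keyMaterial`) are
hypothetical, and only the JDK classes `Random`, `ThreadLocalRandom`, and
`SecureRandom` are real.

```java
import java.security.SecureRandom;
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical illustration class; not part of HBase.
final class RandomUsageSketch {
  // One shared SecureRandom, created once and reused for key material.
  private static final SecureRandom SECURE = new SecureRandom();

  // Discouraged: allocates a Random, uses it once, then leaves it for GC.
  static int jitterDiscouraged(int boundMs) {
    return new Random().nextInt(boundMs);
  }

  // Preferred when no seed needs to be set: reuse the per-thread generator.
  static int jitter(int boundMs) {
    return ThreadLocalRandom.current().nextInt(boundMs);
  }

  // Bytes that may become key material stay backed by SecureRandom.
  static byte[] keyMaterial(int length) {
    byte[] bytes = new byte[length];
    SECURE.nextBytes(bytes);
    return bytes;
  }

  public static void main(String[] args) {
    System.out.println("jitter in [0, 1000): " + jitter(1000));
    System.out.println("key bytes: " + keyMaterial(16).length);
  }
}
```

`ThreadLocalRandom` does not allow setting a seed, so code that needs
reproducible sequences would still hold on to a seeded `Random` instance.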

Signed-off-by: Duo Zhang <zhangduo@apache.org>
2022-03-08 15:22:00 -08:00
Bryan Beaudreault fc92a00bd1 HBASE-26731 Add metrics for active and expired scanners (#4145)
Signed-off-by: Andrew Purtell <apurtell@apache.org>

Conflicts:
	hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServer.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionServer.java
2022-03-04 14:06:13 -08:00
Duo Zhang 4644efb9f3 HBASE-26691 Replacing log4j with reload4j for branch-2.x (#4050)
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2022-03-04 12:06:34 -08:00
Bryan Beaudreault 2a7b413b71 HBASE-26727 Fix CallDroppedException reporting (#4088)
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2022-02-18 17:27:52 -08:00
Xiaolin Ha d9fae5cb67 HBASE-26397 Display the excluded datanodes on regionserver UI (#3990)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2022-01-10 13:02:31 +08:00
Richard Marscher 9890ac9639 HBASE-26623 Report CallDroppedException in exception metrics (#3980)
`CallDroppedException` can be thrown with `CallRunner.drop()` by queue implementations that decide to drop calls to groom the RPC call backlog. The LifoCoDel queue does this, I believe, and with the Pluggable queue it's possible for 3rd party queue implementations to use `drop()` for similar reasons. It would be nice for the server to track these exceptions in metrics, since otherwise you might have to do some extra lifting on the client side.
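
As a rough sketch of the idea, server-side tracking could amount to a counter per exception type. Everything here is hypothetical and does not reflect the actual HBase metrics classes: `ExceptionMetricsSketch`, its methods, and the nested `CallDroppedException` stand-in are assumptions made for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration class; not the HBase metrics API.
final class ExceptionMetricsSketch {
  // Stand-in for the real exception type, so the sketch is self-contained.
  static final class CallDroppedException extends RuntimeException {
    CallDroppedException(String message) {
      super(message);
    }
  }

  // One counter per exception simple name, safe for concurrent handlers.
  private final ConcurrentHashMap<String, LongAdder> counts = new ConcurrentHashMap<>();

  // Record one occurrence of the given exception.
  void exception(Throwable t) {
    counts.computeIfAbsent(t.getClass().getSimpleName(), k -> new LongAdder()).increment();
  }

  // Current count for an exception name; 0 if never seen.
  long count(String simpleName) {
    LongAdder adder = counts.get(simpleName);
    return adder == null ? 0L : adder.sum();
  }

  public static void main(String[] args) {
    ExceptionMetricsSketch metrics = new ExceptionMetricsSketch();
    metrics.exception(new CallDroppedException("call dropped by queue"));
    System.out.println(metrics.count("CallDroppedException"));
  }
}
```

With a counter like this exposed by the server, an operator could see dropped calls directly rather than inferring them from client-side failures.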

Signed-off-by: Duo Zhang <zhangduo@apache.org>
Reviewed-by: Bryan Beaudreault <bbeaudreault@hubspot.com>
2021-12-30 00:05:04 +08:00
Andrew Purtell b1bc5f3a5c Renumber to 2.6.0-SNAPSHOT after branching branch-2.5
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2021-12-08 16:54:32 -08:00
Bryan Beaudreault aa3b07f6bb HBASE-26154: Adds exception metrics for QuotaExceededException and RpcThrottlingException (#3544)
Signed-off-by: Xiaolin Ha <haxiaolin@apache.org>
Signed-off-by: Pankaj Kumar <pankajkumar@apache.org>
2021-08-02 09:51:34 +05:30
Rushabh Shah e265eccf20 HBASE-25924 Re-compute size of WAL file while removing from WALEntryStream (#3315)
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
2021-05-26 10:42:03 -07:00
Rushabh Shah 90dc150b1b HBASE-25860 Add metric for successful wal roll requests. (#3238)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-05-08 13:02:52 +05:30
Baiqiang Zhao d575c11259 HBASE-25798 typo in MetricsAssertHelper (#3187)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2021-04-21 21:41:31 +08:00
Baiqiang Zhao 8ff17c68e2 HBASE-25687 Backport "HBASE-25681 Add a switch for server/table query… (#3074)
Signed-off-by: stack <stack@apache.org>
2021-04-07 11:11:46 -07:00
Sandeep Pal 72496272aa HBASE-25627: HBase replication should have a metric to represent if the source is stuck getting initialized (#3018)
Introduces a new metric that tracks number of replication sources that are stuck in initialization.

Signed-off-by: Xu Cang <xucang@apache.org>
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
(cherry picked from commit ff3821814a)
2021-03-17 10:30:26 -07:00
meiyi b5fc5e17e2 HBASE-25636 Expose HBCK report as metrics (#3031)
Signed-off-by: zhangduo <zhangduo@apache.org>
2021-03-11 15:15:23 +08:00
Rahul Kumar e57c73a137 HBASE-25460 : Expose drainingServers as cluster metric (#2994) (#2995)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2021-03-04 12:48:57 +05:30
shahrs87 6a4c9be967 HBASE-25539: Add age of oldest wal metric (#2962)
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
2021-02-18 20:59:07 -08:00
Viraj Jasani 0788547fea HBASE-25474 : Bump HBase version on branch-2 (#2871)
Signed-off-by: stack <stack@apache.org>
2021-01-12 10:20:22 +05:30
gkanade 024349bd5d HBASE-25026 Create a metric to track full region scans RPCs
Add new metric rpcFullScanRequestCount to track number of requests that are full region scans. Can be used to notify user to check if this is truly intended.

Signed-off-by: Anoop Sam John <anoopsamjohn@apache.org>
Signed-off-by: Ramkrishna S Vasudevan <ramkrishna@apache.org>
2020-11-19 09:55:33 +05:30
Reid Chan 70631e901f HBASE-25189 [Metrics] Add checkAndPut and checkAndDelete latency metrics at table level (#2549)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2020-10-25 17:47:58 +08:00
ramkrish86 11cbca1a5e HBASE-25135 Convert the internal separator while emitting the memstore read metrics to # (#2486) (#2489)
Signed-off-by: Anoop Sam John <anoopsamjohn@apache.org>
2020-10-01 18:18:31 +05:30
Bharath Vissapragada 505ceacb4b HBASE-25082: Per table WAL metrics: appendCount and appendSize (#2440)
Signed-off-by: Geoffrey Jacoby <gjacoby@apache.org>
Signed-off-by: Ankit Jain <jain.ankit@salesforce.com>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
(cherry picked from commit 56c7505f8f)
2020-09-23 21:07:44 -07:00
Javier Akira Luca de Tena cd66d8cba5 HBASE-24994 Add hedgedReadOpsInCurThread metric (#2367)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2020-09-11 13:49:11 +08:00
Toshihiro Suzuki 22bf9a38c9 HBASE-24680 Refactor the checkAndMutate code on the server side (#2184)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Josh Elser <elserj@apache.org>
2020-08-10 18:57:17 +09:00
Josh Elser 303db63b76 HBASE-24779 Report on the WAL edit buffer usage/limit for replication
Closes #2193

Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Sean Busbey <busbey@apache.org>
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
2020-08-07 14:33:30 -04:00
Viraj Jasani d3ec4886a1 Revert "HBASE-24615 MutableRangeHistogram#updateSnapshotRangeMetrics doesn't calculate the distribution for last bucket"
This reverts commit 9ab5282c3e.
2020-07-15 00:20:13 +05:30
WenFeiYi 9ab5282c3e HBASE-24615 MutableRangeHistogram#updateSnapshotRangeMetrics doesn't calculate the distribution for last bucket
Closes #1962

Signed-off-by: David Manning
Signed-off-by: Rushabh
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2020-07-15 00:10:43 +05:30
ramkrish86 ef809c198b HBASE-24205 - Create metric to know the number of reads that happens (#1920)
* HBASE-24205 - Create metric to know the number of reads that happens
from memstore (branch-2)

* Add the optimization as in master and fix whitestyle and checkstyle

* Fix compilation error that accidentally crept in

Authored-by: Ramkrishna <ramkrishna@apache.org>
Signed-off-by: Anoop Sam John <anoopsamjohn@gmail.com>
Signed-off-by: Viraj Jasani <virajjasani@apache.org>
2020-06-18 18:59:46 +05:30
Wellington Ramos Chevreuil 11d093bc39 HBASE-21406 "status 'replication'" should not show SINK if the cluste… (#1761)
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Josh Elser <elserj@apache.org>

(cherry picked from commit e5345b3a7c)
2020-06-03 09:33:36 +01:00
Sandeep Pal 1ff532678d HBASE-24350: Extending and Fixing HBaseTable level replication metrics (#1704)
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2020-05-14 13:36:11 -07:00
Duo Zhang dc2146069c HBASE-24309 Avoid introducing log4j and slf4j-log4j dependencies for … (#1697)
Signed-off-by: stack <stack@apache.org>
2020-05-13 17:59:21 +08:00
Bharath Vissapragada 9384b84552 HBASE-24075: Fix a race between master shutdown and metrics (re)init
JMXCacheBuster resets the metrics state at various points in time. These
events can potentially race with a master shutdown. When the master is
tearing down, metrics initialization can touch a lot of unsafe state,
for example invalidated FS objects. To avoid this, this patch makes
the getMetrics() a no-op when the master is either stopped or in the
process of shutting down. Additionally, getClusterId() when the server
is shutting down is made a no-op.
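
The guard described above can be sketched roughly as follows. This is an
illustration under stated assumptions, not the patch itself: the class
`ShutdownAwareMetricsSketch`, the `beginShutdown()` hook, the placeholder
cluster id, and the use of a `List<String>` in place of a real metrics
collector are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration class; not the HBase master metrics source.
final class ShutdownAwareMetricsSketch {
  // Set once the server is stopped or has started shutting down.
  private final AtomicBoolean stopping = new AtomicBoolean(false);

  void beginShutdown() {
    stopping.set(true);
  }

  // Appends metric records to out; a no-op once shutdown has begun, so that
  // state being torn down is never touched.
  void getMetrics(List<String> out) {
    if (stopping.get()) {
      return;
    }
    out.add("clusterId=" + getClusterId());
  }

  // Returns an empty id during shutdown instead of reading server state.
  String getClusterId() {
    if (stopping.get()) {
      return "";
    }
    return "example-cluster"; // placeholder value for the sketch
  }

  public static void main(String[] args) {
    ShutdownAwareMetricsSketch source = new ShutdownAwareMetricsSketch();
    List<String> records = new ArrayList<>();
    source.getMetrics(records);
    source.beginShutdown();
    source.getMetrics(records); // ignored after shutdown begins
    System.out.println(records.size());
  }
}
```

A flag check like this narrows the window for the race rather than
eliminating it; a caller that passed the check just before shutdown began
could still observe partially torn-down state.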

Simulating a test for this is a bit tricky but with the patch I don't
locally see the long stacktraces from the jira.

Signed-off-by: Michael Stack <stack@apache.org>
(cherry picked from commit 6f213e9d5a)
2020-04-01 10:14:34 -07:00
Wei-Chiu Chuang 8521207be4 HBASE-8868. add metric to report client shortcircuit reads. (#1334)
Signed-off-by: stack <stack@apache.net>
2020-03-24 15:31:34 -07:00
Nick Dimiduk ffb2359146 HBASE-24013 Bump branch-2 version to 2.4.0-SNAPSHOT (#1309)
Increment version in poms with

```
$ mvn org.codehaus.mojo:versions-maven-plugin:2.7:set -DnewVersion=2.4.0-SNAPSHOT -DgenerateBackupPoms=false
```

Verified no dangling references with

```
$ find . -iname '*pom.xml' -exec grep -n '2.3.0-SNAPSHOT' {} +
```

Verified build with

```
$ JAVA_HOME=/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home mvn clean package -DskipTests
$ JAVA_HOME=/Library/Java/JavaVirtualMachines/adoptopenjdk-11.jdk/Contents/Home mvn clean package -DskipTests -Dhadoop.profile=3.0
```

Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
2020-03-19 08:01:43 -07:00
Viraj Jasani 17652a7b32 HBASE-23590 : Update maxStoreFileRefCount to maxCompactedStoreFileRefCount for auto region recovery based on old reader references
Signed-off-by: Anoop Sam John <anoopsamjohn@apache.org>
2020-01-01 22:50:37 +05:30
Ankit Singhal 6e6c7b3c2d HBASE-23065 [hbtop] Top-N heavy hitter user and client drill downs
Signed-off-by: Toshihiro Suzuki <brfrn169@gmail.com>
Signed-off-by: Josh Elser <elserj@apache.org>
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2019-12-22 20:13:50 -08:00
Josh Elser 46a18833a0 HBASE-23082 Backport of low latency space quotas for hbase snapshots
Includes the following, incorporating HBASE-20439 and HBASE-20440, too.

1)
HBASE-18133 Decrease quota reaction latency by HBase

Certain operations in HBase are known to directly affect
the utilization of tables on HDFS. When these actions
occur, we can circumvent the normal path and notify the
Master directly. This results in a much faster response to
changes in HDFS usage.

This requires FS scanning by the RS to be decoupled from
the reporting of sizes to the Master. An API inside each
RS is made so that any operation can hook into this call
in the face of other operations (e.g. compaction, flush,
bulk load).

2)
HBASE-18135 Implement mechanism for RegionServers to report file archival for space quotas

This de-couples the snapshot size calculation from the
SpaceQuotaObserverChore into another API which both the periodically
invoked Master chore and the Master service endpoint can invoke. This
allows for multiple sources of snapshot size to be reported (from the
multiple sources we have in HBase).

When a file is archived, snapshot sizes can be more quickly realized and
the Master can still perform periodical computations of the total
snapshot size to account for any delayed/missing/lost file archival RPCs.

3)
HBASE-20531 RS may throw NPE when close meta regions in shutdown procedure.
2019-11-04 16:54:18 -05:00