Commit Graph

135 Commits

Author SHA1 Message Date
Zach York cad3d55892 HBASE-18520 Add jmx value to determine true Master Start time
This is to determine how long it took in total for the master to start and finish initializing.

Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-08-05 22:34:26 -07:00
Abhishek Singh Chouhan abb9d88dce HBASE-18374 RegionServer Metrics improvements 2017-07-31 12:47:11 +05:30
Abhishek Singh Chouhan 9278037108 HBASE-15134 Add visibility into Flush and Compaction queues 2017-07-28 13:21:04 +05:30
Andrew Purtell 3dd55fa0c0 Set versions on branch-1 to 1.5.0-SNAPSHOT 2017-07-03 18:01:15 -07:00
Enis Soztutar ea3075e7fd HBASE-15160 Put back HFile's HDFS op latency sampling code and add metrics for monitoring (Yu Li and Enis Soztutar) 2017-06-06 14:41:02 -07:00
Vincent a3c3f1012d HBASE-18060 Backport to branch-1 HBASE-9774 HBase native metrics and metric collection for coprocessors
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2017-05-24 13:20:44 -07:00
ckulkarni defc25c6d1 HBASE-17448 Export metrics from RecoverableZooKeeper
Added metrics for RecoverableZooKeeper related to specific exceptions,
total failed ZooKeeper API calls and latency histograms for read,
write and sync operations. Also added unit tests for the same. Added
service provider for the ZooKeeper metrics implementation inside the
hadoop compatibility module.

Signed-off-by: Andrew Purtell <apurtell@apache.org>

Conflicts:
	hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java
	hbase-metrics-api/src/main/java/org/apache/hadoop/hbase/metrics/PackageMarker.java
2017-04-26 18:14:53 -07:00
Ashu Pachauri 2eb810d0f7 HBASE-17627 Active workers metric for thrift (Ashu Pachauri)
Signed-off-by: Gary Helmling <garyh@apache.org>
2017-02-15 15:46:29 -08:00
Gary Helmling b123313f11 HBASE-17578 Thrift metrics should handle exceptions 2017-02-06 12:20:19 -08:00
Guanghao Zhang 682dd57cd6 HBASE-17205 Add a metric for the duration of region in transition 2016-12-01 10:32:24 -08:00
Guanghao Zhang 7b2673db12 HBASE-16561 Add metrics about read/write/scan queue length and active read/write/scan handler count
Signed-off-by: zhangduo <zhangduo@apache.org>
2016-11-29 16:09:56 +08:00
zhangduo be042652aa Revert "HBASE-16561 Add metrics about read/write/scan queue length and active read/write/scan handler count"
Forgot to add signoff

This reverts commit 5ec218dbc2.
2016-11-29 16:09:22 +08:00
Guanghao Zhang 5ec218dbc2 HBASE-16561 Add metrics about read/write/scan queue length and active read/write/scan handler count 2016-11-29 16:00:37 +08:00
Enis Soztutar 123d26ed90 HBASE-17017 Remove the current per-region latency histogram metrics 2016-11-08 18:31:12 -08:00
Dustin Pho 59ca4dad70 HBASE-16661 Add last major compaction age to per-region metrics
Signed-off-by: Gary Helmling <garyh@apache.org>
2016-10-10 15:21:53 -07:00
Sean Busbey df25ebf84f HBASE-15984 Handle premature EOF treatment of WALs in replication.
In some particular deployments, the Replication code believes it has
reached EOF for a WAL prior to successfully parsing all bytes known to
exist in a cleanly closed file.

Consistently this failure happens due to an InvalidProtobufException
after some number of seeks during our attempts to tail the in-progress
RegionServer WAL. As a work-around, this patch treats cleanly closed
files differently than other execution paths. If an EOF is detected due
to parsing or other errors while there are still unparsed bytes before
the end-of-file trailer, we now reset the WAL to the very beginning and
attempt a clean read-through.

In current testing, a single such reset is sufficient to work around the
observed data loss. However, the above change will retry a given WAL file
indefinitely. On each such attempt, a log message like the one below will
be emitted at the WARN level:

  Processing end of WAL file '{}'. At position {}, which is too far away
  from reported file length {}. Restarting WAL reading (see HBASE-15983
  for details).

Additionally, this patch adds log detail at the TRACE
level about file offsets seen while handling recoverable errors. It also
adds metrics that measure the use of this recovery mechanism.
2016-09-29 10:47:57 -05:00
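
Note on HBASE-15984 (df25ebf84f): the commit message above describes a decision made when the replication reader hits EOF on a WAL. The following is a minimal sketch of that decision only, not the actual HBase code. The names `WalReadState`, `handle_eof`, and the field names are hypothetical, and the "default" branch for files that were not cleanly closed is an assumption, since the message says only that cleanly closed files are treated differently from other execution paths.

```python
from dataclasses import dataclass


@dataclass
class WalReadState:
    """Hypothetical bookkeeping for one WAL file being tailed."""
    position: int      # byte offset reached when EOF was detected
    file_length: int   # reported length of the cleanly closed file
    trailer_size: int  # bytes taken up by the end-of-file trailer
    resets: int = 0    # stand-in for the metric counting restarts


def handle_eof(state: WalReadState, cleanly_closed: bool) -> str:
    """Decide what to do after an EOF (or parse error treated as EOF).

    Returns "reset" to re-read the file from the beginning, "advance"
    to move on to the next WAL, or "default" when the special handling
    does not apply (assumption: non-closed files keep their old path).
    """
    if not cleanly_closed:
        return "default"

    # Bytes that sit before the trailer but have not been parsed yet.
    unparsed = state.file_length - state.trailer_size - state.position
    if unparsed > 0:
        # Too far from the reported length: restart from offset 0.
        # There is no retry cap, matching the "indefinitely" in the message.
        state.resets += 1
        state.position = 0
        return "reset"

    return "advance"
```

Because the sketch has no retry cap, a file that fails the same way on every pass would be reset forever, which is why the commit pairs the mechanism with a WARN log line and metrics.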
Enis Soztutar 8a797e81b8 HBASE-16604 Scanner retries on IOException can cause the scans to miss data 2016-09-22 18:48:06 -07:00
Ashish Singhi 3606b890f8 HBASE-16471 Region Server metrics context will be wrong when machine hostname contain "master" word (Pankaj Kumar) 2016-08-24 19:01:58 +05:30
Geoffrey 6e9b49cac7 HBASE-16448 Custom metrics for custom replication endpoints
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2016-08-23 17:17:42 -07:00
Jingcheng Du 83c0cc109e HBASE-15353 Add metric for number of CallQueueTooBigException's 2016-06-24 14:39:53 +08:00
Gary Helmling 198165ef5b HBASE-16085 Add a metric for failed compactions 2016-06-23 16:04:27 -07:00
Enis Soztutar d07d316113 HBASE-15740 Replication source.shippedKBs metric is undercounting because it is in KB 2016-05-09 10:25:57 -07:00
Enis Soztutar 550d253ead HBASE-15671 Add per-table metrics on memstore, storefile and regionsize (Alicia Ying Shu) 2016-04-21 13:33:31 -07:00
Andrew Purtell e9acc104b7 HBASE-15663 Hook up JvmPauseMonitor to ThriftServer 2016-04-20 17:36:59 -07:00
Andrew Purtell db83e631ad HBASE-15662 Hook up JvmPauseMonitor to REST server 2016-04-20 17:36:59 -07:00
Andrew Purtell 780cff5886 HBASE-15614 Report metrics from JvmPauseMonitor 2016-04-20 17:36:59 -07:00
Enis Soztutar 1311e25171 HBASE-15518 Add Per-Table metrics back (Alicia Ying Shu) 2016-04-20 14:35:52 -07:00
tedyu b7502feff3 HBASE-15093 Replication can report incorrect size of log queue for the global source when multiwal is enabled (Ashu Pachauri) 2016-04-11 08:23:34 -07:00
Elliott Clark 75d46e4697 HBASE-14983 Create metrics for per block type hit/miss ratios
Summary: Missing a root index block is worse than missing a data block. We should know the difference.

Test Plan: Tested on a local instance. All numbers looked reasonable.

Differential Revision: https://reviews.facebook.net/D55563
2016-03-30 11:53:46 -07:00
Enis Soztutar d07230a759 HBASE-15412 Add average region size metric (Alicia Ying Shu)
Conflicts:
	hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java
	hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapperImpl.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapperStub.java
2016-03-22 14:50:14 -07:00
Enis Soztutar 249e37f83c HBASE-15464 Flush / Compaction metrics revisited
Conflicts:
	hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/MockRegionServerServices.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
2016-03-21 17:56:22 -07:00
Enis Soztutar 934c0274e3 HBASE-15377 Per-RS Get metric is time based, per-region metric is size-based (Heng Chen)
Conflicts:
	hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
	hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
2016-03-15 13:35:13 -07:00
Enis Soztutar 95421d276e HBASE-15435 Add WAL (in bytes) written metric (Alicia Ying Shu)
Conflicts:
	hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/wal/MetricsWALSource.java
	hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/wal/MetricsWALSourceImpl.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestMetricsWAL.java
2016-03-10 20:21:53 -08:00
chenheng bbb10c4c18 HBASE-15376 ScanNext metric is size-based while every other per-operation metric is time based
Conflicts:
	hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java
2016-03-07 17:48:37 +08:00
Mikhail Antonov 998e339d6e HBASE-15342 create branch-1.3 and update branch-1 poms to 1.4.0-SNAPSHOT 2016-02-28 16:23:29 -08:00
Mikhail Antonov 04a3b27330 HBASE-15136 Explore different queuing behaviors while busy 2016-02-24 20:42:23 -08:00
Elliott Clark 3352173ec8 HBASE-15222 Use less contended classes for metrics
Summary:
Use less contended things for metrics.
For histograms, which were the largest culprit, we now use FastLongHistogram.
For atomic longs, we now use counters where possible.

Test Plan: unit tests

Differential Revision: https://reviews.facebook.net/D54381
2016-02-24 14:47:00 -08:00
Mikhail Antonov 4784717235 HBASE-15135 Add metrics for storefile age 2016-02-22 02:43:24 -08:00
stack 0649755d34 HBASE-15163 Add sampling code and metrics for get/scan/multi/mutate count separately (Yu Li) 2016-02-06 06:36:47 -08:00
tedyu e7d935cbed HBASE-15068 Add metrics for region normalization plans 2016-01-07 03:28:50 -08:00
Lars Hofhansl 254af5a321 Merge branch 'branch-1' of https://git-wip-us.apache.org/repos/asf/hbase into branch-1 2016-01-05 15:54:49 -08:00
Elliott Clark 8508dd07ff HBASE-14946 Don't allow multi's to over run the max result size.
Summary:
* Add VersionInfoUtil to determine if a client has a specified version or better
* Add an exception type to say that the response should be chunked
* Add on client knowledge of retry exceptions
* Add on metrics for how often this happens

Test Plan: Added a unit test

Differential Revision: https://reviews.facebook.net/D51771
2015-12-10 18:31:05 -08:00
ramkrishna 65117d3d04 HBASE-13153 Bulk Loaded HFile Replication (Ashish Singhi) 2015-12-10 13:10:41 +05:30
Lars Hofhansl fba9e49dcf HBASE-14869 Better request latency and size histograms. (Vikas Vishwakarma and Lars Hofhansl)
Conflicts:

	hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java
2015-12-08 17:07:45 -08:00
Vrishal Kulkarni 2e5499ed6c HBASE-14719 Add metrics for master WAL count (numMasterWALs). Metric numMasterWALs appears as follows in metrics dump
{
    "name" : "Hadoop:service=HBase,name=Master,sub=Procedure",
    "modelerType" : "Master,sub=Procedure",
    "tag.Context" : "master",
    "tag.Hostname" : "vrishal-mbp",
    "numMasterWALs" : 1
},

Signed-off-by: Elliott Clark <eclark@apache.org>
2015-12-07 11:15:25 -08:00
Sanjeev Lakshmanan 2ce27951b0 HBASE-14862 Add support for reporting p90 for histogram metrics
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2015-11-23 15:56:07 -08:00
Elliott Clark 604c9b2cca HBASE-14793 Allow limiting size of block into L1 block cache. 2015-11-17 10:38:02 -08:00
Elliott Clark e8e0e86b49 HBASE-14778 Make block cache hit percentages not integer in the metrics system 2015-11-10 12:28:24 -08:00
stack 51f5412a92 Revert "HBASE-14725 Vet categorization of tests so they for sure go into the right small/medium/large buckets"
Reverting. It seems to have destabilized the build.

This reverts commit aaaa813225.
2015-11-02 08:16:59 -08:00
stack aaaa813225 HBASE-14725 Vet categorization of tests so they for sure go into the right small/medium/large buckets 2015-11-01 22:55:59 -08:00