107 Commits

Author SHA1 Message Date
Guanghao Zhang 682dd57cd6 HBASE-17205 Add a metric for the duration of region in transition 2016-12-01 10:32:24 -08:00
Guanghao Zhang 7b2673db12 HBASE-16561 Add metrics about read/write/scan queue length and active read/write/scan handler count
Signed-off-by: zhangduo <zhangduo@apache.org>
2016-11-29 16:09:56 +08:00
zhangduo be042652aa Revert "HBASE-16561 Add metrics about read/write/scan queue length and active read/write/scan handler count"
Forgot to add signoff.

This reverts commit 5ec218dbc2.
2016-11-29 16:09:22 +08:00
Guanghao Zhang 5ec218dbc2 HBASE-16561 Add metrics about read/write/scan queue length and active read/write/scan handler count 2016-11-29 16:00:37 +08:00
Enis Soztutar 123d26ed90 HBASE-17017 Remove the current per-region latency histogram metrics 2016-11-08 18:31:12 -08:00
Dustin Pho 59ca4dad70 HBASE-16661 Add last major compaction age to per-region metrics
Signed-off-by: Gary Helmling <garyh@apache.org>
2016-10-10 15:21:53 -07:00
Sean Busbey df25ebf84f HBASE-15984 Handle premature EOF treatment of WALs in replication.
In some particular deployments, the Replication code believes it has
reached EOF for a WAL prior to successfully parsing all bytes known to
exist in a cleanly closed file.

Consistently this failure happens due to an InvalidProtobufException
after some number of seeks during our attempts to tail the in-progress
RegionServer WAL. As a work-around, this patch treats cleanly closed
files differently than other execution paths. If an EOF is detected due
to parsing or other errors while there are still unparsed bytes before
the end-of-file trailer, we now reset the WAL to the very beginning and
attempt a clean read-through.

In current testing, a single such reset is sufficient to work around
observed data loss. However, the above change will retry a given WAL file
indefinitely. On each such attempt, a log message like the below will
be emitted at the WARN level:

  Processing end of WAL file '{}'. At position {}, which is too far away
  from reported file length {}. Restarting WAL reading (see HBASE-15983
  for details).

This patch also adds log detail at the TRACE level about file offsets
seen while handling recoverable errors, and adds metrics that measure
the use of this recovery mechanism.
2016-09-29 10:47:57 -05:00
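
The recovery rule described in the HBASE-15984 message above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the class `WalEofSketch`, the `Action` enum, and the `onEof` and `resetCount` names are all hypothetical, and the counter merely stands in for the metrics the commit says it adds.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; none of these names come from the HBase source. */
final class WalEofSketch {
    /** What the reader should do after hitting EOF on a cleanly closed WAL. */
    enum Action { FINISH, RESET_TO_START }

    // Stand-in for the "use of this recovery mechanism" metric.
    private final AtomicLong resets = new AtomicLong();

    /**
     * @param position     offset the reader had reached when EOF was reported
     * @param trailerStart offset at which the end-of-file trailer begins
     */
    Action onEof(long position, long trailerStart) {
        if (position < trailerStart) {
            // Unparsed bytes remain before the trailer: start over from the
            // beginning of the file and attempt a clean read-through.
            resets.incrementAndGet();
            return Action.RESET_TO_START;
        }
        // Everything before the trailer was parsed; the file is done.
        return Action.FINISH;
    }

    long resetCount() {
        return resets.get();
    }
}
```

As in the commit message, nothing in this sketch bounds the number of resets, so a caller would retry the same file for as long as `onEof` keeps returning `RESET_TO_START`.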
Enis Soztutar 8a797e81b8 HBASE-16604 Scanner retries on IOException can cause the scans to miss data 2016-09-22 18:48:06 -07:00
Ashish Singhi 3606b890f8 HBASE-16471 Region Server metrics context will be wrong when machine hostname contain "master" word (Pankaj Kumar) 2016-08-24 19:01:58 +05:30
Geoffrey 6e9b49cac7 HBASE-16448 Custom metrics for custom replication endpoints
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2016-08-23 17:17:42 -07:00
Jingcheng Du 83c0cc109e HBASE-15353 Add metric for number of CallQueueTooBigException's 2016-06-24 14:39:53 +08:00
Gary Helmling 198165ef5b HBASE-16085 Add a metric for failed compactions 2016-06-23 16:04:27 -07:00
Enis Soztutar d07d316113 HBASE-15740 Replication source.shippedKBs metric is undercounting because it is in KB 2016-05-09 10:25:57 -07:00
Enis Soztutar 550d253ead HBASE-15671 Add per-table metrics on memstore, storefile and regionsize (Alicia Ying Shu) 2016-04-21 13:33:31 -07:00
Andrew Purtell e9acc104b7 HBASE-15663 Hook up JvmPauseMonitor to ThriftServer 2016-04-20 17:36:59 -07:00
Andrew Purtell db83e631ad HBASE-15662 Hook up JvmPauseMonitor to REST server 2016-04-20 17:36:59 -07:00
Andrew Purtell 780cff5886 HBASE-15614 Report metrics from JvmPauseMonitor 2016-04-20 17:36:59 -07:00
Enis Soztutar 1311e25171 HBASE-15518 Add Per-Table metrics back (Alicia Ying Shu) 2016-04-20 14:35:52 -07:00
tedyu b7502feff3 HBASE-15093 Replication can report incorrect size of log queue for the global source when multiwal is enabled (Ashu Pachauri) 2016-04-11 08:23:34 -07:00
Elliott Clark 75d46e4697 HBASE-14983 Create metrics for per block type hit/miss ratios
Summary: Missing a root index block is worse than missing a data block. We should know the difference

Test Plan: Tested on a local instance. All numbers looked reasonable.

Differential Revision: https://reviews.facebook.net/D55563
2016-03-30 11:53:46 -07:00
Enis Soztutar d07230a759 HBASE-15412 Add average region size metric (Alicia Ying Shu)
Conflicts:
	hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java
	hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapperImpl.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapperStub.java
2016-03-22 14:50:14 -07:00
Enis Soztutar 249e37f83c HBASE-15464 Flush / Compaction metrics revisited
Conflicts:
	hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/MockRegionServerServices.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
2016-03-21 17:56:22 -07:00
Enis Soztutar 934c0274e3 HBASE-15377 Per-RS Get metric is time based, per-region metric is size-based (Heng Chen)
Conflicts:
	hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
	hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
2016-03-15 13:35:13 -07:00
Enis Soztutar 95421d276e HBASE-15435 Add WAL (in bytes) written metric (Alicia Ying Shu)
Conflicts:
	hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/wal/MetricsWALSource.java
	hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/wal/MetricsWALSourceImpl.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestMetricsWAL.java
2016-03-10 20:21:53 -08:00
chenheng bbb10c4c18 HBASE-15376 ScanNext metric is size-based while every other per-operation metric is time based
Conflicts:
	hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java
2016-03-07 17:48:37 +08:00
Mikhail Antonov 04a3b27330 HBASE-15136 Explore different queuing behaviors while busy 2016-02-24 20:42:23 -08:00
Elliott Clark 3352173ec8 HBASE-15222 Use less contended classes for metrics
Summary:
Use less contended things for metrics.
For histograms, which were the largest culprit, we use FastLongHistogram.
For atomic longs, we now use a counter where possible.

Test Plan: unit tests

Differential Revision: https://reviews.facebook.net/D54381
2016-02-24 14:47:00 -08:00
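
To illustrate the idea behind HBASE-15222 of replacing a contended atomic long with a counter, here is a small sketch using the JDK's `java.util.concurrent.atomic.LongAdder`. `LongAdder` is swapped in purely for illustration and is an assumption, not the class the commit uses; the class and method names below are hypothetical.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch; LongAdder stands in for a less contended counter. */
final class ContendedCounterSketch {
    /** Increments one shared counter from several threads and returns the total. */
    static long countWithLongAdder(int threads, int incrementsPerThread) {
        // LongAdder spreads updates over internal cells, so concurrent writers
        // rarely contend on a single memory location as they would with AtomicLong.
        LongAdder adder = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    adder.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        // sum() is exact here because every writer has finished.
        return adder.sum();
    }
}
```

The trade-off is that reads become more expensive and are only a snapshot while writers are active, which tends to suit metrics that are written often and read rarely.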
Mikhail Antonov 4784717235 HBASE-15135 Add metrics for storefile age 2016-02-22 02:43:24 -08:00
stack 0649755d34 HBASE-15163 Add sampling code and metrics for get/scan/multi/mutate count separately (Yu Li) 2016-02-06 06:36:47 -08:00
tedyu e7d935cbed HBASE-15068 Add metrics for region normalization plans 2016-01-07 03:28:50 -08:00
Lars Hofhansl 254af5a321 Merge branch 'branch-1' of https://git-wip-us.apache.org/repos/asf/hbase into branch-1 2016-01-05 15:54:49 -08:00
Elliott Clark 8508dd07ff HBASE-14946 Don't allow multi's to over run the max result size.
Summary:
* Add VersionInfoUtil to determine if a client has a specified version or better
* Add an exception type to say that the response should be chunked
* Add on client knowledge of retry exceptions
* Add on metrics for how often this happens

Test Plan: Added a unit test

Differential Revision: https://reviews.facebook.net/D51771
2015-12-10 18:31:05 -08:00
ramkrishna 65117d3d04 HBASE-13153 Bulk Loaded HFile Replication (Ashish Singhi) 2015-12-10 13:10:41 +05:30
Lars Hofhansl fba9e49dcf HBASE-14869 Better request latency and size histograms. (Vikas Vishwakarma and Lars Hofhansl)
Conflicts:

	hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java
2015-12-08 17:07:45 -08:00
Vrishal Kulkarni 2e5499ed6c HBASE-14719 Add metrics for master WAL count (numMasterWALs). Metric numMasterWALs appears as follows in metrics dump
{
    "name" : "Hadoop:service=HBase,name=Master,sub=Procedure",
    "modelerType" : "Master,sub=Procedure",
    "tag.Context" : "master",
    "tag.Hostname" : "vrishal-mbp",
    "numMasterWALs" : 1
},

Signed-off-by: Elliott Clark <eclark@apache.org>
2015-12-07 11:15:25 -08:00
Sanjeev Lakshmanan 2ce27951b0 HBASE-14862 Add support for reporting p90 for histogram metrics
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2015-11-23 15:56:07 -08:00
Elliott Clark 604c9b2cca HBASE-14793 Allow limiting size of block into L1 block cache. 2015-11-17 10:38:02 -08:00
Elliott Clark e8e0e86b49 HBASE-14778 Make block cache hit percentages not integer in the metrics system 2015-11-10 12:28:24 -08:00
stack 51f5412a92 Revert "HBASE-14725 Vet categorization of tests so they for sure go into the right small/medium/large buckets"
Reverting. It seems to have destabilized the build.

This reverts commit aaaa813225.
2015-11-02 08:16:59 -08:00
stack aaaa813225 HBASE-14725 Vet categorization of tests so they for sure go into the right small/medium/large buckets 2015-11-01 22:55:59 -08:00
Gary Helmling 85f2aee070 HBASE-14700 Support a permissive mode for secure clusters to allow SIMPLE auth clients 2015-10-30 20:06:23 -07:00
Sean Busbey 436bd5e823 HBASE-14516 categorize hadoop-compat tests
* make sure the test classifications are in test scope for their use in the hadoop-compat modules
* added a test category for 'metrics related' since that's what all these tests are for
* categorized tests as small,metrics

Conflicts:
	hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/master/TestMetricsMasterSourceImpl.java
2015-10-03 01:15:52 -05:00
Sanjeev Srivatsa d047c37871 HBASE-14459 Add response and request size metrics
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2015-09-30 18:23:17 -07:00
Andrew Purtell fc79ba338a HBASE-14205 RegionCoprocessorHost System.nanoTime() performance bottleneck 2015-09-24 11:16:47 -07:00
Enis Soztutar bb4a690b79 HBASE-14082 Add replica id to JMX metrics names (Lei Chen) 2015-09-16 17:43:32 -07:00
tedyu 00bdf14f96 HBASE-14314 Metrics for block cache should take region replicas into account 2015-09-09 13:32:02 -07:00
Elliott Clark d6c2beb4bf HBASE-14166 Per-Region metrics can be stale 2015-08-17 11:22:21 -07:00
tedyu 6a2b618d97 HBASE-13965 Stochastic Load Balancer JMX Metrics (Lei Chen) 2015-08-05 19:22:44 -07:00
tedyu 24dbe25e95 HBASE-13965 Revert due to test failure in TestAssignmentManager 2015-08-03 15:32:43 -07:00
tedyu c215b900f4 HBASE-13965 Stochastic Load Balancer JMX Metrics (Lei Chen) 2015-08-03 12:50:44 -07:00