Commit Graph

156 Commits

Author SHA1 Message Date
Sean Busbey 76396714e1 HBASE-15984 Handle premature EOF treatment of WALs in replication.
In some particular deployments, the Replication code believes it has
reached EOF for a WAL prior to successfully parsing all bytes known to
exist in a cleanly closed file.

Consistently this failure happens due to an InvalidProtobufException
after some number of seeks during our attempts to tail the in-progress
RegionServer WAL. As a work-around, this patch treats cleanly closed
files differently than other execution paths. If an EOF is detected due
to parsing or other errors while there are still unparsed bytes before
the end-of-file trailer, we now reset the WAL to the very beginning and
attempt a clean read-through.

In current testing, a single such reset is sufficient to work around
observed data loss. However, the above change will retry a given WAL file
indefinitely. On each such attempt, a log message like the one below will
be emitted at the WARN level:

  Processing end of WAL file '{}'. At position {}, which is too far away
  from reported file length {}. Restarting WAL reading (see HBASE-15983
  for details).

Additionally, this patch adds log detail at the TRACE
level about file offsets seen while handling recoverable errors. It also
adds metrics that measure the use of this recovery mechanism.
2016-09-29 10:07:14 -05:00
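
The decision described in the HBASE-15984 message above can be sketched roughly as follows. This is an illustration only, not the code from the patch: the class, enum, and parameter names are hypothetical, and the handling of files that are not cleanly closed (RETRY_LATER) is an assumption, since the message only says those paths are treated differently.

```java
// Hypothetical sketch of the "premature EOF" decision; names are not from HBase.
class WalEofSketch {
    enum Action { FINISH, RESTART_FROM_BEGINNING, RETRY_LATER }

    /**
     * @param cleanlyClosed whether the WAL file was closed cleanly (has a trailer)
     * @param position      offset the reader had reached when it saw EOF or a parse error
     * @param trailerStart  offset where the end-of-file trailer begins
     */
    static Action onEof(boolean cleanlyClosed, long position, long trailerStart) {
        if (!cleanlyClosed) {
            // Assumption: an in-progress file may still grow, so try again later.
            return Action.RETRY_LATER;
        }
        if (position < trailerStart) {
            // Unparsed bytes remain before the trailer: reset and re-read the file.
            return Action.RESTART_FROM_BEGINNING;
        }
        // Every byte before the trailer was parsed; the file is done.
        return Action.FINISH;
    }

    public static void main(String[] args) {
        System.out.println(onEof(true, 100L, 500L));
        System.out.println(onEof(true, 500L, 500L));
        System.out.println(onEof(false, 100L, 500L));
    }
}
```

Because nothing bounds the number of restarts, a caller built on this sketch would retry the same file indefinitely, which matches the behavior the message warns about.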
Enis Soztutar eb112783ae HBASE-16604 Scanner retries on IOException can cause the scans to miss data - RECOMMIT after revert 2016-09-23 11:27:13 -07:00
Enis Soztutar 39db0cac78 Revert "HBASE-16604 Scanner retries on IOException can cause the scans to miss data"
This reverts commit 83cf44cd3f.

Reverting because accidental files are committed with this.
2016-09-23 11:25:23 -07:00
Enis Soztutar 83cf44cd3f HBASE-16604 Scanner retries on IOException can cause the scans to miss data 2016-09-22 12:06:11 -07:00
zhangduo 6eb6225456 HBASE-7612 [JDK8] Replace use of high-scale-lib counters with intrinsic facilities 2016-09-19 13:37:24 +08:00
Geoffrey cb02be38ab HBASE-16448 Custom metrics for custom replication endpoints
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2016-08-23 17:17:08 -07:00
Reid Chan abfd584fe6 HBASE-14743 Add metrics around HeapMemoryManager. (Reid Chan)
Change-Id: I7305f7b7034b216930b5fb5c57de9ba5eabf96d8

Signed-off-by: Apekshit Sharma <appy@apache.org>
2016-07-25 14:32:38 -07:00
Apekshit Sharma eff38ccf8c Revert HBASE-14743 because of wrong attribution. Since I added the commit message to the raw patch, it made me the author instead of Reid. I should have used the --author flag to set Reid as author.
This reverts commit 064271da16.
2016-07-25 14:32:38 -07:00
Apekshit Sharma 064271da16 HBASE-14743 Add metrics around HeapMemoryManager. (Reid Chan)
Change-Id: I60b2435355b3e605e7d91cbf5aca5d2988f26f33
2016-07-25 13:45:50 -07:00
Enis Soztutar 6d94925af9 HBASE-16211 JMXCacheBuster restarting the metrics system might cause tests to hang 2016-07-12 13:43:52 -07:00
Jingcheng Du 518faa735b HBASE-15353 Add metric for number of CallQueueTooBigException's 2016-06-24 13:00:06 +08:00
Gary Helmling f4cec2e202 HBASE-16085 Add a metric for failed compactions 2016-06-23 15:38:58 -07:00
stack 6d5a25935e Revert "HBASE-15967 Metric for active ipc Readers and make default fraction of cpu count"
Revert mistaken commit
This reverts commit 1125215aad.
2016-06-07 16:41:01 -07:00
stack 1125215aad HBASE-15967 Metric for active ipc Readers and make default fraction of cpu count
Add new metric hbase.regionserver.ipc.runningReaders
Also make it so Reader count is a factor of processor count
2016-06-07 13:10:14 -07:00
Sean Mackrory 3b6e6e6c25 HBASE-15889. String case conversions are locale-sensitive, used without locale
Signed-off-by: Sean Busbey <busbey@apache.org>
2016-05-28 10:41:31 -07:00
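
For context on the class of bug HBASE-15889 names, here is a small sketch of how a case conversion without an explicit locale can change results. It is not taken from the patch; the class and method names are hypothetical. It relies only on the standard String.toLowerCase(Locale) behavior, where a Turkish locale maps 'I' to a dotless 'ı'.

```java
import java.util.Locale;

// Hypothetical demo class; shows why identifiers should be lower-cased with Locale.ROOT.
class LocaleCaseSketch {
    // Locale-independent conversion, suitable for config keys and identifiers.
    static String lowerRoot(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    // Conversion under an explicit locale, to show how the result can differ.
    static String lowerIn(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        System.out.println(lowerRoot("TITLE"));        // title
        System.out.println(lowerIn("TITLE", turkish)); // t\u0131tle (dotless i)
    }
}
```

A call to toLowerCase() with no argument uses the JVM default locale, so on a machine configured for Turkish it behaves like the second call.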
Enis Soztutar b75b226804 HBASE-15740 Replication source.shippedKBs metric is undercounting because it is in KB 2016-05-09 10:25:49 -07:00
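
One way a KB-denominated counter can undercount, as the HBASE-15740 title above suggests, is integer truncation when each batch is converted to KB before being added. The sketch below assumes that cause; the commit title does not spell it out, and the class and method names are hypothetical.

```java
// Hypothetical sketch: per-batch KB conversion versus accumulating raw bytes.
class ShippedSizeSketch {
    // Converts each batch to whole KB first; batches under 1024 bytes add nothing.
    static long sumPerBatchKb(long[] batchBytes) {
        long kb = 0;
        for (long b : batchBytes) {
            kb += b / 1024; // integer division truncates toward zero
        }
        return kb;
    }

    // Accumulates bytes and leaves unit conversion to whoever reads the metric.
    static long sumBytes(long[] batchBytes) {
        long total = 0;
        for (long b : batchBytes) {
            total += b;
        }
        return total;
    }

    public static void main(String[] args) {
        long[] batches = {1000L, 1000L, 1000L};
        System.out.println(sumPerBatchKb(batches)); // 0
        System.out.println(sumBytes(batches));      // 3000
    }
}
```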
tedyu f76ffb7f38 HBASE-15742 Reduce allocation of objects in metrics (Phil Yang) 2016-05-03 08:58:00 -07:00
Enis Soztutar 4c0587134a HBASE-15671 Add per-table metrics on memstore, storefile and regionsize (Alicia Ying Shu) 2016-04-21 13:33:26 -07:00
Andrew Purtell b6617b4eb9 HBASE-15663 Hook up JvmPauseMonitor to ThriftServer 2016-04-20 17:37:35 -07:00
Andrew Purtell a330a2b505 HBASE-15662 Hook up JvmPauseMonitor to REST server 2016-04-20 17:37:35 -07:00
Andrew Purtell 2c26fe37ac HBASE-15614 Report metrics from JvmPauseMonitor 2016-04-20 17:37:34 -07:00
Enis Soztutar 18d70bc680 HBASE-15518 Add Per-Table metrics back (Alicia Ying Shu) 2016-04-20 14:35:45 -07:00
tedyu 8541fe4ad1 HBASE-15093 Replication can report incorrect size of log queue for the global source when multiwal is enabled (Ashu Pachauri) 2016-04-11 08:17:20 -07:00
Elliott Clark a71ce6e738 HBASE-14983 Create metrics for per block type hit/miss ratios
Summary: Missing a root index block is worse than missing a data block. We should know the difference

Test Plan: Tested on a local instance. All numbers looked reasonable.

Differential Revision: https://reviews.facebook.net/D55563
2016-03-30 11:41:11 -07:00
Enis Soztutar b3fe4ed16c HBASE-15412 Add average region size metric (Alicia Ying Shu) 2016-03-22 14:46:27 -07:00
Enis Soztutar 797562e6c3 HBASE-15464 Flush / Compaction metrics revisited 2016-03-21 17:50:02 -07:00
Enis Soztutar 51259fe4a5 HBASE-15377 Per-RS Get metric is time based, per-region metric is size-based (Heng Chen) 2016-03-15 11:22:18 -07:00
Enis Soztutar a979d85582 HBASE-15435 Add WAL (in bytes) written metric (Alicia Ying Shu) 2016-03-10 20:16:30 -08:00
chenheng f30afa05d9 HBASE-15376 ScanNext metric is size-based while every other per-operation metric is time based 2016-03-07 17:36:40 +08:00
Jonathan M Hsieh f658f3ef83 HBASE-15356 Remove unused imports (Youngjoon Kim) 2016-03-03 11:42:38 -08:00
Elliott Clark 77133fd225 HBASE-15222 Addendum - Use less contended classes for metrics 2016-02-25 09:08:11 -08:00
Mikhail Antonov 43f99def67 HBASE-15136 Explore different queuing behaviors while busy 2016-02-24 20:41:30 -08:00
Elliott Clark a3b4575f70 HBASE-15319 clearJmxCache does not take effect actually 2016-02-24 16:29:05 -08:00
Elliott Clark 630a65825e HBASE-15222 Use less contended classes for metrics
Summary:
Use less contended things for metrics.
For histograms, which were the largest culprit, we use FastLongHistogram.
For atomic longs, where possible, we now use counters.

Test Plan: unit tests

Differential Revision: https://reviews.facebook.net/D54381
2016-02-24 14:34:05 -08:00
Mikhail Antonov e58c0385a7 HBASE-15135 Add metrics for storefile age 2016-02-22 02:21:02 -08:00
stack eacf7bcf97 HBASE-15163 Add sampling code and metrics for get/scan/multi/mutate count separately (Yu Li) 2016-02-06 06:30:56 -08:00
chenheng 8f20bc748d HBASE-15197 Expose filtered read requests metric to metrics framework and Web UI (Eungsop Yoo) 2016-02-05 10:57:14 +08:00
tedyu 5266b07708 HBASE-15068 Add metrics for region normalization plans 2016-01-07 03:13:16 -08:00
Mikhail Antonov abe30b52a8 HBASE-14534 Bump yammer/coda/dropwizard metrics dependency version 2015-12-15 12:11:27 -08:00
Elliott Clark 48e217a7db HBASE-14946 Don't allow multi's to over run the max result size.
Summary:
* Add VersionInfoUtil to determine if a client has a specified version or better
* Add an exception type to say that the response should be chunked
* Add on client knowledge of retry exceptions
* Add on metrics for how often this happens

Test Plan: Added a unit test

Differential Revision: https://reviews.facebook.net/D51771
2015-12-10 18:10:32 -08:00
ramkrishna 26ac60b03f HBASE-13153 Bulk Loaded HFile Replication (Ashish Singhi) 2015-12-10 13:07:46 +05:30
Lars Hofhansl 7bfbb6a3c9 HBASE-14869 Better request latency and size histograms. (Vikas Vishwakarma and Lars Hofhansl) 2015-12-08 17:02:27 -08:00
Vrishal Kulkarni 1f999c1e2b HBASE-14719 Add metrics for master WAL count (numMasterWALs). Metric numMasterWALs appears as follows in metrics dump
{
    "name" : "Hadoop:service=HBase,name=Master,sub=Procedure",
    "modelerType" : "Master,sub=Procedure",
    "tag.Context" : "master",
    "tag.Hostname" : "vrishal-mbp",
    "numMasterWALs" : 1
},

Signed-off-by: Elliott Clark <eclark@apache.org>
2015-12-07 11:14:29 -08:00
stack 51503efcf0 HBASE-13857 Slow WAL Append count in ServerMetricsTmpl.jamon is hardcoded to zero (Vrishal Kulkarni) 2015-12-03 17:00:29 -08:00
Sanjeev Lakshmanan 6b11adbfa4 HBASE-14862 Add support for reporting p90 for histogram metrics
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2015-11-23 15:55:45 -08:00
Elliott Clark a48d30984a HBASE-14793 Allow limiting size of block into L1 block cache. 2015-11-17 10:37:49 -08:00
Gary Helmling 683f84e6a2 HBASE-14700 Support a permissive mode for secure clusters to allow SIMPLE auth clients 2015-10-30 19:45:46 -07:00
Matteo Bertozzi 36a196722c HBASE-14669 remove unused import and fix javadoc 2015-10-22 09:25:26 -07:00
Sean Busbey a545d71295 HBASE-14516 categorize hadoop-compat tests
* make sure the test classifications are in test scope for their use in the hadoop-compat modules
* added a test category for 'metrics related' since that's what all these tests are for
* categorized tests as small,metrics
2015-10-03 01:08:53 -05:00
Sanjeev Srivatsa 76463a36f5 HBASE-14459 Add response and request size metrics
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2015-09-30 18:22:48 -07:00