hbase

Commit Graph

Author	SHA1	Message	Date
Sean Busbey	76396714e1	HBASE-15984 Handle premature EOF treatment of WALs in replication. In some particular deployments, the Replication code believes it has reached EOF for a WAL prior to successfully parsing all bytes known to exist in a cleanly closed file. Consistently this failure happens due to an InvalidProtobufException after some number of seeks during our attempts to tail the in-progress RegionServer WAL. As a work-around, this patch treats cleanly closed files differently than other execution paths. If an EOF is detected due to parsing or other errors while there are still unparsed bytes before the end-of-file trailer, we now reset the WAL to the very beginning and attempt a clean read-through. In current testing, a single such reset is sufficient to work around observed dataloss. However, the above change will retry a given WAL file indefinitely. On each such attempt, a log message like the below will be emitted at the WARN level: Processing end of WAL file '{}'. At position {}, which is too far away from reported file length {}. Restarting WAL reading (see HBASE-15983 for details). Additionally, this patch adds some additional log detail at the TRACE level about file offsets seen while handling recoverable errors. It also adds metrics that measure the use of this recovery mechanism.	2016-09-29 10:07:14 -05:00
Enis Soztutar	eb112783ae	HBASE-16604 Scanner retries on IOException can cause the scans to miss data - RECOMMIT after revert	2016-09-23 11:27:13 -07:00
Enis Soztutar	39db0cac78	Revert "HBASE-16604 Scanner retries on IOException can cause the scans to miss data" This reverts commit `83cf44cd3f`. Reverting because accidental files are committed with this.	2016-09-23 11:25:23 -07:00
Enis Soztutar	83cf44cd3f	HBASE-16604 Scanner retries on IOException can cause the scans to miss data	2016-09-22 12:06:11 -07:00
anoopsamjohn	1384c9a08d	HBASE-16650 Wrong usage of BlockCache eviction stat for heap memory tuning.	2016-09-22 21:28:30 +05:30
Ashish Singhi	31f16d6aec	HBASE-16471 Region Server metrics context will be wrong when machine hostname contain "master" word (Pankaj Kumar)	2016-08-24 18:59:44 +05:30
Geoffrey	cb02be38ab	HBASE-16448 Custom metrics for custom replication endpoints Signed-off-by: Andrew Purtell <apurtell@apache.org>	2016-08-23 17:17:08 -07:00
stack	69d170063f	HBASE-16181 Fix AssignmentManager MBean name (Reid Chan)	2016-07-29 16:40:39 -07:00
Reid Chan	abfd584fe6	HBASE-14743 Add metrics around HeapMemoryManager. (Reid Chan) Change-Id: I7305f7b7034b216930b5fb5c57de9ba5eabf96d8 Signed-off-by: Apekshit Sharma <appy@apache.org>	2016-07-25 14:32:38 -07:00
Apekshit Sharma	eff38ccf8c	Revert HBASE-14743 because of wrong attribution. Since I added commit message to the raw patch, it's making me as author instead of Reid. I should have used --author flag to set Reid as author. This reverts commit `064271da16`.	2016-07-25 14:32:38 -07:00
Apekshit Sharma	064271da16	HBASE-14743 Add metrics around HeapMemoryManager. (Reid Chan) Change-Id: I60b2435355b3e605e7d91cbf5aca5d2988f26f33	2016-07-25 13:45:50 -07:00
Jingcheng Du	518faa735b	HBASE-15353 Add metric for number of CallQueueTooBigException's	2016-06-24 13:00:06 +08:00
Gary Helmling	f4cec2e202	HBASE-16085 Add a metric for failed compactions	2016-06-23 15:38:58 -07:00
stack	3a95552cfe	HBASE-15948 Port "HADOOP-9956 RPC listener inefficiently assigns connections to readers" Adds HADOOP-9955 RPC idle connection closing is extremely inefficient Changes how we do accounting of Connections to match how it is done in Hadoop. Adds a ConnectionManager class. Adds new configurations for this new class. "hbase.ipc.client.idlethreshold" 4000 "hbase.ipc.client.connection.idle-scan-interval.ms" 10000 "hbase.ipc.client.connection.maxidletime" 10000 "hbase.ipc.client.kill.max", 10 "hbase.ipc.server.handler.queue.size", 100 The new scheme does away with synchronization that purportedly would freeze out reads while we were cleaning up stale connections (according to HADOOP-9955) Also adds in new mechanism for accepting Connections by pulling in as many as we can at a time adding them to a Queue instead of doing one at a time. Can help when bursty traffic according to HADOOP-9956. Removes a blocking while Reader is busy parsing a request. Adds configuration "hbase.ipc.server.read.connection-queue.size" with default of 100 for queue size. Signed-off-by: stack <stack@apache.org>	2016-06-07 16:42:21 -07:00
stack	e66ecd7db6	Revert "HBASE-15948 Port "HADOOP-9956 RPC listener inefficiently assigns connections to readers"" Revert mistaken commit... This reverts commit `e0b70c00e7`.	2016-06-07 16:41:30 -07:00
stack	6d5a25935e	Revert "HBASE-15967 Metric for active ipc Readers and make default fraction of cpu count" Revert mistaken commit This reverts commit `1125215aad`.	2016-06-07 16:41:01 -07:00
stack	1125215aad	HBASE-15967 Metric for active ipc Readers and make default fraction of cpu count Add new metric hbase.regionserver.ipc.runningReaders Also make it so Reader count is a factor of processor count	2016-06-07 13:10:14 -07:00
stack	e0b70c00e7	HBASE-15948 Port "HADOOP-9956 RPC listener inefficiently assigns connections to readers" Adds HADOOP-9955 RPC idle connection closing is extremely inefficient Then removes queue added by HADOOP-9956 at Enis suggestion Changes how we do accounting of Connections to match how it is done in Hadoop. Adds a ConnectionManager class. Adds new configurations for this new class. "hbase.ipc.client.idlethreshold" 4000 "hbase.ipc.client.connection.idle-scan-interval.ms" 10000 "hbase.ipc.client.connection.maxidletime" 10000 "hbase.ipc.client.kill.max", 10 "hbase.ipc.server.handler.queue.size", 100 The new scheme does away with synchronization that purportedly would freeze out reads while we were cleaning up stale connections (according to HADOOP-9955) Also adds in new mechanism for accepting Connections by pulling in as many as we can at a time adding them to a Queue instead of doing one at a time. Can help when bursty traffic according to HADOOP-9956. Removes a blocking while Reader is busy parsing a request. Adds configuration "hbase.ipc.server.read.connection-queue.size" with default of 100 for queue size.	2016-06-07 13:10:14 -07:00
Enis Soztutar	b75b226804	HBASE-15740 Replication source.shippedKBs metric is undercounting because it is in KB	2016-05-09 10:25:49 -07:00
Alex Moundalexis	0bf065a5d5	HBASE-15768 fix capitalization of ZooKeeper usage Signed-off-by: Sean Busbey <busbey@apache.org>	2016-05-05 15:35:44 -05:00
Enis Soztutar	4c0587134a	HBASE-15671 Add per-table metrics on memstore, storefile and regionsize (Alicia Ying Shu)	2016-04-21 13:33:26 -07:00
Andrew Purtell	b6617b4eb9	HBASE-15663 Hook up JvmPauseMonitor to ThriftServer	2016-04-20 17:37:35 -07:00
Andrew Purtell	a330a2b505	HBASE-15662 Hook up JvmPauseMonitor to REST server	2016-04-20 17:37:35 -07:00
Andrew Purtell	2c26fe37ac	HBASE-15614 Report metrics from JvmPauseMonitor	2016-04-20 17:37:34 -07:00
Enis Soztutar	18d70bc680	HBASE-15518 Add Per-Table metrics back (Alicia Ying Shu)	2016-04-20 14:35:45 -07:00
tedyu	8541fe4ad1	HBASE-15093 Replication can report incorrect size of log queue for the global source when multiwal is enabled (Ashu Pachauri)	2016-04-11 08:17:20 -07:00
Elliott Clark	a71ce6e738	HBASE-14983 Create metrics for per block type hit/miss ratios Summary: Missing a root index block is worse than missing a data block. We should know the difference Test Plan: Tested on a local instance. All numbers looked reasonable. Differential Revision: https://reviews.facebook.net/D55563	2016-03-30 11:41:11 -07:00
Enis Soztutar	b3fe4ed16c	HBASE-15412 Add average region size metric (Alicia Ying Shu)	2016-03-22 14:46:27 -07:00
Enis Soztutar	797562e6c3	HBASE-15464 Flush / Compaction metrics revisited	2016-03-21 17:50:02 -07:00
Enis Soztutar	51259fe4a5	HBASE-15377 Per-RS Get metric is time based, per-region metric is size-based (Heng Chen)	2016-03-15 11:22:18 -07:00
Enis Soztutar	a979d85582	HBASE-15435 Add WAL (in bytes) written metric (Alicia Ying Shu)	2016-03-10 20:16:30 -08:00
chenheng	f30afa05d9	HBASE-15376 ScanNext metric is size-based while every other per-operation metric is time based	2016-03-07 17:36:40 +08:00
Jonathan M Hsieh	f658f3ef83	HBASE-15356 Remove unused imports (Youngjoon Kim)	2016-03-03 11:42:38 -08:00
Mikhail Antonov	43f99def67	HBASE-15136 Explore different queuing behaviors while busy	2016-02-24 20:41:30 -08:00
Elliott Clark	630a65825e	HBASE-15222 Use less contended classes for metrics Summary: Use less contended things for metrics. For histogram which was the largest culprit we use FastLongHistogram For atomic long where possible we now use counter. Test Plan: unit tests Reviewers: Subscribers: Differential Revision: https://reviews.facebook.net/D54381	2016-02-24 14:34:05 -08:00
Mikhail Antonov	e58c0385a7	HBASE-15135 Add metrics for storefile age	2016-02-22 02:21:02 -08:00
stack	eacf7bcf97	HBASE-15163 Add sampling code and metrics for get/scan/multi/mutate count separately (Yu Li)	2016-02-06 06:30:56 -08:00
chenheng	8f20bc748d	HBASE-15197 Expose filtered read requests metric to metrics framework and Web UI (Eungsop Yoo)	2016-02-05 10:57:14 +08:00
tedyu	5266b07708	HBASE-15068 Add metrics for region normalization plans	2016-01-07 03:13:16 -08:00
Elliott Clark	48e217a7db	HBASE-14946 Don't allow multi's to over run the max result size. Summary: * Add VersionInfoUtil to determine if a client has a specified version or better * Add an exception type to say that the response should be chunked * Add on client knowledge of retry exceptions * Add on metrics for how often this happens Test Plan: Added a unit test Differential Revision: https://reviews.facebook.net/D51771	2015-12-10 18:10:32 -08:00
ramkrishna	26ac60b03f	HBASE-13153 Bulk Loaded HFile Replication (Ashish Singhi)	2015-12-10 13:07:46 +05:30
Lars Hofhansl	7bfbb6a3c9	HBASE-14869 Better request latency and size histograms. (Vikas Vishwakarma and Lars Hofhansl)	2015-12-08 17:02:27 -08:00
Vrishal Kulkarni	1f999c1e2b	HBASE-14719 Add metrics for master WAL count (numMasterWALs). Metric numMasterWALs appears as follows in metrics dump { "name" : "Hadoop:service=HBase,name=Master,sub=Procedure", "modelerType" : "Master,sub=Procedure", "tag.Context" : "master", "tag.Hostname" : "vrishal-mbp", "numMasterWALs" : 1 }, Signed-off-by: Elliott Clark <eclark@apache.org>	2015-12-07 11:14:29 -08:00
stack	51503efcf0	HBASE-13857 Slow WAL Append count in ServerMetricsTmpl.jamon is hardcoded to zero (Vrishal Kulkarni)	2015-12-03 17:00:29 -08:00
Sanjeev Lakshmanan	6b11adbfa4	HBASE-14862 Add support for reporting p90 for histogram metrics Signed-off-by: Andrew Purtell <apurtell@apache.org>	2015-11-23 15:55:45 -08:00
Elliott Clark	a48d30984a	HBASE-14793 Allow limiting size of block into L1 block cache.	2015-11-17 10:37:49 -08:00
Elliott Clark	ea795213b2	HBASE-14778 Make block cache hit percentages not integer in the metrics system	2015-11-10 12:25:59 -08:00
stack	9630fec2d5	Revert "HBASE-14725 Vet categorization of tests so they for sure go into the right small/medium/large buckets" Revert. Seems to have destabilized the build This reverts commit `6dbb5a8052`.	2015-11-02 08:17:41 -08:00
stack	6dbb5a8052	HBASE-14725 Vet categorization of tests so they for sure go into the right small/medium/large buckets	2015-11-01 22:26:43 -08:00
Gary Helmling	683f84e6a2	HBASE-14700 Support a permissive mode for secure clusters to allow SIMPLE auth clients	2015-10-30 19:45:46 -07:00

138 Commits