15447 Commits

Author SHA1 Message Date
Michael Stack
9d34b4581c
HBASE-21242 [amv2] Miscellaneous minor log and assign procedure create improvements
For RIT Duration, do better than print ms/seconds. Remove redundant UI
column dedicated to duration when we log it in the status field too.

Make bypass log at INFO level.

Make it so on completion of a subprocedure, we note the count of outstanding
siblings so we have a clue how much further the parent has to go before
it is done (Helpful when hundreds of servers doing SCP).

Have the SCP run the AP preflight check before creating an AP; saves
creation of thousands of APs during fixup.

Don't log tablename three times when reporting remote call failed.

If lock is held already, note who has it. Also log after we get lock
or if we have to wait rather than log on entrance though we may
later have to wait (or we may have just picked up the lock).

Signed-off-by: Mike Drob <mdrob@apache.org>
2018-10-04 17:18:13 -07:00
Michael Stack
8fc90a23ae
HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign
Adds override to assigns and unassigns. Renames the bypass 'force'
param to 'override' to align.

Adds recursive to 'bypass', a means of calling bypass on
parent and its subprocedures (usually bypass works on
leaf nodes rippling the bypass up to parent -- recursive
has us work in the opposite direction): EXPERIMENTAL.

bypass on an assign/unassign leaves region in RIT and the
RegionStateNode loaded with the bypassed procedure. First
implementation had assign/unassign cleanup leftover state.
Second implementation, on feedback, keeps the state in place
as a fence against other Procedures assuming the region entity,
and instead adds an 'override' function that hbck2 can set on
assigns/unassigns to override the fencing.

Note that the below also converts ProcedureExceptions that
come out of the Pv2 system into DoNotRetryIOEs. It is a
little awkward because DNRIOE is in client-module, not
in procedure module. Previously, we'd just keep retrying
the bypass, etc.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
 Have bypass take an environment like all other methods so subclasses.
 Fix javadoc issues.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
 Javadoc issues. Pass environment when we invoke bypass.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 Rename waitUntilNamespace... etc. to align with how these method types
 are named elsewhere, i.e. waitFor rather than waitUntil.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 Cleanup message we emit when we find an existing procedure working
 against this entity.
 Add support for a force function which allows Assigns/Unassigns force
 ownership of the Region entity.

A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestRegionBypass.java
 Test bypass and force.

M hbase-shell/src/main/ruby/shell/commands/list_procedures.rb
 Minor cleanup of the json output... do iso8601 timestamps.
2018-10-04 16:37:37 -07:00
Wellington Chevreuil
b0ac1c6aba HBASE-21185 - WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes
Signed-off-by: Allan Yang <allan163@apache.org>
2018-10-04 03:28:21 -07:00
Xu Cang
3df8b6f7bb
HBASE-18549 Add metrics for failed replication queue recovery
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-10-01 18:39:07 -07:00
Andrew Purtell
f9d7ac2d5e
HBASE-21261 Add log4j.properties for hbase-rsgroup tests 2018-10-01 18:09:00 -07:00
Xu Cang
76a487c062
HBASE-19275 TestSnapshotFileCache never worked properly
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-10-01 17:12:21 -07:00
Michael Stack
259d12f739 Revert "Revert "Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"""
This reverts commit 2174461cf729b61a844950278773ee0802ced158.

Revert because not ready to port to other branches.
2018-09-29 04:06:46 -07:00
Michael Stack
2174461cf7 Revert "Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign""
This reverts commit b96905d1df93aea0bc5b0e1ab074954e57b0dcc4.

i.e. a revert of a revert so a reapplication!

Revert so I can add signed-off-by....

Signed-off-by: Allan Yang <allan163@apache.org>
2018-09-29 03:34:36 -07:00
Michael Stack
b96905d1df Revert "HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"
This reverts commit b42d7978cbb0d2b02eb5552a2f344cb128092b1e.
2018-09-29 03:34:10 -07:00
Michael Stack
b42d7978cb HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign
bypass on an assign/unassign leaves region in RIT and the
RegionStateNode loaded with the bypassed procedure. First
implementation had assign/unassign cleanup leftover state.
Second implementation, on feedback, keeps the state in place
as a fence against other Procedures assuming the region entity,
and instead adds an 'override' function that hbck2 can set on
assigns/unassigns to override the fencing.

Note that the below also converts ProcedureExceptions that
come out of the Pv2 system into DoNotRetryIOEs. It is a
little awkward because DNRIOE is in client-module, not
in procedure module. Previously, we'd just keep retrying
the bypass, etc.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
 Have bypass take an environment like all other methods so subclasses.
 Fix javadoc issues.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
 Javadoc issues. Pass environment when we invoke bypass.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 Rename waitUntilNamespace... etc. to align with how these method types
 are named elsewhere, i.e. waitFor rather than waitUntil.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 Cleanup message we emit when we find an existing procedure working
 against this entity.
 Add support for a force function which allows Assigns/Unassigns force
 ownership of the Region entity.

A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestRegionBypass.java
 Test bypass and force.

M hbase-shell/src/main/ruby/shell/commands/list_procedures.rb
 Minor cleanup of the json output... do iso8601 timestamps.
2018-09-29 03:33:07 -07:00
zhangduo
1f90d00614 HBASE-21248 Implement exponential backoff when retrying for ModifyPeerProcedure 2018-09-29 13:26:28 +08:00
Nihal Jain
c41003f5e6
HBASE-21196 HTableMultiplexer clears the meta cache after every put operation
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-09-28 16:35:57 -07:00
Kiran Kumar Maturi
b7c2b953bc
HBASE-20857 balancer status tag in jmx metrics
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-09-28 16:12:11 -07:00
Archana Katiyar
209d0a8a16
HBASE-21207 Add client side sorting functionality in master web UI for table and region server details
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-09-28 15:40:43 -07:00
ramie-raufdeen
e44ed1b1ef
HBASE-19418 configurable range of delay in PeriodicMemstoreFlusher
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-09-28 14:39:52 -07:00
xcang
e26a6e0e10
HBASE-18451 PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request, fix logging
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2018-09-28 11:50:24 -07:00
meiyi
8dea600795 HBASE-21249 Add jitter for ProcedureUtil.getBackoffTimeMs
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-09-28 21:28:16 +08:00
Allan Yang
f6c05faccf Revert "HBASE-21237 Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS" 2018-09-28 14:07:40 +08:00
zhangduo
4947e72f63 HBASE-21233 Allow the procedure implementation to skip persistence of the state after a execution 2018-09-28 11:14:49 +08:00
Allan Yang
0290f57c3a HBASE-21237 Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS 2018-09-28 09:41:31 +08:00
Allan Yang
eb27251265 HBASE-21228 Memory leak since AbstractFSWAL caches Thread object and never clean later 2018-09-27 15:07:07 +08:00
Michael Stack
5169cfc8c3 HBASE-21232 Show table state in Tables view on Master home page 2018-09-26 10:57:23 -07:00
Zach York
504286d55c HBASE-20734 Colocate recovered edits directory with hbase.wal.dir
Amending-Author: Reid Chan <reidchan@apache.org>
Signed-off-by: Reid Chan <reidchan@apache.org>
2018-09-26 19:37:53 +08:00
Allan Yang
ba8a252167 HBASE-21212 Wrong flush time when update flush metric 2018-09-26 19:11:23 +08:00
Chia-Ping Tsai
a4e72544f7 HBASE-21208 Bytes#toShort doesn't work without unsafe
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: anoopsamjohn <anoopsamjohn@gmail.com>
Signed-off-by: Reid Chan <reidchan@apache.org>
2018-09-26 18:19:19 +08:00
Mingliang Liu
fea75742b4
HBASE-21164 reportForDuty should do backoff rather than retry
Remove unused methods from Sleeper (it's OK, it's @Private).
Remove notion of startTime from Sleeper handling (it is unused).
Allow passing in how long to sleep so can maintain externally.
In HRS, use a RetryCounter to calculate backoff sleep time for when
reportForDuty is failing against a struggling Master.
2018-09-25 11:31:39 -07:00
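
The backoff idea in HBASE-21164 can be sketched roughly as below. This is a minimal illustration only, not the HBase RetryCounter or Sleeper code: the class name BackoffSketch, its constructor parameters, and the doubling-with-cap policy are all assumptions made for the example. The sleep time is computed outside the sleeper and would then be passed in, as the commit message describes.

```java
/**
 * Hypothetical helper illustrating capped exponential backoff.
 * Not the HBase RetryCounter API; names and policy are assumptions.
 */
final class BackoffSketch {
  private final long baseSleepMs;
  private final long maxSleepMs;
  private int attempts = 0;

  BackoffSketch(long baseSleepMs, long maxSleepMs) {
    if (baseSleepMs <= 0 || maxSleepMs < baseSleepMs) {
      throw new IllegalArgumentException("need 0 < baseSleepMs <= maxSleepMs");
    }
    this.baseSleepMs = baseSleepMs;
    this.maxSleepMs = maxSleepMs;
  }

  /** Returns baseSleepMs * 2^attempts, capped at maxSleepMs, then counts the attempt. */
  long nextSleepMs() {
    int shift = Math.min(attempts, 62);
    if (attempts < Integer.MAX_VALUE) {
      attempts++;
    }
    // Compare against the shifted cap first so the multiplication cannot overflow.
    if (baseSleepMs > (maxSleepMs >> shift)) {
      return maxSleepMs;
    }
    return baseSleepMs << shift;
  }

  /** Call after a successful report so the next failure starts from the base again. */
  void reset() {
    attempts = 0;
  }

  public static void main(String[] args) {
    BackoffSketch backoff = new BackoffSketch(100, 1000);
    for (int i = 0; i < 6; i++) {
      System.out.println("attempt " + i + " would sleep " + backoff.nextSleepMs() + " ms");
    }
  }
}
```

A caller in the region server would ask for nextSleepMs() each time the report fails and hand that value to whatever does the sleeping, so a struggling Master sees progressively fewer retries.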
Michael Stack
0d008b4792
HBASE-21223 [amv2] Remove abort_procedure from shell
Signed-off-by: Balazs Meszaros <balazs.meszaros@cloudera.com>
2018-09-25 10:07:32 -07:00
Andrew Purtell
101205345b
Amend HBASE-20704 Sometimes some compacted storefiles are not archived on region close
Forward port small logging improvements from branch-1 version of this change.
2018-09-21 16:12:51 -07:00
Michael Stack
a22aec1dad
HBASE-21214 [hbck2] setTableState just sets hbase:meta state, not in-memory state 2018-09-21 16:03:58 -07:00
Andrew Purtell
1e05c9f3b5
HBASE-21203 TestZKMainServer#testCommandLineWorks won't pass with default 4lw whitelist
Recent versions of ZooKeeper whitelist the so-called 4-letter word admin
commands, and 'stat' is not in the default whitelist. Set system property
zookeeper.4lw.commands.whitelist=* in MiniZooKeeperCluster#setupTestEnv
as we do not need to restrict 4-letter commands for unit tests.
2018-09-21 15:37:12 -07:00
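
The setting described in HBASE-21203 amounts to something like the sketch below. The property name and the '*' value come from the commit message; the class and method names (FourLetterWordTestEnv, allowAllFourLetterWords) are hypothetical and are not the actual MiniZooKeeperCluster code. The property presumably has to be set before the embedded ZooKeeper server starts, since it is a JVM-wide system property.

```java
/**
 * Hypothetical test-setup helper; not the MiniZooKeeperCluster source.
 */
final class FourLetterWordTestEnv {
  /** Property name as given in the commit message. */
  static final String WHITELIST_KEY = "zookeeper.4lw.commands.whitelist";

  /** Allows every 4-letter word command (including 'stat') for this JVM. */
  static void allowAllFourLetterWords() {
    System.setProperty(WHITELIST_KEY, "*");
  }

  public static void main(String[] args) {
    allowAllFourLetterWords();
    System.out.println(WHITELIST_KEY + "=" + System.getProperty(WHITELIST_KEY));
  }
}
```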
openinx
5a73a1ab25 HBASE-21206 Scan with batch size may return incomplete cells 2018-09-20 22:20:02 +08:00
tianjingyun
c5af7b654b HBASE-21204 NPE when scan raw DELETE_FAMILY_VERSION and codec is not set
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-09-20 06:59:43 -07:00
Umesh Agashe
e6c7ed34e0
HBASE-21023 Added bypassProcedure() API to HbckService 2018-09-19 15:01:29 -07:00
Michael Stack
37cc07a772
HBASE-21156 [hbck2] Queue an assign of hbase:meta and bulk assign/unassign
Adds 'raw' assigns and unassigns methods to Hbck Service.

Fixes HbckService so it works when cluster is Kerberized.
2018-09-19 09:02:43 -07:00
Vasudevan
27b772ddc6 HBASE-21102 ServerCrashProcedure should select target server where no
other replicas exist for the current region (Ram)
2018-09-17 22:36:50 +05:30
Michael Stack
39e0b8515f HBASE-21191 Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared).
Add a check for hbase:meta being online before we go to read it.
If not online, move into a holding-pattern until rectified, probably
by external operator.

Incorporates bulk of patch made by Allan Yang over on HBASE-21035.

M hbase-common/src/main/java/org/apache/hadoop/hbase/util/RetryCounterFactory.java

 Add a Constructor for case where retries are forever.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 Move stuff around so that the first hbase:meta read is the AM#loadMeta.
 Previously, checking table state and/or favored nodes could end up
 trying to read a meta that was not onlined holding up master startup.
 Do similar for the namespace table. Adds new methods isMeta and
 isNamespace which check that the regions/tables are online.. if not,
 we wait logging with a back-off that assigns need to be run.

Signed-off-by: Allan Yang <allan163@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2018-09-16 21:12:59 -07:00
Francis Liu
a925a4ce16 HBASE-20704 Sometimes some compacted storefiles are not archived on region close 2018-09-16 18:38:03 -07:00
Ted Yu
842e0c974d HBASE-21097 Flush pressure assertion may fail in testFlushThroughputTuning
Amending-Author: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2018-09-15 18:39:42 +08:00
Toshihiro Suzuki
198aa90665 HBASE-21182 Failed to execute start-hbase.sh 2018-09-14 22:21:47 +09:00
Umesh Agashe
589c1e4078
HBASE-20941 Created and implemented HbckService in master
Added API setTableStateInMeta() to update table state only in Meta. This will be used by hbck2 tool.
2018-09-12 21:31:13 -07:00
Sean Busbey
2479282fb2 HBASE-21189 flaky job should gather machine stats
Signed-off-by: Michael Stack <stack@apache.org>
(cherry picked from commit 5d14c1af65c02f4e87059337c35e4431505de91c)
2018-09-12 23:09:13 -05:00
Michael Stack
487f713c63 HBASE-21190 Log files and count of entries in each as we load from the MasterProcWAL store 2018-09-12 10:19:46 -07:00
Mike Drob
d81e806718 HBASE-21168 Insecure Randomness in BloomFilterUtil
Flagged by Fortify static analysis

Signed-off-by: Andrew Purtell <apurtell@apache.org>
Signed-off-by: Mingliang Liu <liuml07@apache.org>
2018-09-12 09:52:41 -05:00
Duo Zhang
2da6dbe563 HBASE-21172 Reimplement the retry backoff logic for ReopenTableRegionsProcedure 2018-09-12 16:01:55 +08:00
Guangxu Cheng
ea4194039e HBASE-21179 Fix the number of actions in responseTooSlow log 2018-09-12 10:44:05 +08:00
Guangxu Cheng
15842109c0 HBASE-21174 [REST] Failed to parse empty qualifier in TableResource#getScanResource
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-09-12 10:41:47 +08:00
David Manning
75a7643b11 Backport "HBASE-21126 Add ability for HBase Canary to ignore a configurable number of ZooKeeper down nodes" to branch-2.1
Signed-off-by: Josh Elser <elserj@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2018-09-12 10:01:28 +08:00
krish.dey
63ef89bff7 HBASE-21125 Backport 'HBASE-20942 Improve RpcServer TRACE logging' to branch-2.1
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2018-09-12 09:59:28 +08:00
Duo Zhang
b9d74f89ff Revert "HBASE-20942 Fix ArrayIndexOutOfBoundsException for RpcServer TRACE logging"
This reverts commit 69756da503f4b1bc74462f67c65124cab5d045fb.
2018-09-12 09:55:46 +08:00
krish.dey
69756da503 HBASE-20942 Fix ArrayIndexOutOfBoundsException for RpcServer TRACE logging
Also makes the trace log message length configurable.

Signed-off-by: Josh Elser <elserj@apache.org>
2018-09-12 09:44:22 +08:00