Commit Graph

6680 Commits

Author SHA1 Message Date
Michael Stack acbdb86bb4 HBASE-20169 NPE when calling HBTU.shutdownMiniCluster
Adds a prepare step to RecoverMetaProcedure in which we test for
cluster up and master being up. If not up, we fail the run.

Modified hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileCleaner.java
Modified hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChunkCreator.java
 Minor log cleanup.

Modified hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RecoverMetaProcedure.java
 Add pepare step.

Modified hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerMetrics.java
 Debug for the failing test....

Added hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestRecoverMetaProcedure.java
 Test the prepare step goes down if master or cluster are down.
2018-03-20 13:11:57 -07:00
Michael Stack 74c28bdf44 HBASE-20232 [LOGGING] Formatting around close and flush 2018-03-20 10:34:49 -07:00
tedyu 2a3f4a0a4e HBASE-20214 Review of RegionLocationFinder Class - revert due to the pending removal of commons-collections4 dependency 2018-03-19 16:52:27 -07:00
tedyu df5de33a02 HBASE-20196 Maintain all regions with same size in memstore flusher 2018-03-19 08:10:39 -07:00
zhangduo 67f013430c HBASE-20206 WALEntryStream should not switch WAL file silently 2018-03-19 11:40:11 +08:00
zhangduo 00095a2ef9 Revert "HBASE-19665 Add table based replication peers/queues storage back"
This reverts commit 31978c31bb.

 Conflicts:
	hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/TableReplicationStorageBase.java
2018-03-17 20:25:27 +08:00
BELUGA BEHR 104f58701e HBASE-20214 Review of RegionLocationFinder Class 2018-03-16 16:09:40 -07:00
Michael Stack bedf849d83 HBASE-20213 [LOGGING] Aligning formatting and logging less (compactions,
in-memory compactions)

Log less. Log using same format as used elsewhere in log.

Align logs in HFileArchiver with how we format elsewhere. Removed
redundant 'region' qualifiers, tried to tighten up the emissions so
easier to read the long lines.

M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChunkCreator.java
 Add a label for each of the chunkcreators we make (I was confused by
two chunk creater stats emissions in log file -- didn't know that one
was for data and the other index).

M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java
 Formatting. Log less.

M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreCompactionStrategy.java
 Make the emissions in here trace-level. When more than a few regions,
log is filled with this stuff.
2018-03-16 13:16:49 -07:00
Zach York aaa90d8069 [HBASE-20141] Fix TooManyFiles exception when RefreshingChannels
HBASE-19435 implements a fix for reopening file channels when they are unnexpected closed
to avoid disabling the BucketCache. However, it was missed that the the channels might not
actually be completely closed (the write or read channel might still be open
(see https://docs.oracle.com/javase/7/docs/api/java/nio/channels/ClosedChannelException.html)
This commit closes any open channels before creating a new channel.
2018-03-16 10:51:39 -07:00
Michael Stack 13f3ba3cee HBASE-20202 [AMv2] Don't move region if its a split parent or offlined
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/DoNotRetryRegionException.java
M hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/MergeRegionException.java
 Allow passing cause to Constructor.

M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
 Add prepare step to move procedure.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MergeTableRegionsProcedure.java
 Add check that regions to merge are actually online to the Constructor
so we can fail fast if they are offline

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
 Add prepare step. Check regions and context and skip move if not right.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java
 Add check parent region is online to constructor.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AbstractStateMachineTableProcedure.java
 Add generic check region is online utility function for use by subclasses.

M hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMove.java
 Add test that we fail if we try to move an offlined region.
2018-03-16 09:35:33 -07:00
Michael Stack c200bf8f78
HBASE-20190 Fix default for MIGRATE_TABLE_STATE_FROM_ZK_KEY 2018-03-15 10:34:02 -07:00
Chia-Ping Tsai 4f2133ee32 HBASE-20119 Introduce a pojo class to carry coprocessor information in order to make TableDescriptorBuilder accept multiple cp at once
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Michael Stack <stack@apache.org>
2018-03-16 01:21:38 +08:00
Ashish Singhi 82483fad7c HBASE-20146 Addendum Regions are stuck while opening when WAL is disabled
Signed-off-by: zhangduo <zhangduo@apache.org>
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2018-03-15 10:13:30 +08:00
Michael Stack d0e55429b3 HBASE-20178 [AMv2] Throw exception if hostile environment
Add Fail-Fast to Procedures by throwing exception out of Procedure
constructor so if move but table is disabled or if master is going
down, etc., we can give notice before the procedure is scheduled.
Will help guard against scheduling Procedures that will have a hard
time succeeding; e.g. a move when table is offline.

Also fixed bug around table state where we presumed ENABLED though no
entry in hbase:meta (we were using this mechanism for whether a table
existed or not).

M hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMove.java
 Test stolen from HBASE-20131

M hbase-client/src/main/java/org/apache/hadoop/hbase/client/TableState.java
 Add convenience isEnabled/isDisabled

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 Promote assert state to throw exception.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
 Add isClusterUp

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 Move constructor now throws exception
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MergeTableRegionsProcedure.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DisableTableProcedure.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyTableProcedure.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RestoreSnapshotProcedure.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/TruncateTableProcedure.java
 Do environment check at construction and fail-fast if hostile.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AbstractStateMachineTableProcedure.java
 Add preflightCheck utility method.

M hbase-client/src/main/java/org/apache/hadoop/hbase/MetaTableAccessor.java
 Removed setting time setting table state; broke when using other than
 default environment edge masked by presumption that no state meant
 active.
2018-03-14 14:56:22 -07:00
Mike Drob f63a7ff635 HBASE-20180 Avoid Class::newInstance 2018-03-14 13:15:05 -05:00
Apekshit Sharma 84ee32c723 HBASE-20185 Fix ACL check for MasterRpcServices#execProcedure 2018-03-14 19:02:19 +05:30
Chance Li 4f54a66782 HBASE-19389 Limit concurrency of put with dense (hundreds) columns to prevent write handler exhausted
Signed-off-by: Yu Li <liyu@apache.org>
2018-03-14 18:38:33 +08:00
Yu Li 9a26af37b9 Revert "HBASE-19389 Limit concurrency of put with dense (hundreds) columns to prevent write handler exhausted"
This reverts commit 98ac4f12b5.
2018-03-14 18:38:07 +08:00
Yu Li 98ac4f12b5 HBASE-19389 Limit concurrency of put with dense (hundreds) columns to prevent write handler exhausted
Signed-off-by: Yu Li <liyu@apache.org>
2018-03-14 18:25:17 +08:00
huzheng 31978c31bb HBASE-19665 Add table based replication peers/queues storage back 2018-03-14 15:42:16 +08:00
zhangduo b7308ee01c HBASE-20117 Cleanup the unused replication barriers in meta table 2018-03-14 12:08:15 +08:00
Mike Grimes b16e03c130 HBASE-17165 Make use of retry setting in LoadIncrementalHFiles & fix test 2018-03-13 15:00:36 -07:00
Sahil Aggarwal 1b66444846 HBASE-19075: Fix the 'tasks' table on master info page to not scroll up on clicking the tab 2018-03-13 14:26:16 -07:00
Michael Stack 72c3d27bf6 HBASE-20173 [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
Allow that DisableTableProcedue can grab a region lock before
ServerCrashProcedure can. Cater to this cricumstance where SCP
was not unable to make progress by running the search for RIT
against the crashed server a second time, post creation of all
crashed-server assignemnts. The second run will uncover such as
the above DisableTableProcedure unassign and will interrupt its
suspend allowing both procedures to make progress.

M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
 Add new procedure step post-assigns that reruns the RIT finder method.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 Make this important log more specific as to what is going on.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
 Better explanation as to what is going on.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
 Add extra step and run handleRIT a second time after we've queued up
 all SCP assigns. Also fix a but. SCP was adding an assign of a RIT
 that was actually trying to unassign (made the deadlock more likely).
2018-03-13 06:04:36 -07:00
BELUGA BEHR f30dfc69bb HBASE-19449 Minor logging change in HFileArchiver
Signed-off-by: Apekshit Sharma <appy@apache.org>
2018-03-12 22:01:56 +05:30
zhangduo 6060d3ba56 HBASE-20167 Optimize the implementation of ReplicationSourceWALReader 2018-03-12 15:14:16 +08:00
Umesh Agashe 45bbee4905 HBASE-20120 Removed unused classes/ java files from hbase-server
deleted:    hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/NoOpHeapMemoryTuner.java
deleted:    hbase-server/src/main/java/org/apache/hadoop/hbase/replication/BaseWALEntryFilter.java
deleted:    hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSMapRUtils.java
deleted:    hbase-server/src/main/java/org/apache/hadoop/hbase/util/ProtoUtil.java
Signed-off-by: Sean Busbey <busbey@apache.org>
Signed-off-by: tedyu <yuzhihong@gmail.com>
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2018-03-10 13:08:38 +08:00
Guangxu Cheng 70240f9732 HBASE-20132 Change the "KV" to "Cell" for web UI
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2018-03-10 12:27:59 +08:00
zhangduo dd6f4525e7 HBASE-20148 Make serial replication as a option for a peer instead of a table 2018-03-10 09:04:44 +08:00
Umesh Agashe 974200fca1 HBASE-20024 Fixed flakyness of TestMergeTableRegionsProcedure
We assumed that we can run for loop from 0 to lastStep sequentially. MergeTableRegionProcedure skips step 2. So, when i is 0 the procedure is already at step 3.
Added a method StateMachineProcedure#getCurrentStateId that can be used from test code only.
2018-03-09 12:45:39 -08:00
Ashish Singhi 06550bc93b HBASE-20146 Regions are stuck while opening when WAL is disabled
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-03-09 21:10:16 +08:00
zhangduo 756cccecff HBASE-19598 Addendum fix typo 2018-03-09 15:37:22 +08:00
zhangduo 04798d6747 HBASE-19598 Addendum increase sync wait time 2018-03-09 15:34:24 +08:00
zhangduo 033485dff3 HBASE-19598 Fix TestAssignmentManagerMetrics flaky test 2018-03-09 11:47:55 +08:00
zhangduo a513678a79 HBASE-20160 TestRestartCluster.testRetainAssignmentOnRestart uses the wrong condition to decide whether the assignment is finished 2018-03-09 11:08:44 +08:00
zhangduo a03d09abd7 HBASE-20144 The shutdown of master will hang if there are no live region server 2018-03-08 15:05:57 +08:00
Michael Stack 37d91cdfbb Revert "HBASE-20137 TestRSGroups is flakey"
Revert. Fix is not right.

This reverts commit 6d1740d498.
2018-03-07 09:26:21 -08:00
zhangduo 6b77786dfc HBASE-20125 Add UT for serial replication after region split and merge 2018-03-07 14:52:23 +08:00
Michael Stack 1f5e93a8f8 HBASE-20137 TestRSGroups is flakey
On failed RPC we expire the server and suspend expecting the
resultant ServerCrashProcedure to wake us back up again. In tests,
TestRSGroup hung because it failed to schedule a server expiration
because the server was already expired undergoing processing (the
test was shutting down). Deal with this case by having expire
servers return false if unable to expire. Callers will then know
where a ServerCrashProcedure has been scheduled or not.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
  Have expireServer return true if successful.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 The log that included an exception whose message was the current
procedure as a String totally baffled me. Make it more obvious what
exception is.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
 If failed expire of a server, wake our procedure -- do not suspend --
and presume ok to move region to CLOSED state (because going down or
concurrent crashed server processing ongoing).
2018-03-06 10:55:40 -08:00
zhangduo 1384da7137 HBASE-20129 Add UT for serial replication checker 2018-03-06 16:43:01 +08:00
Josh Elser 4a4c012049 HBASE-18135 Implement mechanism for RegionServers to report file archival for space quotas
This de-couples the snapshot size calculation from the
SpaceQuotaObserverChore into another API which both the periodically
invoked Master chore and the Master service endpoint can invoke. This
allows for multiple sources of snapshot size to reported (from the
multiple sources we have in HBase).

When a file is archived, snapshot sizes can be more quickly realized and
the Master can still perform periodical computations of the total
snapshot size to account for any delayed/missing/lost file archival RPCs.

Signed-off-by: Ted Yu <yuzhihong@gmail.com>
2018-03-05 17:32:42 -05:00
zhangduo b7b8683925 HBASE-20115 Reimplement serial replication based on the new replication storage layer 2018-03-05 20:25:25 +08:00
haxiaolin 1d25b60831 HBASE-20114 Fix IllegalFormatConversionException in rsgroup.jsp
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-03-02 09:00:28 -08:00
Chia-Ping Tsai db131be39a HBASE-19437 Batch operation can't handle the null result for Append/Increment
Signed-off-by: anoopsamjohn <anoopsamjohn@gmail.com>
Signed-off-by: Michael Stack <stack@apache.org>
2018-03-02 23:31:56 +08:00
Sean Busbey 2a65066b35 HBASE-20070 refactor website generation
* rely on git plumbing commands when checking if we've built the site for a particular commit already
* switch to forcing '-e' for bash
* add command line switches for: path to hbase, working directory, and publishing
* only export JAVA/MAVEN HOME if they aren't already set.
* add some docs about assumptions
* Update javadoc plugin to consistently be version 3.0.0
* avoid duplicative site invocations on reactor modules
* update use of cp command so it works both on linux and mac
* manually skip enforcer plugin during build
* still doing install of all jars due to MJAVADOC-490, but then skip rebuilding during aggregate reports.
* avoid the pager on git-diff by teeing to a log file, which also helps later reviewing in the case of big changesets.

Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Misty Stanley-Jones <misty@apache.org>
2018-03-02 09:25:10 -06:00
huzheng 99d3edfc82 HBASE-20050 Reimplement updateReplicationPositions logic in serial replication based on the newly introduced replication storage layer
Signed-off-by: zhangduo <zhangduo@apache.org>
2018-03-02 15:53:14 +08:00
Umesh Agashe e86ffedb45 HBASE-20055 Removed declaration of un-thrown exceptions and unused setRegionStateBackToOpen() from MergeTableRegionsProcedure
Plus some minor cleanup.
2018-03-01 08:46:23 -08:00
Chia-Ping Tsai 776eb5d9cb HBASE-20093 (addendum) remove unused import of ServerLoad
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-03-01 23:47:36 +08:00
Michael Stack 0732ef5ebf Revert "HBASE-19598 Fix TestAssignmentManagerMetrics flaky test"
Pushed prematurely. Revert.

This reverts commit 016cf0c64b.
2018-02-28 23:39:54 -08:00
Michael Stack 016cf0c64b HBASE-19598 Fix TestAssignmentManagerMetrics flaky test
Master never left waitForMasterActive because it never checked state of
the clusterUp flag. The test here was aborting regionserver and then
just exiting. The minihbasecluster shutdown sets the cluster down flag
but we were never looking at it so Master thread was staying up.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
The tableOnMaster check in waitForMasterActive looks wrong. It was
making it so a 'normal' Master was getting stuck in here. This is not
the place to worry about tablesOnMaster. That is for the balancer to be
concerned with. There is a problem with Master hosting
system-tables-only. After further study, Master can carry regions like a
regionserver but making it so it carries system tables only is tricky
given meta assign happens ahead of all others which means that the
Master needs to have checked-in as a regionserver super early... It
needs work. Punted for now.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
Mostly renaming so lists and maps of region infos have same name as they
have elsewhere in code base and cleaning up confusion that may arise
when we talk of servers-for-system-tables....It is talking about
something else in the code changes here that is other than the normal
understanding. It is about filtering regionservers by their version
numbers so we favor regions with higher version numbers. Needs to go
back up into the balancer.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
It was possible for the Master to be given regions if no regionservers
available (as per the failing unit test in this case).

M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
Minor reordering moving the waitForMasterActive later in the initialize
and wrapping each test in a check if we are to keep looping (which
checks cluster status flag).

M hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerMetrics.java
This was an old test from the days when Master carried system tables.
Updated test and fixed metrics. Metrics count the hbase:meta along with
the userspace region so upped expected numbers (previously the
hbase:meta was hosted on the master so metrics were not incremented).

M hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestRegionsOnMasterOptions.java
I took a look at this test again but nope, needs a load of work still to
make it pass.

M hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
Stop being so whiney.
2018-02-28 22:25:24 -08:00