Allow that DisableTableProcedue can grab a region lock before
ServerCrashProcedure can. Cater to this cricumstance where SCP
was not unable to make progress by running the search for RIT
against the crashed server a second time, post creation of all
crashed-server assignemnts. The second run will uncover such as
the above DisableTableProcedure unassign and will interrupt its
suspend allowing both procedures to make progress.
M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
Add new procedure step post-assigns that reruns the RIT finder method.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
Make this important log more specific as to what is going on.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
Better explanation as to what is going on.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
Add extra step and run handleRIT a second time after we've queued up
all SCP assigns. Also fix a but. SCP was adding an assign of a RIT
that was actually trying to unassign (made the deadlock more likely).
We assumed that we can run for loop from 0 to lastStep sequentially. MergeTableRegionProcedure skips step 2. So, when i is 0 the procedure is already at step 3.
Added a method StateMachineProcedure#getCurrentStateId that can be used from test code only.
On failed RPC we expire the server and suspend expecting the
resultant ServerCrashProcedure to wake us back up again. In tests,
TestRSGroup hung because it failed to schedule a server expiration
because the server was already expired undergoing processing (the
test was shutting down). Deal with this case by having expire
servers return false if unable to expire. Callers will then know
where a ServerCrashProcedure has been scheduled or not.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
Have expireServer return true if successful.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
The log that included an exception whose message was the current
procedure as a String totally baffled me. Make it more obvious what
exception is.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
If failed expire of a server, wake our procedure -- do not suspend --
and presume ok to move region to CLOSED state (because going down or
concurrent crashed server processing ongoing).
* rely on git plumbing commands when checking if we've built the site for a particular commit already
* switch to forcing '-e' for bash
* add command line switches for: path to hbase, working directory, and publishing
* only export JAVA/MAVEN HOME if they aren't already set.
* add some docs about assumptions
* Update javadoc plugin to consistently be version 3.0.0
* avoid duplicative site invocations on reactor modules
* update use of cp command so it works both on linux and mac
* manually skip enforcer plugin during build
* still doing install of all jars due to MJAVADOC-490, but then skip rebuilding during aggregate reports.
* avoid the pager on git-diff by teeing to a log file, which also helps later reviewing in the case of big changesets.
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Misty Stanley-Jones <misty@apache.org>
Conflicts:
hbase-backup/pom.xml
hbase-spark-it/pom.xml
Revert of the revert, i.e. reapply. Thought this had broken the build
but it was a bad nightly. Putting it back.
Revert "Revert "HBASE-20069 fix existing findbugs errors in hbase-server; ADDENDUM Address review""
This reverts commit 07eae00ec1.
Reapplication of a patch temporarily removed...
I thought this was causing issue but now I don't think it the culprit.
Revert "Revert "HBASE-20092 Fix TestRegionMetrics#testRegionMetrics""
This reverts commit 367d316781.
Allow OPEN as a possible state when update region transition state.
Usually state is OPENING but if crash before finish step is completed,
on replay, master may have read that the state is OPEN from meta table
and so will think it open... When we replay the procedure finish, allow
that the region is already OPEN.
Fix MoveRandomRegionOfTableAction. It depended on old AM behavior.
Make it do explicit move as is required in AMv3; w/o it, it was just
closing region causing test to fail.
Fix pom so hadoop3 profile specifies a different netty3 version.
Bunch of logging format change that came of trying trying to read
the spew from this test.
This reverts commit bc080e7500.
Conflicts:
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
patch reverted changes that happened in parallel without explanation. see jira.
- Added ADMIN permission check for following rpc calls:
normalize, setNormalizerRunning, runCatalogScan, enableCatalogJanitor, runCleanerChore,
setCleanerChoreRunning, execMasterService, execProcedure, execProcedureWithRet
- Moved authorizationEnabled check to start of AccessChecker's functions. Currently, and IDK why,
we call authManager.authorize() first and then discard its result if authorizationEnabled is false. Weird.
Negative Remaining KVs and progress percent greater than 100 is because CompactionProgress#totalCompactingKVs is sometimes less than CompactionProgress#currentCompactedKVs.
Changes add a getter to CompactionProgress#totalCompactingKVs and from inside getter warning is logged. currentCompactedKVs are return when totalCompactingKVs are less than current.
Signed-off-by: Michael Stack <stack@apache.org>
Revert "Revert "HBASE-2004 TestClientClusterStatus is flakey""
This is a revert of a revert, i.e. a reapplication, just so I can fix
the JIRA number.
This reverts commit 6796b8e21f.
LocalHBaseCluster forces random port assignment for sake of concurrent unit test
execution friendliness, but we still need a positive test for RPC and info port
assignment.
The 1.x functionality of Master DDL operations is that "post" observer hooks
are only invoked when the DDL action was successful. With the async-ness of
ProcV2, we find ourselves in a case where the post-hook may be invoked before
the Procedure runs and fails. We need to introduce some blocking to wait and
see if the Procedure is going to fail on a precondition before invoking the hook.
Signed-off-by: Michael Stack <stack@apache.org>
Remove assert in splittableregionprocedure. It was in the prepare.
Was causing fail in legit case where a region split follows a
table split BEFORE the parent has been GC'd. The region split
finds the parent in SPLIT state which is right. The assert was
having us fail. No need.
Also disabled TestHTrace since not supported in 2.0.0 and flakey.
Only call server.checkIfShouldMoveSystemRegionAsync if a node has been
added. Do not call it if only one regionserver in cluster. Make it
so ServerCrashProcedure runs before it. Add logging if
server.checkIfShouldMoveSystemRegionAsync was responsible for
MOVE (Previous was a mystery when it cut in).
Previous we'd call it when there was a nodeChildrenChanged. These
happen before nodeDeleted. If a server crashed,
checkIfShouldMoveSystemRegionAsync could run first, find the
server that had not yet registered as crashed, find system
tables on it and then try to move them. It would fail because
server would not respond to RPC. The region move would then
be waiting on the servercrashprocedure to wake it up when
done processing but this move had locked the region so
SCP couldn't run....
Serializing, if appropriate, write the hbase-1.x version of the
Comparator to the hfile trailer so hbase-1.x files can read hbase-2.x
hfiles (they are the same format).
This patch adds mirroring of table state out to zookeeper. HBase-1.x
clients look for table state in zookeeper, not in hbase:meta where
hbase-2.x maintains table state.
The patch also moves and refactors the 'migration' code that was put in
place by HBASE-13032.
D hbase-client/src/main/java/org/apache/hadoop/hbase/CoordinatedStateException.java
Unused.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Move table state migration code from Master startup out to
TableStateManager where it belongs. Also start
MirroringTableStateManager dependent on config.
A hbase-server/src/main/java/org/apache/hadoop/hbase/master/MirroringTableStateManager.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/TableStateManager.java
Move migration from zookeeper of table state in here. Also plumb in
mechanism so subclass can get a chance to look at table state as we do
the startup fixup full-table scan of meta.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
Bug-fix. Now we create regions in CLOSED state but we fail to check
table state; were presuming table always enabled. Meant on startup
there'd be an unassigned region that never got assigned.
A hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMirroringTableStateManager.java
Test migration and mirroring.
Value of 'htd' is null as it is initialized in the constructor but when the object is deserialized its null. Got rid of member variable htd and made it local to method.
* Added a comment in MergeTableRegionsProcedure and SplitTableRegionProcedure explaining specific rollbacks has side effect that AssignProcedure/s are submitted asynchronously and those procedures may continue to execute even after rollback() is done.
* Updated comments in tests with correct rollback state to abort
* Added overloaded method MasterProcedureTestingUtility#testRollbackAndDoubleExecution which takes additional argument for waiting on all procedures to finish before asserting conditions
* Updated TestMergeTableRegionsProcedure#testRollbackAndDoubleExecution and TestSplitTableRegionProcedure#testRollbackAndDoubleExecution to use newly added method
Signed-off-by: Michael Stack <stack@apache.org>
Fix two issues:
# Meta Replicas can all be assigned to the same server. This
will call the test to hang when we do our kill of the server
hosting meta because there'll be no replicas to read from
as test intends. Check is to look for this condition on
startup and adjust if we come across it. Replicas cross-cut
assignment. They need work.
# Other issue was shutdown. The master started toward the
end of the test may not have come up fully by the time
shutdown is called. We could be stuck assigning the
meta replicas. Have shutdown shutdown the procedure
executor engine.
There is other cleanup and notes in the below.
M HMaster
Remove the silly stops in startup now we have real
means of shutting down Master during init.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterMetaBootstrap.java
This replica stuff was doing stuff it shouldn't be doing
like setting core Master state flags. It may have made
sense once but now meta is assigned by a Pv2 Procedure
so the flag setting in here is meddlesome. Clear out
methods no longer needed.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
Remove unused methods.
Changes local variable names so they align w/ our naming elsewhere in
code base.
M hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestMetaWithReplicas.java
Check for all replicas on the one server.
On Master@shutdown, close the shared Master connection to kill any
ongoing RPCs by hosted clients.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Call close ont the Master shared clusterconnection to kill any ongoing
rpcs.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
Remove guts of close; we were closing the Masters connection....not
our responsibility.
Added unit test written by Duo Zhang which demonstrates the case where
Master will not go down.
Signed-off-by: zhangduo <zhangduo@apache.org>
Kill backup master first
Add some cleanup around NamespaceManager
Shorten the timeout waiting on namespace manager as workaround
until we have better soln for interrupting ongoing client rpcs.
Do it in general for all tests.
Signed-off-by: zhangduo <zhangduo@apache.org>
Rename the PE Worker threads.
Send an interrupt if worker taking a long time to go down
(it may be RPC'ing out to a dead server, retrying so
interrupt). Also join on the ProcedureExecutor shutting down.
This will make problems shutting down more obvious.
Disable TestRegionsOnMasterOptions. Master carrying Regions is broke.
Set the ProcedureExcecutor worker threads as daemon.
Ditto for the timeout thread.
Remove hack from TestRegionsOnMasterOptions that was
put in place because the test would not go down.
Set the ProcedureExcecutor worker threads as daemon.
Ditto for the timeout thread.
Remove hack from TestRegionsOnMasterOptions that was
put in place because the test would not go down.
M dev-support/make_rc.sh
Disable checkstyle building site. Its an issue being fixed over in HBASE-19780
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
The clusterid was being set into the process only after the
regionserver registers with the Master. That can be too late for some
test clients in particular. e.g. TestZKAsyncRegistry needs it as soon
as it goes to run which could be before Master had called its run
method which is regionserver run method which then calls back to the
master to register itself... and only then do we set the clusterid.
HBASE-19694 changed start order which made it so this test failed.
Setting the clusterid right after we set it in zk makes the test pass.
Another change was that backup masters were not going down on stop.
Backup masters were sleeping for the default zk period which is 90
seconds. They were not being woken up to check for stop. On stop
master now tells active master manager.
M hbase-server/src/test/java/org/apache/hadoop/hbase/TestJMXConnectorServer.java
Prevent creation of acl table. Messes up our being able to go down
promptly.
M hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestRegionsOnMasterOptions.java
M hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestMultiParallel.java
M hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerReadRequestMetrics.java
Disabled for now because it wants to run with regions on the Master...
currently broke!
M hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestZKAsyncRegistry.java
Add a bit of debugging.
M hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDLSAsyncFSWAL.java
Disabled. Fails 40% of the time.
M hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDLSFSHLog.java
Disabled. Fails 33% of the time.
Disabled stochastic load balancer for favored nodes because it fails on
occasion and we are not doing favored nodes in branch-2.
Become active Master before calling the super class's run method. Have
the wait-on-becoming-active-Master be in-line rather than off in a
background thread (i.e. undo running thread in startActiveMasterManager)
Purge the fragile HBASE-16367 hackery that attempted to fix this issue
previously by adding a latch to try and hold up superclass RegionServer
until cluster id set by subclass Master.
In some situations, Runtime.getRuntime().getAvailableProcessors()
may return 0 which would result in calculatePoolSize returning 0
which will trigger an exception. Guard against this case.
Signed-off-by: Reid Chan <reidddchan@outlook.com>
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
* Use HashSet instead of List for a variable which is only used for lookups
* Remove logging code guards in favor of slf4j parameters
* Use CollectionsUtils.isEmpty() consistently
* Small check-style fixes
In order to avoid additional user settings. If no MSLAB is requested the index is going to be CellArrayMap
Signed-off-by: Anastasia Braginsky <anastas@yahoo-inc.com>
Signed-off-by: Michael Stack <stack@apache.org>
Some manual cleanup of changing package names in pom files and getting
rid of the no-longer-needed netty system property.
This commit will break compilation, package renames in source code are
done in follow-on commits using straightforward find and replace.
's/org.apache.hadoop.hbase.shaded.com.google/org.apache.hbase.thirdparty.com.google/'
's/org.apache.hadoop.hbase.shaded.io.netty/org.apache.hbase.thirdparty.io.netty/'
Removed unused:
<name>hbase.fs.tmp.dir</name>
Added hbase.master.loadbalance.bytable
Edit of description text. Moved stuff around to put configs beside each
other.
M hbase-server/src/main/java/org/apache/hadoop/hbase/util/ServerCommandLine.java
Emit some hbase configs in log on startup.
Signed-off-by: Michael Stack <stack@apache.org>
Changes:
- replaced commons-logging to slf4j everywhere
- log.XXX(Throwable) calls were replaced with log.XXX(t.toString(), t)
- log.XXX(Object) calls were replaced with log.XXX(Objects.toString(obj))
- log.fatal() calls were replaced with log.error(HBaseMarkers.FATAL, ...)
- programmatic log4j configuration was removed from the unit test
This commit does not affect the current logging configurations, because log4j
is still on the classpath. slf4j-log4j12 binds log4j to slf4j.
Signed-off-by: Michael Stack <stack@apache.org>
Implement new WALEntrySinkFilter (as opposed to WALEntryFilter) and
specify the implmentation (with a no-param constructor) in config
using property hbase.replication.sink.walentrysinkfilter
Signed-off-by: wolfgang hoschek whoscheck@cloudera.com
Reenables the test. Adds facility to HBaseTestingUtility so
you can pass in ports a restarted cluster should use. This
is needed so retention of region placement, on which this
test depends, can come trigger (this is why it was broke
on AMv2 commit... region placement retention is done
different in AMv2).
Added new bulk assign createRoundRobinAssignProcedure to complement
the existing createAssignProcedure. The former asks the balancer for
target servers to set into the created AssignProcedures. The latter
sets no target server into AssignProcedure. When no target server
is specified, we make effort at assign-time at trying to deploy the
region to its old location if there was one.
The new round robin assign procedure creator does not do this. Use
the new round robin method on table create or reenabling offline
regions. Use the old assign in ServerCrashProcedure or in
EnableTable so there is a chance we retain locality.
Bulk preassigning passing all to-be-assigned to the balancer in one
go is good for ensuring good distribution especially when read
replicas in the mix.
The old assign was single-assign scoped so region replicas could
end up on the same server.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java
Cleanup around forceNewPlan. Was confusing.
Added a Comparator to sort AssignProcedures so meta and system tables
come ahead of user-space tables.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
Remove the forceNewPlan argument on createAssignProcedure. Didn't make
sense given we were creating a new AssignProcedure; the arg had no
effect.
(createRoundRobinAssignProcedures) Recast to feed all regions to the balancer in
bulk and to sort the return so meta and system tables take precedence.
Miscellaneous fixes including keeping the Master around until all
RegionServers are down, documentation on how assignment retention
works, etc.
send assign meta region request to target server fail""
This is a revert of a revert; i.e a reapplication with the
log message fixed up and some added javadoc.
This reverts commit 9ef115163b.
Signed-off-by: Yi Liang <yliang@us.ibm.com>
- Deprecates old checkAnd*() operations in Table
- Adds Table#CheckAndMutateBuilder and implements it in HTable
Commiter note: When committing the patch, noticed redundant {@inheritDoc} being added in HTable.
Removed new and olds ones.
Signed-off-by: Apekshit Sharma <appy@apache.org>
- Adds debug logging for future ease
- Removes 60s timeout since testRecoveryAndDoubleExecutionPreserveSplits is only halfway after a minute.
- Adds some comments
- Logging change: Some places report "regionState=" while others just "state=".
State machine procs also have "state=" in their logs. Let me change all region related logging to "regionState=" so that
1) it's consistent everywhere, 2) more filtered results when searching through logs.
Created a new WALKey Interface and a WALKeyImpl. The WALKey Interface
is surfaced to Coprocessors and throughout most of the code base.
WALKeyImpl is used internally by WAL and by Replication which need
access to WALKey setters.
Methods that were deprecated in WALObserver because they were exposing
Private audience Classes have been undeprecated now we have WALKey.
Moved over to use SequenceId#getSequenceId throughout. Changed
SequenceId#getSequenceId removing the IOE.
- Changed testThreeRSAbort to kill the RSs intead of aborting. Simple aborting will close the regions, we want extreme failure testing here.
- Adds some logging for easier debugging.
- Refactors TestDistributedLogSplitting to use standard junit rules.
Move the hadoop-hdfs guava exclude in modules up to the top pom.
Looks like an exclude in a module is not additive but rather exclusive
blanking out the top level set of exclusions.
Tested by looking in lib dir of the built tarball.
RowCounter and other related HBase's MapReduce classes have been moved
to hbase-mapreduce component by HBASE-18640, related chapter was
out-of-date and this fix replaced hbase-server with hbase-mapreduce
to correct those commands
Also this change moved RowCounter_Counters.properties
to hbase-mapreduce package as well
JIRA https://issues.apache.org/jira/browse/HBASE-19023
Signed-off-by: tedyu <yuzhihong@gmail.com>
null from other coprocessor even on bypass
If 'bypass' is set by a Coprocessor, skip out on calling any subsequent
Coprocessors that might be chained to a bypassable method.
This patch restores some of the 'complete' behavior removed by
HBASE-19123 only 'bypass' now triggers 'complete'.
For a egionserver's view of a table (the regions
that belong to a table hosted on a regionserver),
this change tracks the latencies of operations that
affect the regions for this table.
Tracking at the per-table level avoids the memory bloat
and performance impact that accompanied the previous
per-region latency metrics while still providing important
details for operators to consume.
Signed-Off-By: Andrew Purtell <apurtell@apache.org>
- Adding javadoc comments
- Bug: ServerStateNode#regions is HashSet but there's no synchronization to prevent concurrent addRegion/removeRegion. Let's use concurrent set instead.
- Use getRegionsInTransitionCount() directly to avoid instead of getRegionsInTransition().size() because the latter copies everything into a new array - what a waste for just the size.
- There's mixed use of getRegionNode and getRegionStateNode for same return type - RegionStateNode. Changing everything to getRegionStateNode. Similarly rename other *RegionNode() fns to *RegionStateNode().
- RegionStateNode#transitionState() return value is useless since it always returns it's first param.
- Other minor improvements
It seems like the original reason this execution filter was added is no
longer an issue for 2.0. Actually, these entries actually preclude
Eclipse from correctly using the Java8 source/target version that we
have specified (which creates numerous compilation errors in Eclipse)
Signed-off-by: Guanghao Zhang <zghao@apache.org>
Update timeouts for TestRegionObserverInterface.
Reason: There are ~10 tests there, each with 5 min individual timeout. Too much. The test class is labelled MediumTests,
let's used that with our standard CategoryBasedTimeout. 3 min per test function should be enough even on slower Apache machines.
This avoids the situation where the build machine has sufficient disk
space (a few GB's at most) to run an HBase test, but the default YARN
configuration would preclude the NM's from starting correctly. This
should eliminate a trivial source of build flakiness based on the host
machines being used.
Signed-off-by: Ted Yu <tedyu@apache.org>
Removed TestHRegion#testMemstoreSizeWithFlushCanceling test.
CPs are not able to cancel a flush any more so this test was
failing; removed it (It was testing memory accounting kept
working across a cancelled flush).
Signed-off-by: Michael Stack <stack@apache.org>
Use header and footer in our *.jsp pages to avoid unnecessary redundancy (copy-paste of code)
Misc edits:
- Due to redundancy, new additions make it to some places but not others. For eg there are missing links to "/logLevel", "/processRS.jsp" in few places.
- Fix processMaster.jsp wrongly pointing to rs-status instead of master-status (probably due to copy paste from processRS.jsp)
- Deleted a bunch of extraneous "</a>" in processMaster.jsp & processRS.jsp
- Added missing </div> tag in snapshot.jsp
- Deleted fossils of html5shiv.js. It's uses and the js itself were deleted in the commit "819aed4ccd073d818bfef5931ec8d248bfae5f1f"
- Fixed wrongly matched heading tags
- Deleted some unused variables
Tested:
Ran standalone cluster and opened each page to make sure it looked right.
Sidenote:
Looks like HBASE-3835 started the work of converting from jsp to jamon, but the work didn't finish. Now we have a mix of jsp and jamon. Needs reconciling, but later.
- Moved DrainingServerTracker and RegionServerTracker to hbase-server:o.a.h.h.master.
- Moved SplitOrMergeTracker to oahh.master (because it depends on a PB)
- Moving hbase-client:oahh.zookeeper.* to hbase-zookeeper module. After HBASE-19200, hbase-client doesn't need them anymore (except 3 classes).
- Renamed some classes to use a consistent naming for classes - ZK instead of mix of ZK, Zk , ZooKeeper. Couldn't rename following public classes: MiniZooKeeperCluster, ZooKeeperConnectionException. Left RecoverableZooKeeper for lack of better name. (suggestions?)
- Sadly, can't move tests out because they depend on HBaseTestingUtility (which defeats part of the purpose - trimming down hbase-server tests. We need to promote more use of mocks in our tests)
Remove a few unused imports.
Remove TestAsyncRegionAdminApi#testOffline, a test for a condition that
no longer exists (no offlining supported in hbase2).
M hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController3.java
Uncomment cleanup called in test teardown.
Restore testRegionCaching and testMulti to working state (required
fixing move procedure and looking for a new exception).
testClusterStatus is broke because multicast is broken.
Updated HTrace version to 4.2
Created TraceUtil class to wrap htrace methods. Uses try with resources.
Signed-off-by: Balazs Meszaros <balazs.meszaros@cloudera.com>
Signed-off-by: Michael Stack <stack@apache.org>
There is no functionality change except for below:
* Variable lastIndexExclusive was getting incremented while locking rows corresponding to input
operations. As a result when getRowLockInternal() method throws TimeoutIOException only operations
in range [nextIndexToProcess, lastIndexExclusive) was getting marked as FAILED before raising
exception up the call stack. With these changes all operations are getting marked as FAILED.
* Cluster Ids of first mutation is used consistently for entire batch. Previous behavior was to use
cluster ids of first mutation in a mini-batch
Signed-off-by: Michael Stack <stack@apache.org>
KeyValue.Type, and its corresponding byte value, are not public API. We
shouldn't have methods that are expecting them. Added a basic sanity
test for isPut and isDelete.
Signed-off-by: Ramkrishna <ramkrishna.s.vasudevan@intel.com>
* pull things that don't rely on HDFS in hbase-server/FSUtils into hbase-common/CommonFSUtils
* refactor setStoragePolicy so that it can move into hbase-common/CommonFSUtils, as a side effect update it for Hadoop 2.8,3.0+
* refactor WALProcedureStore so that it handles its own FS interactions
* add a reflection-based lookup of stream capabilities
* call said lookup in places where we make WALs to make sure hflush/hsync is available.
* javadoc / checkstyle cleanup on changes as flagged by yetus
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
Last mockito-all release was in Dec'14. Mockito-core has had many releases since then.
From mockito's site:
- "Mockito does not produce the mockito-all artifact anymore ; this one was primarily
aimed at ant users, and contained other dependencies. We felt it was time to move on
and remove such artifacts as they cause problems in dependency management system like
maven or gradle."
- anyX() and any(SomeType.class) matchers now reject nulls and check type.
'bypass' logic case by case
Changes Coprocessor ObserverContext 'bypass' semantic. We flip the
default so bypass is NOT supported on Observer invocations; only a
couple of preXXX methods in RegionObserver allow it: e.g. preGet
and prePut but not preFlush, etc. Everywhere else, we throw
a DoesNotSupportBypassException if a Coprocessor Observer
tries to invoke bypass. Master Observers can no longer stop
or change move, split, assign, create table, etc.
Ditto on complete, the mechanism that allowed a Coprocessor
rule that all subsequent Coprocessors are skipped in an
invocation chain; now, complete is only available to
bypassable methods (and Coprocessors will get an exception if
they try to 'complete' when it is not allowed).
See javadoc for whether a Coprocessor Observer method supports
'bypass'. If no mention, 'bypass' is NOT supported.
M hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
Added passing of 'bypassable' (and 'completable') and default 'result' argument to
the Operation constructors rather than pass the excecution engine as parameters.
Makes it so can clean up RegionObserverHost and make the calling
clearer regards what is going on.
Methods that support 'bypass' must set this flag on the Observer.
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
Refactoring in here is minor. A few methods that used support bypass
no longer do so removed the check and the need of an if/else meant a
left-shift in some code.
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
Ditto
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java
In here label explicitly those methods that are bypassable.
Some changes to make sure we call the corresponding execOperation.
TestMasterObserver had a bunch of test of bypass method. All removed or
disabled.
TODO: What to do w/ the Scanner methods.
Deprecate Table::existsAll method and add Table::exists.
RemoteHTable already had a deprecated exists method, remove that
and implement the new exists from Table interface.
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
Added filterCell method to Filter, it calls filterKeyValue by default
Deprecated filterKeyValue in Filter, bud added default functionality to return Filter.ReturnCode.INCLUDE.
Added filterKeyValue (calling filterCell) to Filters extending FilterBase to be backward compatible.
renamed filterKeyValue to filterCell in all implementations
changed all internal calls to use filterCell instead of filterKeyValue
changed tests too
This way the change is simple and backward compatible.
Any existing custom filter should work since they override filterKeyValue
and the implementation is called by Filter.filterCell.
Moved FilterWrapper to hbase-server
Signed-off-by: anoopsamjohn <anoopsamjohn@gmail.com>
As part of HBASE-18978 the rpc timeout methods gets aligned
between Table and AsyncTable interfaces.
Deprecate the following methods in Table:
- int getRpcTimeout()
- int getReadRpcTimeout()
- int getWriteRpcTimeout()
- int getOperationTimeout()
Add the following methods to Table:
- long getRpcTimeout(TimeUnit)
- long getReadRpcTimeout(TimeUnit)
- long getWriteRpcTimeout(TimeUnit)
- long getOperationTimeout(TimeUnit)
Fix some javadoc issues.
Signed-off-by: Michael Stack <stack@apache.org>
Change name of Interface OnlineRegions to MutableOnlineRegions.
Change name of Interface ImmutableOnlineRegions to OnlineRegions.
Did this since OnlineRegions is for consumer other than internals.
Add a getOnlineRegions to the RegionCoprocessorEnvironment and to
RegionServerCoprocessorEnvironment so CPs can 'access' local
Regions directly.
- Merged BaseCSM class into CSM interface
- Removed config hbase.coordinated.state.manager.class
- Since state manager is not pluggable anymore, we don't need start/stop/initialize to setup unknown classes. Our internal ZkCSM now requires Server in constructor itself. Makes the dependency clearer too.
- Removed CSM from HRegionServer and HMaster constructor. Although it's a step back from dependency injection, but it's more consistent with our current (not good) pattern where we initialize everything in the ctor itself.
Change-Id: Ifca06bb354adec5b11ea1bad4707e014410491fc
Breaks MemStoreSize into MemStoreSize (read-only) and MemStoreSizing
(read/write). MemStoreSize we allow to Coprocesors. MemStoreSizing we
use internally doing MemStore accounting.
Patch to start a standalone RegionServer that register's itself and
optionally stands up Services. Can work w/o a Master in the mix.
Useful testing. Also can be used by hbase-indexer to put up a
Replication sink that extends public-facing APIs w/o need to extend
internals. See JIRA release note for detail.
This patch adds booleans for whether to start Admin and Client Service.
Other refactoring moves all thread and service start into the one fat
location so we can ask to by-pass 'services' if we don't need them.
See JIRA for an example hbase-server.xml that has config to shutdown
WAL, cache, etc.
Adds checks if a service/thread has been setup before going to use it.
Renames the ExecutorService in HRegionServer from service to
executorService.
See JIRA too for example Connection implementation that makes use of
Connection plugin point to receive a replication stream. The default
replication sink catches the incoming replication stream, undoes the
WALEdits and then creates a Table to call a batch with the
edits; up on JIRA, an example Connection plugin (legit, supported)
returns a Table with an overridden batch method where in we do index
inserts returning appropriate results to keep the replication engine
ticking over.
Upsides: an unadulterated RegionServer that will keep replication metrics
and even hosts a web UI if wanted. No hacks. Just ordained configs
shutting down unused services. Injection of the indexing function at a
blessed point with no pollution by hbase internals; only public imports.
No user of Private nor LimitedPrivate classes.
A hack to "hide" the protobufs, but it's not going to be a trivial
change to remove use of protobufs entirely as they're serialized
into the hbase:quota table.
* Change imports from org.codehaus to com.fasterxml
* Exclude transitive jackson1 from hadoop and others
* Minor test cleanup to add assert messages, fix some parameter order
* Add anti-pattern check for using jackson 1 imports
* Add explicit non-null serialization directive to ScannerModel
Purges Server, MasterServices, and RegionServerServices from
CoprocessorEnvironments. Replaces removed functionality with
a set of carefully curated methods on the *CoprocessorEnvironment
implementations (Varies by CoprocessorEnvironment in that the
MasterCoprocessorEnvironment has Master-type facility exposed,
and so on).
A few core Coprocessors that should long ago have been converted
to be integral, violate their context; e.g. a RegionCoprocessor
wants free access to a hosting RegionServer (which may or may not
be present). Rather than let these violators make us corrupte the
CP API, instead, we've made up a hacky system that allows core
Coprocessors access to internals. A new CoreCoprocessor Annotation
has been introduced. When loading Coprocessors, if the instance is
annotated CoreCoprocessor, we pass it an Environment that has been
padded w/ extra-stuff. On invocation, CoreCoprocessors know how to
route their way to these extras in their environment.
See the *CoprocessoHost for how the do the check for CoreCoprocessor
and pass a fatter *Coprocessor, one that allows getting of either
a RegionServerService or MasterService out of the environment
via Marker Interfaces.
Removed org.apache.hadoop.hbase.regionserver.CoprocessorRegionServerServices
M hbase-endpoint/src/main/java/org/apache/hadoop/hbase/security/access/SecureBulkLoadEndpoint.java
This Endpoint has been deprecated because its functionality has been
moved to core. Marking it a CoreCoprocessor in the meantime to
minimize change.
M hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupAdminEndpoint.java
This should be integral to hbase. Meantime, marking it CoreCoprocessor.
M hbase-server/src/main/java/org/apache/hadoop/hbase/Server.java
Added doc on where it is used and added back a few methods we'd
removed.
A hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/CoreCoprocessor.java
New annotation for core hbase coprocessors. They get richer environment
on coprocessor loading.
A hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/HasMasterServices.java
A hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/HasRegionServerServices.java
Marker Interface to access extras if present.
M hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterCoprocessorEnvironment.java
Purge MasterServices access. Allow CPs a Connection.
M hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/RegionCoprocessorEnvironment.java
Purge RegionServerServices access. Allow CPs a Connection.
M hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/RegionServerCoprocessorEnvironment.java
Purge MasterServices access. Allow CPs a Connection.
M hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/MasterSpaceQuotaObserver.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/QuotaCache.java
We no longer have access to MasterServices. Don't need it actually.
Use short-circuiting Admin instead.
D hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorRegionServerServices.java
Removed. Not needed now we do CP Env differently.
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
No need to go via RSS to getOnlineTables; just use HRS.
And so on. Adds tests to ensure we can only get at extra info
if the CP has been properly marked.
- Table moving to RSG was buggy, because it left the table unassigned.
Now it is fixed we immediately assign to an appropriate RS
(MoveRegionProcedure).
- Table was locked while moving, but unassign operation hung, because
locked table queues are not scheduled while locked. Fixed.
- ProcedureSyncWait was buggy, because it searched the procId in
executor, but executor does not store the return values of internal
operations (they are stored, but immediately removed by the cleaner).
- list_rsgroups in the shell show also the assigned tables and servers.
Signed-off-by: Michael Stack <stack@apache.org>
* batch validation and preparation is done before we start iterating over operations for writes
* durability, familyCellMaps and observedExceptions are batch wide and are now sotred in BatchOperation,
as a result durability is consistent across all operations in a batch
* for all operations done by preBatchMutate() CP hook, operation status is updated to success
* doWALAppend() is modified to habdle replay and is used from doMiniBatchMutate()
* minor improvements
Signed-off-by: Michael Stack <stack@apache.org>
The archived Procedure WALs are moved to <hbase_root>/oldWALs/masterProcedureWALs
directory. TimeToLiveProcedureWALCleaner class was added which
regularly cleans the Procedure WAL files from there.
The TimeToLiveProcedureWALCleaner is now added to
hbase.master.logcleaner.plugins to clean the 2 WALs in one run.
A new config parameter is added hbase.master.procedurewalcleaner.ttl
which specifies how long a Procedure WAL should stay in the
archive directory.
Signed-off-by: Michael Stack <stack@apache.org>
This reverts commit 0d0c330401.
Backing out filterlist regression, see HBASE-18957. Work continuing branch for HBASE-18410.
Signed-off-by: Peter Somogyi <psomogyi@cloudera.com>
Signed-off-by: Michael Stack <stack@apache.org>
This reverts commit 743f3ae221.
Backing out filterlist regression, see HBASE-18957. Work continuing branch for HBASE-18410.
Signed-off-by: Peter Somogyi <psomogyi@cloudera.com>
Signed-off-by: Michael Stack <stack@apache.org>
This reverts commit 7c2622baf7.
Backing out filterlist regression, see HBASE-18957. Work continuing branch for HBASE-18410.
Signed-off-by: Peter Somogyi <psomogyi@cloudera.com>
Signed-off-by: Michael Stack <stack@apache.org>
Patch ArrayIndexOutOfBoundsException when current() is called after
advance() has already returned false
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
These functions have been changed to return Optional<T> instead of T, where T = old return type.
- ObserverContext#getCaller
- RpcCallContext#getRequestUser
- RpcCallContext#getRequestUserName
- RpcServer#getCurrentCall
- RpcServer#getRequestUser
- RpcServer#getRequestUserName
- RpcServer#getRemoteAddress
- ServerCall#getRequestUser
Change-Id: Ib7b4e6be637283755f55755dd4c5124729f7052e
Signed-off-by: Apekshit Sharma <appy@apache.org>
CompactionRequest was removed from CP in HBASE-18453, this change reintroduces
CompatcionRequest to CP as a read-only interface called CompactionRequest.
The CompactionRequest class is renamed to CompactionRequestImpl.
Additionally, this change removes selectionTimeInNanos from CompactionRequest and
uses selectionTime as a replacement. This means that CompactionRequest:toString
is modified and compare as well.
Signed-off-by: Michael Stack <stack@apache.org>
- Change Service Coprocessor#getService() to List<Service> Coprocessor#getServices()
- Checkin the finalized design doc into repo
- Added example to javadoc of Coprocessor base interface on how to implement one in the new design
------------------------------------------------------
TL;DR
------------------------------------------------------
We are moving from Inheritence
- Observer *is* Coprocessor
- FooService *is* CoprocessorService
To Composition
- Coprocessor *has* Observer
- Coprocessor *has* Service
------------------------------------------------------
Design Changes
------------------------------------------------------
- Adds four new interfaces - MasterCoprocessor, RegionCoprocessor, RegionServierCoprocessor,
WALCoprocessor
- These new *Coprocessor interfaces have a get*Observer() function for each observer type
supported by them.
- Added Coprocessor#getService() to base interface. All extending *Coprocessor interfaces will
get it from the base interface.
- Added BulkLoadObserver hooks to RegionCoprocessorHost instad of SecureBulkLoadManager doing its
own trickery.
- CoprocessorHost#find*() fuctions: Too many testing hooks digging into CP internals.
Deleted if can, else marked @VisibleForTesting.
------------------------------------------------------
Backward Compatibility
------------------------------------------------------
- Old coprocessors implementing *Observer won't get loaded (no backward compatibility guarantees).
- Third party coprocessors only implementing Coprocessor will not get loaded (just like Observers).
- Old coprocessors implementing CoprocessorService (for master/region host)
/SingletonCoprocessorService (for RegionServer host) will continue to work with 2.0.
- Added test to ensure backward compatibility of CoprocessorService/SingletonCoprocessorService
- Note that if a coprocessor implements both observer and service in same class, its service
component will continue to work but it's observer component won't work.
------------------------------------------------------
Notes
------------------------------------------------------
Did a side-by-side comparison of CPs in master and after patch. These coprocessors which were just
CoprocessorService earlier, needed a home in some coprocessor in new design. For most it was clear
since they were using a particular type of environment. Some were tricky.
- JMXListener - MasterCoprocessor and RSCoprocessor (because jmx listener makes sense for
processes?)
- RSGroupAdminEndpoint --> MasterCP
- VisibilityController -> MasterCP and RegionCP
These were converted to RegionCoprocessor because they were using RegionCoprocessorEnvironment
which can only come from a RegionCPHost.
- AggregateImplementation
- BaseRowProcessorEndpoint
- BulkDeleteEndpoint
- Export
- RefreshHFilesEndpoint
- RowCountEndpoint
- MultiRowMutationEndpoint
- SecureBulkLoadEndpoint
- TokenProvider
Change-Id: I813145f2bc11815f52ac703563b879962c249764
Adjust exception control flow to fix findbugs warning
NP_NULL_ON_SOME_PATH_EXCEPTION, Possible null pointer dereference of
regionSink in org.apache.hadoop.hbase.tool.Canary$RegionMonitor.run()
on exception path
Added assertion checks to make sure that the error code for the
ToolRunner run() method is used.
Testing Done: Checked that TestCanaryTool unit tests fail when there is
an error code in the current Canary run.
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Changed the type hierarchy of Canary sinks to reduce confusion and avoid
cast errors.
Testing Done: Ran the TestCanaryTool.java test suite and confirmed that
the working is correct.
Signed-off-by: Andrew Purtell <apurtell@apache.org>
This provides significant cluster start time reduction for FileSystems which do not surface locality (S3).
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Upgrade commons-math:2.2 to commons-math3:3.6.1
Remove commons-math 2 specific content from LICENSE.vm
Add missing jersey-client dependency to hbase-it module
Signed-off-by: Michael Stack <stack@apache.org>
This is based on patch sent me by Balazs Meszaros. The good stuff in
here is from him. This patch does less than his ambition. It changes
Admin class only. Can work on making AsyncAdmin cohere in a follow-on.
* Deprecates getAlterStatus. Everywhere else we talk of 'modify' rather
'alter' and should use Future returned from async instead.
* isTableAvailable(TableName, byte [][]) has been deprecated to be
removed; use the overrie instead. This is a weird method.
* Changed listTableDescriptor to getDescriptor.
* Renamed other like methods to have same pattern (deprecating the old):
balancer => balance
setBalancerRunning => balancerSwitch
setNormalizerRunning => normalizerSwitch
enableCatalogJanitor => catalogJanitorSwitch
setCleanerChoreRunning => cleanerChoreSwitch
setSplitOrMergeEnabled => splitOrMergeEnabledSwitch
* Renamed (with deprecation of old) runCatalogScan => runCatalogJanitor.
* Reviewed generated javadoc and made some edits; purged reference to
hbase issues from our API, fixed param names, etc.
* Made all the enable services methods have same pattern.
* Renamed takeSnapshotAsync as snapshotAsync (with deprecation of old)
* Renamed execProcedureWithRet as execProcedureWithReturn (with
deprecation)
Signed-off-by: Michael Stack <stack@apache.org>
Includes partial backport of hbase-build-configuration module
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Misty Stanley-Jones <misty@apache.org>
Javadoc for following methods are updated:
* Table.put(List<Put> puts)
* Table.delete(List<Delete> deletes)
Added @apiNote for delete regarding input list will not be modied in version 3.0.0
Signed-off-by: Michael Stack <stack@apache.org>
Main changes:
- ProcedureInfo and LockInfo were removed, we use JSON instead of them
- Procedure and LockedResource are their server side equivalent
- Procedure protobuf state_data became obsolate, it is only kept for
reading previously written WAL
- Procedure protobuf contains a state_message field, which stores the internal
state messages (Any type instead of bytes)
- Procedure.serializeStateData and deserializeStateData were changed slightly
- Procedures internal states are available on client side
- Procedures are displayed on web UI and in shell in the following jruby format:
{ ID => '1', PARENT_ID = '-1', PARAMETERS => [ ..extra state information.. ] }
Signed-off-by: Michael Stack <stack@apache.org>
* testSimpleMasterFailover - fixed and verified
* testPendingOpenOrCloseWhenMasterFailover - removed as logic is based on old code and no longer relevant. TestServerCrashProcedure tests assignments with crashing master and region servers
* testMetaInTransitionWhenMasterFailover - verified that it is fixed by patch for HBASE-18511.
Signed-off-by: Michael Stack <stack@apache.org>
Upgrade commons-collections:3.2.2 to commons-collections4:4.1
Add missing dependency for hbase-procedure, hbase-thrift
Replace CircularFifoBuffer with CircularFifoQueue in WALProcedureStore and TaskMonitor
Signed-off-by: Sean Busbey <busbey@apache.org>
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
(cherry picked from commit 137b105c67)
This correctly casts the operand to a long to avoid negative offsets created by sign extending the integer operand.
Signed-off-by: tedyu <yuzhihong@gmail.com>
Do a pass with dependency:analyze; remove unused and
explicity list the dependencies we exploit.
Remove the parent dependencies set which had junit, mockito,
log4j, and findbugs annotations (had to put junit back
temporarily in subsequent version of this patch TODO). Listing in
parent set meant these libs were dependencies for all modules
which in practice was not the case. Edited all modules so
those that need any from this parent set now do explicit listing.
Ran the dependency:analyze over the project. Acted on most
suggested removals and requests for explicit listing. Some
grey areas remain around transitives that come in with
hadoop -needs better excludes, another project- and that
the dependency:analyze tool is not always accurate in its
reporting.
Move Driver to be the main-class in hbase-mapreduce jar rather than
in the hbase-server jar.
Reference the hbase-server and shaded protobuf so they get bundled
when you do 'hbase mapredcp'.
classpath issue when running as a developer fixed
removed thrift webapp from server (it's not used at all, since moved to thrift webapp)
Signed-off-by: tedyu <yuzhihong@gmail.com>
- Moves out o.a.h.h.{mapred, mapreduce} to new hbase-mapreduce module which depends
on hbase-server because of classes like *Snapshot{Input,Output}Format.java, WALs, replication, etc
- hbase-backup depends on it for WALPlayer and MR job stuff
- A bunch of tools needed to be pulled into hbase-mapreduce becuase of their dependencies on MR.
These are: CompactionTool, LoadTestTool, PerformanceEvaluation, ExportSnapshot
This is better place of them than hbase-server. But ideal place would be in separate hbase-tools module.
- There were some tests in hbase-server which were digging into these tools for static util funtions or
confs. Moved these to better/easily shared place. For eg. security related stuff to HBaseKerberosUtils.
- Note that hbase-mapreduce has secondPartExecution tests. On my machine they took like 20 min, so maybe
more on apache jenkins. That's basically equal reduction of runtime of hbase-server tests, which is a
big win!
Change-Id: Ieeb7235014717ca83ee5cb13b2a27fddfa6838e8
Breaking change to our ReplicationEndpoint and BaseReplicationEndpoint.
ReplicationEndpoint implemented Guava 0.12 Service. An abstract
subclass, BaseReplicationEndpoint, provided default implementations
and facility, among other things, by extending Guava
AbstractService class.
Both of these HBase classes were marked LimitedPrivate for
REPLICATION so these classes were semi-public and made it so
Guava 0.12 was part of our API.
Having Guava in our API was a mistake. It anchors us and the
implementation of the Interface to Guava 0.12. This is untenable
given Guava changes and that the Service Interface in particular
has had extensive revamp and improvement done. We can't hold to
the Guava Interface. It changed. We can't stay on Guava 0.12;
implementors and others on our CLASSPATH won't abide being stuck
on an old Guava.
So this class makes breaking changes. The unhitching of our Interface
from Guava could only be done in a breaking manner. It undoes the
LimitedPrivate on BaseReplicationEndpoint while keeping it for the RE
Interface. It means consumers will have to copy/paste the
AbstractService-based BRE into their own codebase also supplying their
own Guava; HBase no longer 'supplies' this (our Guava usage has
been internalized, relocated).
This patch then adds into RE the basic methods RE needs of the old
Guava Service rather than return a Service to start/stop only to go
back to the RE instance to do actual work. A few method names had to
be changed so could make implementations with Guava Service internally
and not have RE method names and types clash). Semantics remained the
same otherwise. For example startAsync and stopAsync in Guava are start
and stop in RE.
* Fixed ServerCrashProcedure to set forceNewPlan to false for instances AssignProcedure. This enables balancer to find most suitable target server
* Fixed and enabled TestRestartCluster#testRetainAssignmentOnRestart on master
* Renamed method ServerName@isSameHostnameAndPort() to isSameAddress()
Signed-off-by: Michael Stack <stack@apache.org>
Instead of using an Atomic Reference to data and aborting when we detect
that new data comes in, use the native cancellation/pre-emption features
of Java Future.
Behavior prior to these changes is to call expireServer(), log exception and suppress it. These changes will result in RS receiving the YouAreDeadException and treating it as a fatal error. This 'fail fast' approach will help us stabilize the code. This behavior can be reconsidered later if necessary.
Signed-off-by: Michael Stack <stack@apache.org>
Changes the configuration hbase.balancer.tablesOnMaster from list of
table names to instead be a boolean; true if master carries
tables/regions and false if it does not.
Adds a new configuration hbase.balancer.tablesOnMaster.systemTablesOnly.
If true, hbase.balancer.tablesOnMaster is considered true but only
system tables are put on the master.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Master was claiming itself active master though it had stopped. Fix
the activeMaster flag. Set it to false on exit.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java
Add new configs and convenience methods for getting current state of
settings.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
Move configs up into super Interface and now the settings mean
different, remove the no longer needed processing.