Commit Graph

119 Commits

Author SHA1 Message Date
Sean Busbey 4baac1a782 HBASE-20544 ADDENDUM Make HBTU default to random ports.
Signed-off-by: Mike Drob <mdrob@apache.org>
Signed-off-by: Umesh Agashe <uagashe@cloudera.com>
2018-05-14 12:52:30 -05:00
Sean Busbey 61f96b6ffa HBASE-20544 Make HBTU default to random ports.
Signed-off-by: Umesh Agashe <uagashe@cloudera.com>
Signed-off-by: Josh Elser <elserj@apache.org>

 Conflicts:
	hbase-backup/src/test/resources/hbase-site.xml
	hbase-spark-it/src/test/resources/hbase-site.xml
	hbase-spark/src/test/resources/hbase-site.xml
2018-05-09 23:45:39 -07:00
Yechao Chen 75a8e53ce8 HBASE-20500 [rsgroup] should keep at least one server in default group
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-05-08 08:35:17 -07:00
Balazs Meszaros e93cfb52d0 HBASE-20465 Fix TestEnableRSGroup flaky 2018-04-22 15:39:18 -07:00
tedyu 8df45d33cd HBASE-20224 Web UI is broken in standalone mode - addendum for hbase-endpoint and hbase-rsgroup modules 2018-04-03 08:34:17 -07:00
Chia-Ping Tsai 95596e8ba7 HBASE-20119 Introduce a pojo class to carry coprocessor information in order to make TableDescriptorBuilder accept multiple cp at once
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Michael Stack <stack@apache.org>
2018-03-16 01:26:08 +08:00
haxiaolin a08ade9f1a HBASE-20104 Fix infinite loop of RIT when creating table on a rsgroup that has no online servers
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-03-01 08:34:58 -08:00
Chia-Ping Tsai c459282fe0 HBASE-20093 Replace ServerLoad by ServerMetrics for ServerManager
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-02-28 15:05:52 +08:00
haxiaolin 0bf33c802d HBASE-19974 Fix decommissioned servers cannot be removed by remove_servers_rsgroup methods
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-02-26 07:31:58 -08:00
haxiaolin 285653de3c HBASE-19937 Ensure createRSGroupTable be called after ProcedureExecutor and LoadBalancer are initialized
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-02-08 22:42:23 -08:00
tedyu 26ccbc49be HBASE-19949 TestRSGroupsWithACL fails with ExceptionInInitializerError 2018-02-07 04:46:09 -08:00
Duo Zhang bbf3bae72a
HBASE-19873 Add a CategoryBasedTimeout ClassRule for all UTs 2018-01-29 12:41:14 -08:00
Peter Somogyi f5779855c6 HBASE-19845 Fix findbugs and error-prone warnings in hbase-rsgroup (branch-2)
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-01-24 08:41:43 -08:00
tedyu 9ed52ee3e5 HBASE-19752 RSGroupBasedLoadBalancer#getMisplacedRegions() should handle the case where rs group cannot be determined 2018-01-12 12:16:06 -08:00
Guangxu Cheng b31b386f7f HBASE-19483 Add proper privilege check for rsgroup commands addendum fixes unit test
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-01-10 06:30:03 -08:00
Guangxu Cheng 3fa3dcd9f9 HBASE-19483 Add proper privilege check for rsgroup commands
Signed-off-by: tedyu <yuzhihong@gmail.com>
2018-01-09 08:11:07 -08:00
Chia-Ping Tsai 654edc5fa5 HBASE-19596 RegionMetrics/ServerMetrics/ClusterMetrics should apply to all public classes 2018-01-04 13:05:21 +08:00
Mike Drob 64cb777a8a HBASE-19552 find-and-replace thirdparty offset 2017-12-28 12:01:25 -06:00
Chia-Ping Tsai 7dee1bcd31 HBASE-19644 add the checkstyle rule to reject the illegal imports 2017-12-28 04:17:45 +08:00
Chia-Ping Tsai d4af099e9e HBASE-19496 Reusing the ByteBuffer in rpc layer corrupt the ServerLoad and RegionLoad 2017-12-22 18:58:08 +08:00
Balazs Meszaros 992b5d8630 HBASE-10092 Move up on to log4j2
Changes:
- replaced commons-logging to slf4j everywhere
- log.XXX(Throwable) calls were replaced with log.XXX(t.toString(), t)
- log.XXX(Object) calls were replaced with log.XXX(Objects.toString(obj))
- log.fatal() calls were replaced with log.error(HBaseMarkers.FATAL, ...)
- programmatic log4j configuration was removed from the unit test

This commit does not affect the current logging configurations, because log4j
is still on the classpath. slf4j-log4j12 binds log4j to slf4j.

Signed-off-by: Michael Stack <stack@apache.org>
2017-12-20 22:58:12 -08:00
Michael Stack 3366ebdc56
HBASE-19461 TestRSGroups is broke 2017-12-08 15:10:23 -08:00
Guangxu Cheng 01d366da4a HBASE-19326 Remove decommissioned servers from rsgroup
Signed-off-by: Michael Stack <stack@apache.org>

Conflicts:
	hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupAdminEndpoint.java
2017-12-01 11:12:34 -08:00
Apekshit Sharma 4f4aac77e1 HBASE-19367 Refactoring in RegionStates, and RSProcedureDispatcher
- Adding javadoc comments
- Bug: ServerStateNode#regions is HashSet but there's no synchronization to prevent concurrent addRegion/removeRegion. Let's use concurrent set instead.
- Use getRegionsInTransitionCount() directly to avoid instead of getRegionsInTransition().size() because the latter copies everything into a new array - what a waste for just the size.
- There's mixed use of getRegionNode and getRegionStateNode for same return type - RegionStateNode. Changing everything to getRegionStateNode. Similarly rename other *RegionNode() fns to *RegionStateNode().
- RegionStateNode#transitionState() return value is useless since it always returns it's first param.
- Other minor improvements
2017-11-29 22:42:39 -08:00
Apekshit Sharma e0c4f374b5 HBASE-19114 Split out o.a.h.h.zookeeper from hbase-server and hbase-client
- Moved DrainingServerTracker and RegionServerTracker to hbase-server:o.a.h.h.master.
- Moved SplitOrMergeTracker to oahh.master (because it depends on a PB)
- Moving hbase-client:oahh.zookeeper.*  to hbase-zookeeper module.  After HBASE-19200, hbase-client doesn't need them anymore (except 3 classes).
- Renamed some classes to use a consistent naming for classes - ZK instead of mix of ZK, Zk , ZooKeeper. Couldn't rename following public classes: MiniZooKeeperCluster, ZooKeeperConnectionException. Left RecoverableZooKeeper for lack of better name. (suggestions?)
- Sadly, can't move tests out because they depend on HBaseTestingUtility (which defeats part of the purpose - trimming down hbase-server tests. We need to promote more use of mocks in our tests)
2017-11-17 13:23:28 -08:00
zhangduo 30f55f2316 HBASE-19200 Make hbase-client only depend on ZKAsyncRegistry and ZNodePaths
- Removes zookeeper connection from ClusterConnection
- Deletes class ZooKeeperKeepAliveConnection
- Removes Registry, ZooKeeperRegistry, and RegistryFactory
2017-11-10 10:09:04 -08:00
Andrew Purtell 47304efd7c HBASE-19194 TestRSGroupsBase has some always false checks 2017-11-08 09:33:34 -08:00
Guangxu Cheng 8095f96289 HBASE-19088 move_tables_rsgroup will throw an exception when the table is disabled
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2017-11-07 17:12:55 -08:00
Apekshit Sharma d69570a485 HBASE-18925 Update mockito dependency from mockito-all:1.10.19 to mockito-core:2.1.0 for JDK8 support.
Last mockito-all release was in Dec'14. Mockito-core has had many releases since then.

From mockito's site:
- "Mockito does not produce the mockito-all artifact anymore ; this one was primarily
aimed at ant users, and contained other dependencies. We felt it was time to move on
and remove such artifacts as they cause problems in dependency management system like
maven or gradle."
- anyX() and any(SomeType.class) matchers now reject nulls and check type.
2017-11-01 14:38:50 -07:00
Balazs Meszaros 6765f5e203
HBASE-18350 RSGroups are broken under AMv2
- Table moving to RSG was buggy, because it left the table unassigned.
  Now it is fixed we immediately assign to an appropriate RS
  (MoveRegionProcedure).
- Table was locked while moving, but unassign operation hung, because
  locked table queues are not scheduled while locked. Fixed.
- ProcedureSyncWait was buggy, because it searched the procId in
  executor, but executor does not store the return values of internal
  operations (they are stored, but immediately removed by the cleaner).
- list_rsgroups in the shell show also the assigned tables and servers.

Signed-off-by: Michael Stack <stack@apache.org>
2017-10-17 13:58:57 -07:00
Chia-Ping Tsai 6693f45faf HBASE-18839 Apply RegionInfo to code base 2017-09-28 20:19:41 +08:00
Apekshit Sharma 0c883a23c5 HBASE-17732 Coprocessor Design Improvements
------------------------------------------------------
TL;DR
------------------------------------------------------
We are moving from Inheritence
- Observer *is* Coprocessor
- FooService *is* CoprocessorService
To Composition
- Coprocessor *has* Observer
- Coprocessor *has* Service

------------------------------------------------------
Design Changes
------------------------------------------------------
- Adds four new interfaces - MasterCoprocessor, RegionCoprocessor, RegionServierCoprocessor,
  WALCoprocessor
- These new *Coprocessor interfaces have a get*Observer() function for each observer type
  supported by them.
- Added Coprocessor#getService() to base interface. All extending *Coprocessor interfaces will
  get it from the base interface.
- Added BulkLoadObserver hooks to RegionCoprocessorHost instad of SecureBulkLoadManager doing its
  own trickery.
- CoprocessorHost#find*() fuctions: Too many testing hooks digging into CP internals.
  Deleted if can, else marked @VisibleForTesting.

------------------------------------------------------
Backward Compatibility
------------------------------------------------------
- Old coprocessors implementing *Observer won't get loaded (no backward compatibility guarantees).
- Third party coprocessors only implementing Coprocessor will not get loaded (just like Observers).
- Old coprocessors implementing CoprocessorService (for master/region host)
  /SingletonCoprocessorService (for RegionServer host) will continue to work with 2.0.
- Added test to ensure backward compatibility of CoprocessorService/SingletonCoprocessorService
- Note that if a coprocessor implements both observer and service in same class, its service
  component will continue to work but it's observer component won't work.

------------------------------------------------------
Notes
------------------------------------------------------
Did a side-by-side comparison of CPs in master and after patch. These coprocessors which were just
CoprocessorService earlier, needed a home in some coprocessor in new design. For most it was clear
since they were using a particular type of environment. Some were tricky.

- JMXListener - MasterCoprocessor and RSCoprocessor (because jmx listener makes sense for
  processes?)
- RSGroupAdminEndpoint --> MasterCP
- VisibilityController -> MasterCP and RegionCP

These were converted to RegionCoprocessor because they were using RegionCoprocessorEnvironment
which can only come from a RegionCPHost.
- AggregateImplementation
- BaseRowProcessorEndpoint
- BulkDeleteEndpoint
- Export
- RefreshHFilesEndpoint
- RowCountEndpoint
- MultiRowMutationEndpoint
- SecureBulkLoadEndpoint
- TokenProvider

Change-Id: I813145f2bc11815f52ac703563b879962c249764
2017-09-27 12:45:51 -07:00
anoopsamjohn 0fcc84cadd HBASE-18298 RegionServerServices Interface cleanup for CP expose. 2017-09-27 11:02:57 +05:30
Reid Chan dc1db8c5b3 HBASE-18609 Apply ClusterStatus#getClusterStatus(EnumSet<Option>) in code base
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2017-09-14 01:00:19 +08:00
Sean Busbey d576e5a32d HBASE-17823 Migrate to Apache Yetus Audience Annotations
Includes partial backport of hbase-build-configuration module

Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Misty Stanley-Jones <misty@apache.org>
2017-09-12 23:15:50 -05:00
Umesh Agashe bd219c0fb8 HBASE-18674 upgrading to commons-lang3
Signed-off-by: Michael Stack <stack@apache.org>
2017-09-05 09:46:49 -07:00
Umesh Agashe fad968d99f HBASE-18575 [AMv2] Fixed and enabled TestRestartCluster#testRetainAssignmentOnRestart on master
* Fixed ServerCrashProcedure to set forceNewPlan to false for instances AssignProcedure. This enables balancer to find most suitable target server
* Fixed and enabled TestRestartCluster#testRetainAssignmentOnRestart on master
* Renamed method ServerName@isSameHostnameAndPort() to isSameAddress()

Signed-off-by: Michael Stack <stack@apache.org>
2017-08-23 10:11:54 -07:00
Michael Stack d5c6e11016 HBASE-17908 Upgrade guava
Pull in guava 22.0 by using the shaded version up in new hbase-thirdparty project.

In poms, exclude guava everywhere except on hadoop-common. Do this so
we minimize transitive includes. hadoop-common is needed because hadoop
Configuration uses guava doing preconditions.

Everywhere we used guava, instead use shaded so fix a load of imports.

Stopwatch API changed as did hashing and toStringHelper which is now
in MoreObjects class. Otherwise, minimal changes to come up on 22.0
2017-07-21 15:41:52 +01:00
tedyu 6e1907919a HBASE-18272 Fix issue about RSGroupBasedLoadBalancer#roundRobinAssignment where BOGUS_SERVER_NAME is involved in two groups (chenxu) 2017-06-27 07:03:59 -07:00
Michael Stack 3975bbd008 HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi) Move to a new AssignmentManager, one that describes Assignment using a State Machine built on top of ProcedureV2 facility.
This doc. keeps state on where we are at w/ the new AM:
https://docs.google.com/document/d/1eVKa7FHdeoJ1-9o8yZcOTAQbv0u0bblBlCCzVSIn69g/edit#heading=h.vfdoxqut9lqn
Includes list of tests disabled by this patch with reasons why.

Based on patches from Matteos' repository and then fix up to get it all to pass cluster
tests, filling in some missing functionality, fix of findbugs, fixing bugs, etc..
including:

    1. HBASE-14616 Procedure v2 - Replace the old AM with the new AM.
    The basis comes from Matteo's repo here:
    689227fcbf

    Patch replaces old AM with the new under subpackage master.assignment.
    Mostly just updating classes to use new AM -- import changes -- rather
    than the old. It also removes old AM and supporting classes.
    See below for more detail.

    2. HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi)
    3622cba4e3

    Adds running of remote procedure. Adds batching of remote calls.
    Adds support for assign/unassign in procedures. Adds version info
    reporting in rpc. Adds start of an AMv2.

    3. Reporting of remote RS version is from here:
    ddb4df3964.patch

    4. And remote dispatch of procedures is from:
    186b9e7c4d

    5. The split merge patches from here are also melded in:
    9a3a95a2c2
    and d6289307a0

We add testing util for new AM and new sets of tests.

Does a bunch of fixup on logging so its possible to follow a procedures' narrative by grepping
procedure id. We spewed loads of log too on big transitions such as master fail; fixed.

Fix CatalogTracker. Make it use Procedures doing clean up of Region data on split/merge.
Without these changes, ITBLL was failing at larger scale (3-4hours 5B rows) because we were
splitting split Regions among other things (CJ would run but wasn't
taking lock on Regions so havoc).

    Added a bunch of doc. on Procedure primitives.

    Added new region-based state machine base class. Moved region-based
    state machines on to it.

    Found bugs in the way procedure locking was doing in a few of the
    region-based Procedures. Having them all have same subclass helps here.

    Added isSplittable and isMergeable to the Region Interface.

    Master would split/merge even though the Regions still had
    references. Fixed it so Master asks RegionServer if Region
    is splittable.

    Messing more w/ logging. Made all procedures log the same and report
    the state the same; helps when logging is regular.

    Rewrote TestCatalogTracker. Enabled TestMergeTableRegionProcedure.

    Added more functionality to MockMasterServices so can use it doing
    standalone testing of Procedures (made TestCatalogTracker use it
    instead of its own version).

    Add to MasterServices ability to wait on Master being up -- makes
    it so can Mock Master and start to implement standalone split testing.
    Start in on a Split region standalone test in TestAM.

    Fix bug where a Split can fail because it comes in in the middle of
    a Move (by holding lock for duration of a Move).

    Breaks CPs that were watching merge/split. These are run by Master now
    so you need to observe on Master, not on RegionServer.

    Details:

    M hbase-client/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
    Takes List of regionstates on construction rather than a Set.
    NOTE!!!!! This is a change in a public class.

    M hbase-client/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
    Add utility getShortNameToLog

    M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
    M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ShortCircuitMasterConnection.java
    Add support for dispatching assign, split and merge processes.

    M hbase-client/src/main/java/org/apache/hadoop/hbase/master/RegionState.java
    Purge old overlapping states: PENDING_OPEN, PENDING_CLOSE, etc.

    M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
    Lots of doc on its inner workings. Bug fixes.

    M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
    Log and doc on workings. Bug fixes.

    A hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/RemoteProcedureDispatcher.java
    Dispatch remote procedures every 150ms or 32 items -- which ever
    happens first (configurable). Runs a timeout thread. This facility is
    not on yet; will come in as part of a later fix. Currently works a
    region at a time. This class carries notion of a remote procedure and of a buffer full of these.
    "hbase.procedure.remote.dispatcher.threadpool.size" with default = 128
    "hbase.procedure.remote.dispatcher.delay.msec" with default = 150ms
    "hbase.procedure.remote.dispatcher.max.queue.size" with default = 32

    M hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/ProtobufUtil.java
    Add in support for merge. Remove no-longer used methods.

    M hbase-protocol-shaded/src/main/protobuf/Admin.proto b/hbase-protocol-shaded/src/main/protobuf/Admin.proto
    Add execute procedures call ExecuteProcedures.

    M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
    Add assign and unassign state support for procedures.

    M hbase-server/src/main/java/org/apache/hadoop/hbase/client/VersionInfoUtil.java
    Adds getting RS version out of RPC
    Examples: (1.3.4 is 0x0103004, 2.1.0 is 0x0201000)

    M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
    Remove periodic metrics chore. This is done over in new AM now.
    Replace AM with the new. Host the procedures executor.

    M hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterMetaBootstrap.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterMetaBootstrap.java
    Have AMv2 handle assigning meta.

    M hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
    Extract version number of the server making rpc.

    A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java
    Add new assign procedure. Runs assign via Procedure Dispatch.
    There can only be one RegionTransitionProcedure per region running at the time,
    since each procedure takes a lock on the region.

    D hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignCallable.java
    D hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
    D hbase-server/src/main/java/org/apache/hadoop/hbase/master/BulkAssigner.java
    D hbase-server/src/main/java/org/apache/hadoop/hbase/master/GeneralBulkAssigner.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/GeneralBulkAssigner.java
    Remove these hacky classes that were never supposed to live longer than
    a month or so to be replaced with real assigners.

    D hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStateStore.java
    D hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
    D hbase-server/src/main/java/org/apache/hadoop/hbase/master/UnAssignCallable.java

    A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
    A procedure-based AM (AMv2).

    TODO
     - handle region migration
     - handle meta assignment first
     - handle sys table assignment first (e.g. acl, namespace)
     - handle table priorities
      "hbase.assignment.bootstrap.thread.pool.size"; default size is 16.
      "hbase.assignment.dispatch.wait.msec"; default wait is 150
      "hbase.assignment.dispatch.wait.queue.max.size"; wait max default is 100
      "hbase.assignment.rit.chore.interval.msec"; default is 5 * 1000;
      "hbase.assignment.maximum.attempts"; default is 10;

     A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
     Procedure that runs subprocedure to unassign and then assign to new location

     A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateStore.java
     Manage store of region state (in hbase:meta by default).

     A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStates.java
     In-memory state of all regions. Used by AMv2.

     A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
     Base RIT procedure for Assign and Unassign.

     A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
     Unassign procedure.

     A hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RSProcedureDispatcher.java
     Run region assignement in a manner that pays attention to target server version.
     Adds "hbase.regionserver.rpc.startup.waittime"; defaults 60 seconds.
2017-05-31 17:49:11 -07:00
Michael Stack a3c5a74487 Revert "HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi)"
Revert a mistaken commit!!!

This reverts commit dc1065a85d.
2017-05-24 23:31:36 -07:00
Michael Stack dc1065a85d HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi)
Move to a new AssignmentManager, one that describes Assignment using
a State Machine built on top of ProcedureV2 facility.

This doc. keeps state on where we are at w/ the new AM:
https://docs.google.com/document/d/1eVKa7FHdeoJ1-9o8yZcOTAQbv0u0bblBlCCzVSIn69g/edit#heading=h.vfdoxqut9lqn
Includes list of tests disabled by this patch with reasons why.

Based on patches from Matteos' repository and then fix up to get it all to pass cluster
tests, filling in some missing functionality, fix of findbugs, fixing bugs, etc..
including:

1. HBASE-14616 Procedure v2 - Replace the old AM with the new AM.
The basis comes from Matteo's repo here:
689227fcbf

Patch replaces old AM with the new under subpackage master.assignment.
Mostly just updating classes to use new AM -- import changes -- rather
than the old. It also removes old AM and supporting classes.
See below for more detail.

2. HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi)
3622cba4e3

Adds running of remote procedure. Adds batching of remote calls.
Adds support for assign/unassign in procedures. Adds version info
reporting in rpc. Adds start of an AMv2.

3. Reporting of remote RS version is from here:
ddb4df3964.patch

4. And remote dispatch of procedures is from:
186b9e7c4d

5. The split merge patches from here are also melded in:
9a3a95a2c2
and d6289307a0

We add testing util for new AM and new sets of tests.

Does a bunch of fixup on logging so its possible to follow a procedures' narrative by grepping
procedure id. We spewed loads of log too on big transitions such as master fail; fixed.

Fix CatalogTracker. Make it use Procedures doing clean up of Region data on split/merge.
Without these changes, ITBLL was failing at larger scale (3-4hours 5B rows) because we were
splitting split Regions among other things (CJ would run but wasn't
taking lock on Regions so havoc).

Added a bunch of doc. on Procedure primitives.

Added new region-based state machine base class. Moved region-based
state machines on to it.

Found bugs in the way procedure locking was doing in a few of the
region-based Procedures. Having them all have same subclass helps here.

Added isSplittable and isMergeable to the Region Interface.

Master would split/merge even though the Regions still had
references. Fixed it so Master asks RegionServer if Region
is splittable.

Messing more w/ logging. Made all procedures log the same and report
the state the same; helps when logging is regular.

Rewrote TestCatalogTracker. Enabled TestMergeTableRegionProcedure.

Added more functionality to MockMasterServices so can use it doing
standalone testing of Procedures (made TestCatalogTracker use it
instead of its own version).

Add to MasterServices ability to wait on Master being up -- makes
it so can Mock Master and start to implement standalone split testing.
Start in on a Split region standalone test in TestAM.

Fix bug where a Split can fail because it comes in in the middle of
a Move (by holding lock for duration of a Move).

Breaks CPs that were watching merge/split. These are run by Master now
so you need to observe on Master, not on RegionServer.

Details:

M hbase-client/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
Takes List of regionstates on construction rather than a Set.
NOTE!!!!! This is a change in a public class.

M hbase-client/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
Add utility getShortNameToLog

M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ShortCircuitMasterConnection.java
Add support for dispatching assign, split and merge processes.

M hbase-client/src/main/java/org/apache/hadoop/hbase/master/RegionState.java
Purge old overlapping states: PENDING_OPEN, PENDING_CLOSE, etc.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
Lots of doc on its inner workings. Bug fixes.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
Log and doc on workings. Bug fixes.

A hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/RemoteProcedureDispatcher.java
Dispatch remote procedures every 150ms or 32 items -- which ever
happens first (configurable). Runs a timeout thread. This facility is
not on yet; will come in as part of a later fix. Currently works a
region at a time. This class carries notion of a remote procedure and of a buffer full of these.
"hbase.procedure.remote.dispatcher.threadpool.size" with default = 128
"hbase.procedure.remote.dispatcher.delay.msec" with default = 150ms
"hbase.procedure.remote.dispatcher.max.queue.size" with default = 32

M hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/ProtobufUtil.java
Add in support for merge. Remove no-longer used methods.

M hbase-protocol-shaded/src/main/protobuf/Admin.proto b/hbase-protocol-shaded/src/main/protobuf/Admin.proto
Add execute procedures call ExecuteProcedures.

M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
Add assign and unassign state support for procedures.

M hbase-server/src/main/java/org/apache/hadoop/hbase/client/VersionInfoUtil.java
Adds getting RS version out of RPC
Examples: (1.3.4 is 0x0103004, 2.1.0 is 0x0201000)

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Remove periodic metrics chore. This is done over in new AM now.
Replace AM with the new. Host the procedures executor.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterMetaBootstrap.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterMetaBootstrap.java
Have AMv2 handle assigning meta.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
Extract version number of the server making rpc.

A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java
Add new assign procedure. Runs assign via Procedure Dispatch.
There can only be one RegionTransitionProcedure per region running at the time,
since each procedure takes a lock on the region.

D hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignCallable.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/BulkAssigner.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/GeneralBulkAssigner.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/GeneralBulkAssigner.java
Remove these hacky classes that were never supposed to live longer than
a month or so to be replaced with real assigners.

D hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStateStore.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/UnAssignCallable.java

A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
A procedure-based AM (AMv2).

TODO
 - handle region migration
 - handle meta assignment first
 - handle sys table assignment first (e.g. acl, namespace)
 - handle table priorities
  "hbase.assignment.bootstrap.thread.pool.size"; default size is 16.
  "hbase.assignment.dispatch.wait.msec"; default wait is 150
  "hbase.assignment.dispatch.wait.queue.max.size"; wait max default is 100
  "hbase.assignment.rit.chore.interval.msec"; default is 5 * 1000;
  "hbase.assignment.maximum.attempts"; default is 10;

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
 Procedure that runs subprocedure to unassign and then assign to new location

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateStore.java
 Manage store of region state (in hbase:meta by default).

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStates.java
 In-memory state of all regions. Used by AMv2.

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 Base RIT procedure for Assign and Unassign.

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
 Unassign procedure.

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RSProcedureDispatcher.java
 Run region assignement in a manner that pays attention to target server version.
 Adds "hbase.regionserver.rpc.startup.waittime"; defaults 60 seconds.
2017-05-24 20:47:25 -07:00
Guangxu Cheng a8775b11d2 HBASE-18051 balance_rsgroup still runs when the Load Balancer is not enabled
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-05-16 08:28:13 -07:00
Chia-Ping Tsai 053e61541e HBASE-15583 Any HTableDescriptor we give out should be immutable 2017-04-27 03:22:29 +08:00
Andrew Purtell 029fa29712 HBASE-17785 RSGroupBasedLoadBalancer fails to assign new table regions when cloning snapshot 2017-04-05 16:25:56 -07:00
tedyu 4088f822a4 HBASE-17806 TestRSGroups#testMoveServersAndTables is flaky in master branch (Guangxu Cheng) 2017-03-20 09:26:34 -07:00
Andrew Purtell 7f0e6f1c9e HBASE-17758 [RSGROUP] Add shell command to move servers and tables at the same time (Guangxu Cheng) 2017-03-16 18:37:40 -07:00
Michael Stack 4b541d6380 HBASE-17772 IntegrationTestRSGroup won't run 2017-03-10 19:50:53 -08:00
Jan Hentschel b53f354763 HBASE-17532 Replaced explicit type with diamond operator
Signed-off-by: Michael Stack <stack@apache.org>
2017-03-07 11:22:51 -08:00
Apekshit Sharma ce64e7eb6e HBASE-17654 RSGroup refactoring.
Changes contain:
- Making rsGroupInfoManager non-static in RSGroupAdminEndpoint
- Encapsulate RSGroupAdminService into an internal class in RSGroupAdminEndpoint (on need of inheritence).
- Change two internal classes in RSGroupAdminServer to non-static (so outer classes' variables can be shared).
- Rename RSGroupSerDe to RSGroupProtobufUtil('ProtobufUtil' is what we use in other places). Moved 2 functions to RSGroupManagerImpl because they are only used there.
- Javadoc comments
- Improving variable names
- Maybe other misc refactoring

Change-Id: I09f0f5aa413150390c91795b8a8fd5e6cdd6c416
2017-02-25 22:12:03 -08:00
Michael Stack b392de3e31 HBASE-17653 HBASE-17624 rsgroup synchronizations will (distributed) deadlock
This patch restores the regime instituted in original rsgroups patch
(HBASE-6721) where reading of rsgroup state runs unimpeded against
COW immutable Maps whereas mutation to state require exclusive locks
(updating the Maps of state when done). HBASE-17624 was
over-enthusiastic with its locking down of access making it likely
we'd deadlock.

Adds documentation on concurrency expectations.
2017-02-17 14:45:56 -08:00
Michael Stack ae840c0ccd HBASE-17656 Move new Address class from util to net package 2017-02-16 12:17:50 -08:00
Michael Stack e019961150 HBASE-17624 Address late review of HBASE-6721, rsgroups feature
Addresses review comments by Sean Busbey and Appy that happened
to come in long after the commit of HBASE-6721, the original
rsgroup issue.

Also includes subsequent accommodation of Duo Zhang review.

Adds a new type to hold hostname and port. It is called
Address. It is a facade over Guava's HostAndPort. Replace
all instances of HostAndPort with Address. In particular,
those places where HostAndPort was part of the rsgroup
public API.

Fix licenses. Add audience annotations.

Cleanup and note concurrency expectation on a few core classes.
In particular, all access on RSGroupInfoManager is made
synchronized.

M hbase-client/src/main/java/org/apache/hadoop/hbase/ServerName.java
 Host the hostname and port in an instance of the new type Address.
 Add a bunch of deprecation of exotic string parses that should never
 have been public.

M hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupAdmin.java
 Make this an Interface rather than abstract class. Creation was a
 static internal method that only chose one type.... Let it be free
 as a true Interface instead.
2017-02-14 09:29:17 -08:00
Jan Hentschel 55c2e2d484 HBASE-9702 Changed unit tests to use method names for tables 2017-02-13 13:27:55 -08:00
Michael Stack 9ec0ec4922 HBASE-17350 Fixup of regionserver group-based assignment
Renamed move_rsgroup_servers as move_servers_rsgroup
Renamed move_rsgroup_tables as move_tables_rsgroup

Minor changes to help text in rsgroup commands making them all same.
Made LOG from RSGroupAdminServer all talk of 'rsgroup' rather than
'group' to be consistent.

Fix for table.jsp where it would fail to display regions because no
type for the protobuf record specified.

Fix it so that move of an offline server to 'default' rsgroup is like
moving the reference to the server to trash (keeps the 'default' group
consistently 'dynamic' regards its server-list).

Fixed another issue where we were stuck in a loop because regions
were in FAILED_OPEN state because no server to assign too so we'd
never recover (a vagary of the current state of Master assignement
but no less a possibility in real world deploys).

Make it so servers are sorted when we list them; its what operator
would expect.
2017-02-06 13:09:57 -08:00
Jan Hentschel aff8de8397 HBASE-17555 Changed calls to deprecated getHBaseAdmin to getAdmin
Signed-off-by: Michael Stack <stack@apache.org>
2017-01-28 21:41:25 -08:00
Jan Hentschel 55a1aa1e73 HBASE-10699 Set capacity on ArrayList where possible and use isEmpty instead of size() == 0
Signed-off-by: Michael Stack <stack@apache.org>
2017-01-20 22:58:20 -08:00
zhangduo 3aa4dfa73d HBASE-16690 Move znode path configs to a separated class 2016-10-05 20:12:44 +08:00
stack 95c1dc93fb HBASE-15638 Shade protobuf
Which includes

    HBASE-16742 Add chapter for devs on how we do protobufs going forward

    HBASE-16741 Amend the generate protobufs out-of-band build step
    to include shade, pulling in protobuf source and a hook for patching protobuf

    Removed ByteStringer from hbase-protocol-shaded. Use the protobuf-3.1.0
    trick directly instead. Makes stuff cleaner. All under 'shaded' dir is
    now generated.

    HBASE-16567 Upgrade to protobuf-3.1.x
    Regenerate all protos in this module with protoc3.
    Redo ByteStringer to use new pb3.1.0 unsafebytesutil
    instead of HBaseZeroCopyByteString

    HBASE-16264 Figure how to deal with endpoints and shaded pb Shade our protobufs.
    Do it in a manner that makes it so we can still have in our API references to
    com.google.protobuf (and in REST). The c.g.p in API is for Coprocessor Endpoints (CPEP)

            This patch is Tactic #4 from Shading Doc attached to the referenced issue.
            Figuring an appoach took a while because we have Coprocessor Endpoints
            mixed in with the core of HBase that are tough to untangle (FIX).

            Tactic #4 (the fourth attempt at addressing this issue) is COPY all but
            the CPEP .proto files currently in hbase-protocol to a new module named
            hbase-protocol-shaded. Generate .protos again in the new location and
            then relocate/shade the generated files. Let CPEPs keep on with the
            old references at com.google.protobuf.* and
            org.apache.hadoop.hbase.protobuf.* but change the hbase core so all
            instead refer to the relocated files in their new location at
            org.apache.hadoop.hbase.shaded.com.google.protobuf.*.

            Let the new module also shade protobufs themselves and change hbase
            core to pick up this shaded protobuf rather than directly reference
            com.google.protobuf.

            This approach allows us to explicitly refer to either the shaded or
            non-shaded version of a protobuf class in any particular context (though
            usually context dictates one or the other). Core runs on shaded protobuf.
            CPEPs continue to use whatever is on the classpath with
            com.google.protobuf.* which is pb2.5.0 for the near future at least.

            See above cited doc for follow-ons and downsides. In short, IDEs will complain
            about not being able to find the shaded protobufs since shading happens at package
            time; will fix by checking in all generated classes and relocated protobuf in
            a follow-on. Also, CPEPs currently suffer an extra-copy as marshalled from
            non-shaded to shaded. To fix. Finally, our .protos are duplicated; once
            shaded, and once not. Pain, but how else to reveal our protos to CPEPs or
            C++ client that wants to talk with HBase AND shade protobuf.

            Details:

            Add a new hbase-protocol-shaded module. It is a copy of hbase-protocol
    i       with all relocated offset from o.a.h.h. to o.a.h.h.shaded. The new module
            also includes the relocated pb. It does not include CPEPs. They stay in
            their old location.

            Add another module hbase-endpoint which has in it all the endpoints
            that ship as part of hbase -- at least the ones that are not
            entangled with core such as AccessControl and Auth. Move all protos
            for these CPEPs here as well as their unit tests (mostly moving a
            bunch of stuff out of hbase-server module)

            Much of the change looks like this:

                 -import org.apache.hadoop.hbase.protobuf.ProtobufUtil;
                 -import org.apache.hadoop.hbase.protobuf.generated.ClusterIdProtos;
                 +import org.apache.hadoop.hbase.protobuf.shaded.ProtobufUtil;
                 +import org.apache.hadoop.hbase.shaded.protobuf.generated.ClusterIdProtos;

            In HTable and in HBaseAdmin, regularize the way Callables are used and also hide
            protobuf usage as much as possible moving it up into Callable super classes or out
            to utility classes. Still TODO is adding in of retries, etc., but can wait on
            procedure which will redo all this.

            Also in HTable and HBaseAdmin as well as in HRegionServer and Server, be explicit
            when using non-shaded protobuf. Do the full-path so it is clear. This is around
            endpoint coprocessors registration of services and execution of CPEP methods.

            Shrunk ProtobufUtil by moving methods used by one CPEP only back to the CPEP either
            into Client class or as new Util class; e.g. AccessControlUtil.

            There are actually two versions of ProtobufUtil now; a shaded one and a subset
            that is used by CPEPs doing non-shaded work.

            Made it so hbase-common no longer depends on hbase-protocol (with Matteo's help)

            R*Converter classes got moved down under shaded package -- they are for internal
            use only. There are no non-shaded versions of these classes.

            D hbase-client/src/main/java/org/apache/hadoop/hbase/client/AbstractRegionServerCallable
            D RetryingCallableBase
             Not used anymore and we have too many tiers of Callables so removed/cleaned-up.

            A ClientServicecallable
             Had to add this one. RegionServerCallable was made generic so it could be used
             for a few Interfaces (Client and Admin). Then added ClientServiceCallable to
             implement RegionServerCallable with the Client Interface.
2016-10-03 21:37:32 -07:00
Jerry He edc0ef3fe4 HBASE-16598 Enable zookeeper useMulti always and clean up in HBase code 2016-09-17 16:51:26 -07:00
tedyu 552400e536 HBASE-16491 A few org.apache.hadoop.hbase.rsgroup classes missing @InterfaceAudience annotation (Umesh Agashe) 2016-09-12 12:16:26 -07:00
tedyu 3642287b2f HBASE-16491 A few org.apache.hadoop.hbase.rsgroup classes missing @InterfaceAudience annotation - Revert due to missing credit 2016-09-12 12:15:50 -07:00
tedyu 6f072809ee HBASE-16491 A few org.apache.hadoop.hbase.rsgroup classes missing @InterfaceAudience annotation 2016-09-12 12:14:56 -07:00
tedyu 44c9021d67 HBASE-16462 TestRSGroupsBas#testGroupBalance may hang due to uneven region distribution (Guangxu Cheng) 2016-08-24 21:55:37 -07:00
tedyu 741d0a4511 HBASE-16430 Fix RegionServer Group's bug when moving multiple tables (Guangxu Cheng) 2016-08-19 07:44:38 -07:00
tedyu fbd87f91bc Revert commit for HBASE-16430 which didn't have JIRA number 2016-08-19 07:44:14 -07:00
tedyu ad16676f92 Fix RegionServer Group's bug when moving multiple tables (Guangxu Cheng) 2016-08-19 07:42:27 -07:00
Jurriaan Mous cdd532da8a HBASE-15610 Remove deprecated HConnection for 2.0 thus removing all PB references for 2.0
Signed-off-by: stack <stack@apache.org>
2016-05-29 07:50:55 -07:00
Enis Soztutar ca816f0780 HBASE-6721 RegionServer Group based Assignment (Francis Liu) 2016-03-14 18:28:50 -07:00