Commit Graph

72 Commits

Author SHA1 Message Date
Michael Stack a3c5a74487 Revert "HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi)"
Revert a mistaken commit!!!

This reverts commit dc1065a85d.
2017-05-24 23:31:36 -07:00
Michael Stack dc1065a85d HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi)
Move to a new AssignmentManager, one that describes Assignment using
a State Machine built on top of ProcedureV2 facility.

This doc. keeps state on where we are at w/ the new AM:
https://docs.google.com/document/d/1eVKa7FHdeoJ1-9o8yZcOTAQbv0u0bblBlCCzVSIn69g/edit#heading=h.vfdoxqut9lqn
Includes list of tests disabled by this patch with reasons why.

Based on patches from Matteo's repository, then fixed up to get it all to pass cluster
tests: filling in some missing functionality, fixing findbugs complaints, fixing bugs, etc.,
including:

1. HBASE-14616 Procedure v2 - Replace the old AM with the new AM.
The basis comes from Matteo's repo here:
689227fcbf

Patch replaces old AM with the new under subpackage master.assignment.
Mostly just updating classes to use new AM -- import changes -- rather
than the old. It also removes old AM and supporting classes.
See below for more detail.

2. HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi)
3622cba4e3

Adds running of remote procedure. Adds batching of remote calls.
Adds support for assign/unassign in procedures. Adds version info
reporting in rpc. Adds start of an AMv2.

3. Reporting of remote RS version is from here:
ddb4df3964.patch

4. And remote dispatch of procedures is from:
186b9e7c4d

5. The split merge patches from here are also melded in:
9a3a95a2c2
and d6289307a0

We add testing util for new AM and new sets of tests.

Does a bunch of fixup on logging so it's possible to follow a procedure's narrative by grepping
for its procedure id. We also spewed loads of log on big transitions such as master failure; fixed.

Fix CatalogTracker. Make it use Procedures doing clean up of Region data on split/merge.
Without these changes, ITBLL was failing at larger scale (3-4hours 5B rows) because we were
splitting split Regions among other things (CJ would run but wasn't
taking lock on Regions so havoc).

Added a bunch of doc. on Procedure primitives.

Added new region-based state machine base class. Moved region-based
state machines on to it.

Found bugs in the way procedure locking was done in a few of the
region-based Procedures. Having them all share the same base class helps here.

Added isSplittable and isMergeable to the Region Interface.

Master would split/merge even though the Regions still had
references. Fixed it so Master asks RegionServer if Region
is splittable.

Messing more w/ logging. Made all procedures log the same and report
the state the same; helps when logging is regular.

Rewrote TestCatalogTracker. Enabled TestMergeTableRegionProcedure.

Added more functionality to MockMasterServices so can use it doing
standalone testing of Procedures (made TestCatalogTracker use it
instead of its own version).

Add to MasterServices ability to wait on Master being up -- makes
it so can Mock Master and start to implement standalone split testing.
Start in on a Split region standalone test in TestAM.

Fix bug where a Split can fail because it comes in in the middle of
a Move (by holding lock for duration of a Move).

Breaks CPs that were watching merge/split. These are run by Master now
so you need to observe on Master, not on RegionServer.

Details:

M hbase-client/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
Takes List of regionstates on construction rather than a Set.
NOTE!!!!! This is a change in a public class.

M hbase-client/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
Add utility getShortNameToLog

M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ShortCircuitMasterConnection.java
Add support for dispatching assign, split and merge processes.

M hbase-client/src/main/java/org/apache/hadoop/hbase/master/RegionState.java
Purge old overlapping states: PENDING_OPEN, PENDING_CLOSE, etc.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
Lots of doc on its inner workings. Bug fixes.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
Log and doc on workings. Bug fixes.

A hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/RemoteProcedureDispatcher.java
Dispatch remote procedures every 150ms or 32 items -- whichever
happens first (configurable). Runs a timeout thread. This facility is
not on yet; will come in as part of a later fix. Currently works a
region at a time. This class carries notion of a remote procedure and of a buffer full of these.
"hbase.procedure.remote.dispatcher.threadpool.size" with default = 128
"hbase.procedure.remote.dispatcher.delay.msec" with default = 150ms
"hbase.procedure.remote.dispatcher.max.queue.size" with default = 32
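
The size-or-delay batching described above can be sketched roughly as follows. This is a minimal illustration, not the RemoteProcedureDispatcher code: the class BatchBufferSketch and its methods are hypothetical, there is no thread pool or timeout thread, and the caller passes the current time in explicitly so the flush rule is easy to follow.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: buffers items and releases them as one batch when
// either the queue size limit or the delay limit is reached, whichever is first.
final class BatchBufferSketch<T> {
    private final long delayMs;       // e.g. 150, as in the delay.msec default
    private final int maxQueueSize;   // e.g. 32, as in the max.queue.size default
    private final List<T> pending = new ArrayList<>();
    private long firstAddedAtMs;      // arrival time of the oldest pending item

    BatchBufferSketch(long delayMs, int maxQueueSize) {
        this.delayMs = delayMs;
        this.maxQueueSize = maxQueueSize;
    }

    // Adds an item. Returns the full batch if the size limit was reached,
    // otherwise an empty list.
    List<T> add(T item, long nowMs) {
        if (pending.isEmpty()) {
            firstAddedAtMs = nowMs;
        }
        pending.add(item);
        return pending.size() >= maxQueueSize ? drain() : Collections.<T>emptyList();
    }

    // Meant to be called periodically. Returns the batch if the oldest
    // pending item has waited at least delayMs, otherwise an empty list.
    List<T> poll(long nowMs) {
        boolean due = !pending.isEmpty() && nowMs - firstAddedAtMs >= delayMs;
        return due ? drain() : Collections.<T>emptyList();
    }

    // Copies out the pending items and resets the buffer.
    private List<T> drain() {
        List<T> batch = new ArrayList<>(pending);
        pending.clear();
        return batch;
    }
}
```

With a delay of 150 and a size limit of 32, a lone item added at time 0 would be released by the first poll at or after time 150, while a burst of 32 items would be released by the 32nd add without waiting.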

M hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/ProtobufUtil.java
Add in support for merge. Remove no-longer used methods.

M hbase-protocol-shaded/src/main/protobuf/Admin.proto
Add execute procedures call ExecuteProcedures.

M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
Add assign and unassign state support for procedures.

M hbase-server/src/main/java/org/apache/hadoop/hbase/client/VersionInfoUtil.java
Adds getting RS version out of RPC
Examples: (1.3.4 is 0x0103004, 2.1.0 is 0x0201000)
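
The two examples above are consistent with packing the version as major, minor and patch fields of 8, 8 and 12 bits. The sketch below assumes that layout; it is inferred from the examples only, and VersionNumberSketch and its methods are hypothetical names, not the VersionInfoUtil API.

```java
// Hypothetical sketch: packs major.minor.patch into a single int,
// assuming the layout implied by 1.3.4 -> 0x0103004 and 2.1.0 -> 0x0201000.
final class VersionNumberSketch {
    // major in bits 20-27, minor in bits 12-19, patch in bits 0-11 (assumed).
    static int encode(int major, int minor, int patch) {
        return (major << 20) | (minor << 12) | patch;
    }

    // Parses a plain "major.minor.patch" string; suffixes are not handled.
    static int encode(String version) {
        String[] parts = version.split("\\.");
        return encode(Integer.parseInt(parts[0]),
                      Integer.parseInt(parts[1]),
                      Integer.parseInt(parts[2]));
    }

    static int major(int v) { return (v >>> 20) & 0xff; }
    static int minor(int v) { return (v >>> 12) & 0xff; }
    static int patch(int v) { return v & 0xfff; }

    public static void main(String[] args) {
        // prints 0x0103004
        System.out.println(String.format("0x%07x", encode("1.3.4")));
    }
}
```

A packed integer of this kind lets versions be compared with ordinary integer comparison, which is convenient when deciding what a remote server of a given version supports.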

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Remove periodic metrics chore. This is done over in new AM now.
Replace AM with the new. Host the procedures executor.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterMetaBootstrap.java
Have AMv2 handle assigning meta.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
Extract version number of the server making rpc.

A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java
Add new assign procedure. Runs assign via Procedure Dispatch.
There can only be one RegionTransitionProcedure per region running at a time,
since each procedure takes a lock on the region.

D hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignCallable.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/BulkAssigner.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/GeneralBulkAssigner.java
Remove these hacky classes that were never supposed to live longer than
a month or so to be replaced with real assigners.

D hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStateStore.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/UnAssignCallable.java

A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
A procedure-based AM (AMv2).

TODO
 - handle region migration
 - handle meta assignment first
 - handle sys table assignment first (e.g. acl, namespace)
 - handle table priorities
  "hbase.assignment.bootstrap.thread.pool.size"; default size is 16.
  "hbase.assignment.dispatch.wait.msec"; default wait is 150
  "hbase.assignment.dispatch.wait.queue.max.size"; wait max default is 100
  "hbase.assignment.rit.chore.interval.msec"; default is 5 * 1000;
  "hbase.assignment.maximum.attempts"; default is 10;

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
 Procedure that runs subprocedure to unassign and then assign to new location

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateStore.java
 Manage store of region state (in hbase:meta by default).

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStates.java
 In-memory state of all regions. Used by AMv2.

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 Base RIT procedure for Assign and Unassign.

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
 Unassign procedure.

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RSProcedureDispatcher.java
 Run region assignment in a manner that pays attention to target server version.
 Adds "hbase.regionserver.rpc.startup.waittime"; defaults 60 seconds.
2017-05-24 20:47:25 -07:00
Josh Elser ed618da906 HBASE-17981 Consolidate the space quota shell commands 2017-05-22 13:41:36 -04:00
Josh Elser d671a1dbc6 HBASE-17955 Various reviewboard improvements to space quota work
Most notable change is to cache SpaceViolationPolicyEnforcement objects
in the write path. When a table has no quota or there is no SpaceQuotaSnapshot
for that table (yet), we want to avoid creating lots of
SpaceViolationPolicyEnforcement instances, caching one instance
instead. This will help reduce GC pressure.
2017-05-22 13:41:36 -04:00
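
A minimal sketch of the caching idea in HBASE-17955 above: tables without a quota or without a snapshot all share one stateless enforcement object instead of each getting a fresh one. The names here (EnforcementCacheSketch, Enforcement, NO_QUOTA) are hypothetical and do not reflect the actual SpaceViolationPolicyEnforcement interface.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: hands out a single shared no-op enforcement for
// tables that have no active policy, avoiding one allocation per lookup.
final class EnforcementCacheSketch {
    // Stand-in for a policy enforcement; the real interface is richer.
    interface Enforcement {
        boolean allows(long mutationBytes);
    }

    // One stateless instance shared by every table without a quota or snapshot.
    private static final Enforcement NO_QUOTA = bytes -> true;

    // Tables that do have an active policy, keyed by table name.
    private final Map<String, Enforcement> active = new HashMap<>();

    void setPolicy(String table, Enforcement enforcement) {
        active.put(table, enforcement);
    }

    // Returns the table's policy, or the shared no-op instance if none is set.
    Enforcement get(String table) {
        return active.getOrDefault(table, NO_QUOTA);
    }
}
```

Because the fallback instance holds no per-table state, sharing it is safe, and the write path creates no garbage for the common case of a table with no quota.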
Josh Elser a8460b8bad HBASE-17794 Swap "violation" for "snapshot" where appropriate
A couple of variables and comments incorrectly use "violation"
to describe what the code is doing. This was a holdover from the early
implementation -- need to scrub these out for clarity.
2017-05-22 13:41:35 -04:00
Josh Elser 13af7f8ac6 HBASE-17002 JMX metrics and some UI additions for space quotas 2017-05-22 13:41:35 -04:00
Josh Elser 80a1f8fa2a HBASE-17428 Implement informational RPCs for space quotas
Create some RPCs that can expose the in-memory state that the
RegionServers and Master hold to drive the space quota "state machine".
Then, create some hbase shell commands to interact with those.
2017-05-22 13:41:35 -04:00
Josh Elser 6c9082fe16 HBASE-17259 API to remove space quotas on a table/namespace 2017-05-22 13:41:35 -04:00
Josh Elser 34ba143fc8 HBASE-17001 Enforce quota violation policies in the RegionServer
The nuts-and-bolts of filesystem quotas. The Master must inform
RegionServers of the violation of a quota by a table. The RegionServer
must apply the violation policy as configured. Need to ensure
that the proper interfaces exist to satisfy all necessary policies.

This required a massive rewrite of the internal tracking by
the general space quota feature. Instead of tracking "violations",
we need to start tracking "usage". This allows us to make the decision
at the RegionServer level as to when the files in a bulk load request
should be accepted or rejected, which ultimately lets us avoid bulk loads
dramatically exceeding a configured space quota.
2017-05-22 13:41:35 -04:00
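
Tracking usage rather than violations, as HBASE-17001 above describes, makes an up-front bulk load decision possible: the sizes of the incoming files can be added to the current usage and compared with the limit before anything is loaded. The sketch below illustrates that check only; BulkLoadCheckSketch and acceptBulkLoad are hypothetical names, and the real enforcement logic may differ.

```java
// Hypothetical sketch: decides whether a bulk load fits under a space quota
// by comparing current usage plus the incoming file sizes with the limit.
final class BulkLoadCheckSketch {
    // Returns true if loading every file would keep usage within limitBytes.
    // Math.addExact throws ArithmeticException if the running total overflows.
    static boolean acceptBulkLoad(long currentUsageBytes, long limitBytes, long[] fileSizes) {
        long total = currentUsageBytes;
        for (long size : fileSizes) {
            total = Math.addExact(total, size);
        }
        return total <= limitBytes;
    }
}
```

With only a "in violation or not" flag, a table just under its limit would accept an arbitrarily large bulk load; with usage available, the request can be rejected before it pushes the table far past the quota.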
Josh Elser 6b334cd817 HBASE-17000 Implement computation of online region sizes and report to the Master
Includes a trivial implementation of the Master-side collection to
avoid. Only enough to write a test to verify RS collection.
2017-05-22 13:41:35 -04:00
tedyu 140413c11b HBASE-16995 Build client Java API and client protobuf messages - addendum fixes white spaces (Josh Elser) 2017-05-22 13:41:35 -04:00
tedyu 4dfafd6e50 HBASE-16995 Build client Java API and client protobuf messages (Josh Elser) 2017-05-22 13:41:35 -04:00
huzheng 37dd8ff722 HBASE-11013: Clone Snapshots on Secure Cluster Should provide option to apply Retained User Permissions
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2017-05-18 17:39:50 +08:00
tedyu 5e046151d6 HBASE-11013: Clone Snapshots on Secure Cluster Should provide option to apply Retained User Permissions - revert, pending work in snapshot descriptor 2017-05-11 18:53:14 -07:00
tedyu b3dcfb659e HBASE-17928 Shell tool to clear compaction queues (Guangxu Cheng) 2017-05-11 18:47:12 -07:00
tedyu d8d4ba7c59 HBASE-17928 Shell tool to clear compaction queues - revert pending work in snapshot descriptor 2017-05-11 18:43:59 -07:00
tedyu 815b0f853b HBASE-17928 Shell tool to clear compaction queues (Guangxu Cheng) 2017-05-09 18:32:38 -07:00
huzheng 951b23a44c HBASE-11013: Clone Snapshots on Secure Cluster Should provide option to apply Retained User Permissions
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-05-09 09:32:48 -07:00
Balazs Meszaros 2557506415 HBASE-15143 Procedure v2 - Web UI displaying queues
Signed-off-by: Michael Stack <stack@apache.org>
2017-04-25 09:39:28 -07:00
Umesh Agashe c8461456d0 HBASE-17888: Added generic methods for updating metrics on submit and finish of a procedure execution
Signed-off-by: Michael Stack <stack@apache.org>
2017-04-14 11:51:08 -07:00
Umesh Agashe 9109803891 HBASE-17863: Procedure V2: Some cleanup around Procedure.isFinished() and procedure executor
Signed-off-by: Michael Stack <stack@apache.org>
2017-04-06 12:05:23 -07:00
Michael Stack e916b79db5 HBASE-16780 Since move to protobuf3.1, Cells are limited to 64MB where previous they had no limit Update internal pb to 3.2 from 3.1.; AMENDMENT -- FORGOT TO REBUILD PBs 2017-04-03 15:26:11 -07:00
Michael Stack 7700a7fac1 HBASE-16780 Since move to protobuf3.1, Cells are limited to 64MB where previous they had no limit Update internal pb to 3.2 from 3.1. 2017-03-31 12:44:59 -07:00
tedyu 75d0f49dcd HBASE-14123 HBase Backup/Restore Phase 2 (Vladimir Rodionov) 2017-03-18 03:04:19 -07:00
Jan Hentschel b53f354763 HBASE-17532 Replaced explicit type with diamond operator
Signed-off-by: Michael Stack <stack@apache.org>
2017-03-07 11:22:51 -08:00
zhangduo 712fe69e4d HBASE-17599 Use mayHaveMoreCellsInRow instead of isPartial 2017-02-09 15:38:02 +08:00
Ajay Jadhav f8b1f57b05 HBASE-17280 Add mechanism to control hbase cleaner behavior
Signed-off-by: anoopsamjohn <anoopsamjohn@gmail.com>
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-02-01 22:16:01 -06:00
Sean Busbey 2c799fb70a Revert "Add mechanism to control hbase cleaner behavior"
This reverts commit ef052521cd.

Bad commit message.
2017-02-01 22:11:48 -06:00
Ajay Jadhav ef052521cd Add mechanism to control hbase cleaner behavior
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-02-01 19:33:06 -08:00
zhangduo 85d701892e HBASE-17045 Unify the implementation of small scan and regular scan 2017-01-25 09:53:06 +08:00
Michael Stack 4fdd6ff9ae HBASE-16831 Procedure V2 - Remove org.apache.hadoop.hbase.zookeeper.lock
(Appy)
2017-01-19 16:51:44 -08:00
Stephen Yuan Jiang 805d39fca6 HBASE-17470 Remove merge region code from region server (Stephen Yuan Jiang) 2017-01-17 15:39:51 -08:00
Michael Stack 4cb09a494c HBASE-16744 Procedure V2 - Lock procedures to allow clients to acquire
locks on tables/namespaces/regions (Matteo Bertozzi)

Incorporates review comments from
    https://reviews.apache.org/r/52589/
    https://reviews.apache.org/r/54388/

M hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncTableBase.java
 Fix for eclipse complaint (from Duo Zhang)

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/store/wal/WALProcedureStore.java
 Log formatting

M hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/ProcedureTestingUtility.java
 Added wait procedures utility.

A hbase-protocol-shaded/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/generated/LockServiceProtos.java
A hbase-protocol-shaded/src/main/protobuf/LockService.proto
 Implement new locking CP overrides.

A hbase-server/src/main/java/org/apache/hadoop/hbase/client/locking/EntityLock.java
 New hbase entity lock (ns, table, or regions)

A hbase-server/src/main/java/org/apache/hadoop/hbase/client/locking/LockServiceClient.java
 Client that can use the new internal locking service.
2017-01-13 21:07:03 -08:00
Jan Hentschel 7794c530bd HBASE-17416 Changed size() == 0 to isEmpty in hbase-protocol-shaded
Signed-off-by: Michael Stack <stack@apache.org>
2017-01-13 08:38:58 -08:00
Guanghao Zhang ac3b1c9aa9 HBASE-17337 list replication peers request should be routed through master 2017-01-10 08:57:26 +08:00
Guanghao Zhang e02ae7724d HBASE-17388 Move ReplicationPeer and other replication related PB messages to the replication.proto 2017-01-06 10:01:22 +08:00
Guanghao Zhang 0e48665641 HBASE-17336 get/update replication peer config requests should be routed through master 2016-12-30 10:12:47 +08:00
zhangduo 05b1d918b0 HBASE-17320 Add inclusive/exclusive support for startRow and endRow of scan 2016-12-29 09:43:31 +08:00
Guanghao Zhang 8da7366fc2 HBASE-17348 Remove the unused hbase.replication from javadoc/comment/book completely 2016-12-25 08:46:29 +08:00
Jerry He 992e5717d4 HBASE-16010 Put draining function through Admin API (Matt Warhaftig) 2016-12-23 13:41:36 -08:00
Guanghao Zhang b3f2bec099 HBASE-17335 enable/disable replication peer requests should be routed through master 2016-12-23 09:27:12 +08:00
Guanghao Zhang e1f4aaeacd HBASE-11392 add/remove peer requests should be routed through master 2016-12-21 13:27:13 +08:00
tedyu 2333596279 HBASE-17296 Provide per peer throttling for replication (Guanghao Zhang) 2016-12-13 04:20:20 -08:00
Michael Stack 1f8d8bfa8b HBASE-17239 Add UnsafeByteOperations#wrap(ByteInput, int offset, int len) API
Addendum to make pb compile work again.
2016-12-05 11:56:19 -08:00
Ramkrishna 94302a3d26 HBASE-17239 Add UnsafeByteOperations#wrap(ByteInput, int offset, int len)
API (Ram)
2016-12-05 15:13:04 +05:30
Stephen Yuan Jiang 0a24077841 HBASE-16119 Procedure v2 - Reimplement Merge region (Stephen Yuan Jiang) 2016-12-01 22:41:15 -08:00
zhangduo 890fcbd0e6 HBASE-17167 Pass mvcc to client when scan 2016-11-30 10:11:04 +08:00
thiruvel 80acc2dca5 HBASE-16169: Make RegionSizeCalculator scalable
Signed-off-by: Michael Stack <stack@apache.org>
2016-11-16 23:07:14 -08:00
Michael Stack 48439e5720 HBASE-17082 ForeignExceptionUtil isnt packaged when building shaded protocol with -Pcompile-protobuf; Attempted Fix; Add clarification to README. 2016-11-16 12:26:18 -08:00
Michael Stack 0f7a7f4751 Revert "HBASE-17082 ForeignExceptionUtil isnt packaged when building shaded protocol with -Pcompile-protobuf; Attempted Fix"
This reverts commit 8847a70902.

We committed two 'attempted fixes'. This is a revert of the first
attempt. It did not work. Sorry for confusion. I used the same
commit message so it could be awkward unraveling.
2016-11-15 20:27:32 -08:00