Commit Graph

6017 Commits

Author SHA1 Message Date
Yu Li 679f34e881 HBASE-18469 Correct RegionServer metric of totalRequestCount 2017-08-11 14:13:18 +08:00
Guanghao Zhang 4dd24c52b8 HBASE-17125 Inconsistent result when use filter to read data 2017-08-11 10:58:00 +08:00
Esteban Gutierrez efd211debd HBASE-18024 HRegion#initializeRegionInternals should not re-create .hregioninfo file when the region directory no longer exists 2017-08-10 17:56:17 -05:00
Michael Stack e4ba404a5a Revert "HBASE-18551 [AMv2] UnassignProcedure and crashed regionservers"
This reverts commit 2dd75d10f8.
2017-08-10 14:59:52 -07:00
Umesh Agashe e98b38bf6c HBASE-18560 Fixed master.assignment.TestAssignmentManager hangs on master and it shows up in flaky list 2017-08-10 14:58:52 -07:00
Michael Stack 2dd75d10f8 HBASE-18551 [AMv2] UnassignProcedure and crashed regionservers
If an unassign is unable to communicate with its target server,
expire the server and then wait on a signal from ServerCrashProcedure
before proceeding. The unassign has lock on the region so no one else
can proceed till we complete. We prevent any subsequent assign from
running until logs have been split for crashed server.

In AssignProcedure, do not assign if table is DISABLING or DISABLED.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 Change remoteCallFailed so it returns boolean on whether implementor
wants to stay suspended or not.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
  Doc. Also, if we are unable to talk to remote server, expire it and
then wait on SCP to wake us up after it has processed logs for failed
server.
2017-08-10 14:53:35 -07:00
Ashu Pachauri ded0842caf HBASE-18398: Snapshot operation fails with FileNotFoundException 2017-08-10 14:20:08 -07:00
Umesh Agashe d5f34adcdb HBASE-18543 Disabled test TestMasterFailover
This test as it is written currently will not work with AMv2. It needs to be re-written after HBASE-18511 is committed. Disabled the test and updated the JIRA to re-enable it with a dependency on HBASE-18511.

Signed-off-by: Michael Stack <stack@apache.org>
2017-08-10 11:01:27 -07:00
zhangduo 624652373e HBASE-18489 Expose scan cursor in RawScanResultConsumer 2017-08-10 10:11:40 +08:00
Andrew Purtell d0941127d4 HBASE-18248 Warn if monitored RPC task has been tied up beyond a configurable threshold 2017-08-09 18:16:38 -07:00
Umesh Agashe 67eddf5874 HBASE-18525 [AMv2] Fixed test TestAssignmentManager#testSocketTimeout on master branch 2017-08-09 10:15:37 -07:00
Umesh Agashe f314b5911b HBASE-18492 [AMv2] Embed code for selecting highest versioned region server for system table regions in AssignmentManager.processAssignQueue()
* Modified AssignmentManager.processAssignQueue() method to consider only highest versioned region servers for system table regions when
  destination server is not specified for them. Destination server is retained, if specified.
* Modified MoveRegionProcedure to allow null value for destination server i.e. moving a region from specific source server to non-specific/ unknown
  destination server (picked by load-balancer) is supported now.
* Removed destination server selection from HMaster.checkIfShouldMoveSystemRegionAsync(), as destination server will be picked by load balancer

Signed-off-by: Michael Stack <stack@apache.org>
2017-08-08 14:02:11 -07:00
Michael Stack 03390684cc Revert "HBASE-18511 Default no regions on master"
This reverts commit a8e0267c00.
2017-08-08 13:37:56 +08:00
Michael Stack a8e0267c00 HBASE-18511 Default no regions on master 2017-08-08 12:11:02 +08:00
Chia-Ping Tsai fd76eb39d7 HBASE-18502 Change MasterObserver to use TableDescriptor and ColumnFamilyDescriptor 2017-08-07 11:26:15 +08:00
no_apologies a7014ce46c HBASE-18515 Introduce Delete.add as a replacement for Delete#addDeleteMarker
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2017-08-07 11:05:23 +08:00
Guanghao Zhang 5915d73a70 HBASE-18485 Performance issue: ClientAsyncPrefetchScanner is slower than ClientSimpleScanner 2017-08-07 10:35:19 +08:00
Zach York 637f7abf0b HBASE-18520 Add jmx value to determine true Master Start time
This is to determine how long it took in total for the master to start and finish initializing.

Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-08-05 22:32:33 -07:00
Umesh Agashe 62deb8172e HBASE-18516 Removed dead code in ServerManager resulted mostly from AMv2 refactoring
* Call to methods sendRegionOpen(), isServerReachable(), removeRequeuedDeadServers(), getRequeuedDeadServers() got removed in HBASE-14614
* Call to method ServerManager.sendFavoredNodes() got removed in HBASE-17198
2017-08-04 13:47:59 -07:00
Apekshit Sharma de696cf6b6 HBASE-18231 Deprecate Admin#closeRegion*() commands in favor of Admin#unassign().
Other changes:
- Update corresponding tests in TestAdmin2. Removed tests centered around serverName part of old functions.
- Remove dead functions from ProtobufUtil and ServerManager
- Rename closeRegion* functions in HBTU to unassignRegion*

Change-Id: Ib9bdeb185e10750daf652be0bb328306accb73ab
2017-08-02 15:19:06 -07:00
Michael Stack 7a6de1bd42 HBASE-17056 Remove checked in PB generated files
Selective add of dependency on hbase-thirdparty jars.
Update to READMEs on how protobuf is done (and update to refguide).
Removed all checked in generated protobuf files. They are generated
on the fly now as part of mainline build.
2017-08-02 09:33:20 -07:00
Chia-Ping Tsai f260f09865 HBASE-18480 The cost of BaseLoadBalancer.cluster is changed even if the rollback is done 2017-08-02 08:48:01 +08:00
Umesh Agashe ba5e8706de HBASE-18491 [AMv2] Fail UnassignProcedure if source Region Server is not online.
The patch also enables TestServerCrashProcedure.testRecoveryAndDoubleExecutionOnRsWithMeta()

Signed-off-by: Michael Stack <stack@apache.org>
2017-08-01 17:05:00 -07:00
James Taylor 422a57223a HBASE-18487 Minor fixes in row lock implementation
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2017-08-01 11:28:02 -07:00
Umesh Agashe a5db120e60 HBASE-18261 Created RecoverMetaProcedure and used it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization().
This procedure can be used from any code before accessing meta, to initialize/ recover meta

Signed-off-by: Michael Stack <stack@apache.org>
2017-07-31 14:25:03 -07:00
Sean Busbey 331a6cface HBASE-18475 ensure only non-null procedures are sent to requireTableExclusiveLock
Signed-off-by: Umesh Agashe <uagashe@cloudera.com>
Signed-off-by: Michael Stack <stack@apache.org>
2017-07-31 11:05:16 -05:00
Abhishek Singh Chouhan 95808b4672 HBASE-18374 RegionServer Metrics improvements 2017-07-31 12:42:41 +05:30
Guanghao Zhang df90ba58db HBASE-18481 The autoFlush flag was not used in PE tool 2017-07-31 10:54:45 +08:00
Yi Liang 00c1b56665 HBASE-18465: [AMv2] remove old split region code that is no longer needed
Signed-off-by: Michael Stack <stack@apache.org>
2017-07-30 15:24:58 -05:00
Alex Leblang 0e9390bd6d HBASE-18406 Remove NO-OP Method
This patch removes start(MasterProcedureEnv) from
ServerCrashProcedure.java which was a misnomer as a no-op. It
did not start anything.

Change-Id: I4e91864ace912e137471bfce03516746c4aff83e

Signed-off-by: Michael Stack <stack@apache.org>
2017-07-30 13:59:53 +01:00
Abhishek Singh Chouhan 2d06a06ba4 HBASE-15134 Add visibility into Flush and Compaction queues 2017-07-28 12:59:09 +05:30
Esteban Gutierrez 9a1661832d HBASE-18362 hbck should not report split replica parent region from meta as errors (Huaxiang Sun)
Signed-off-by: Esteban Gutierrez <esteban@apache.org>
2017-07-27 15:58:16 -05:00
Chia-Ping Tsai 3d81f7b9e7 HBASE-18449 Fix client.locking.TestEntityLocks 2017-07-26 20:26:24 +08:00
Malcolm Taylor 421029d0c7 HBASE-18434 Address some alerts raised by lgtm.com
Signed-off-by: Ramkrishna <ramkrishna.s.vasudevan@intel.com>
2017-07-26 10:00:23 +05:30
Andrew Purtell 2fd8e824d5 HBASE-18054 log when we add/remove failed servers in client (Ali) 2017-07-25 18:53:09 -07:00
Umesh Agashe 746d1b1819 HBASE-18427 minor cleanup around AssignmentManager
- unused imports
- superfluous exception in method definitions

Change-Id: I156383b9895fa718fe9d5227003c23bd945cf999
Signed-off-by: Apekshit Sharma <appy@apache.org>
2017-07-25 17:46:04 -07:00
Josh Elser 5cd7f630c2 HBASE-18023 Update row threshold warning from 1k to 5k (addendum) 2017-07-25 18:27:53 -04:00
Phil Yang e1cd59bbc4 HBASE-15968 (addendum) revert unrelated PE changing 2017-07-25 15:16:02 +08:00
Phil Yang 1ac4152b19 HBASE-15968 New behavior of versions considering mvcc and ts rather than ts only 2017-07-25 15:00:36 +08:00
Stephen Yuan Jiang fabab8c23f HBASE-18354 Fix TestMasterMetrics that were disabled by Proc-V2 AM in HBASE-14614 (Vladimir Rodionov) 2017-07-24 14:52:04 -07:00
Balazs Meszaros 8f006582e3 HBASE-18367 Reduce ProcedureInfo usage
Signed-off-by: Michael Stack <stack@apache.org>
2017-07-24 10:41:03 +01:00
Yi Liang e9d8a7b6d5 HBASE-18107: [AMv2] Remove DispatchMergingRegionsRequest & DispatchMergingRegions
Signed-off-by: Michael Stack <stack@apache.org>
2017-07-23 10:44:34 +01:00
Mike Drob 317ce73963 HBASE-18433 Convenience method for creating simple ColumnFamilyDescriptor
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2017-07-22 23:42:33 +08:00
rgidwani ec3cb19664 HBASE-15816 Provide client with ability to set priority on Operations
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2017-07-21 17:12:16 -07:00
Michael Stack 890d92a90c HBASE-17908 Upgrade guava
Pull in guava 22.0 by using the shaded version up in new hbase-thirdparty project.

In poms, exclude guava everywhere except on hadoop-common. Do this so
we minimize transitive includes. hadoop-common is needed because hadoop
Configuration uses guava doing preconditions.

Everywhere we used guava, instead use shaded so fix a load of imports.

Stopwatch API changed as did hashing and toStringHelper which is now
in MoreObjects class. Otherwise, minimal changes to come up on 22.0
2017-07-21 15:28:08 +01:00
anoopsamjohn bc93b6610b HBASE-16993 BucketCache throw java.io.IOException: Invalid HFile block magic when configuring hbase.bucketcache.bucket.sizes. 2017-07-20 22:59:06 +05:30
Andrew Purtell 01db60d65b HBASE-18330 NPE in ReplicationZKLockCleanerChore 2017-07-19 15:46:08 -07:00
Chia-Ping Tsai 3574757f74 HBASE-18308 Eliminate the findbugs warnings for hbase-server 2017-07-20 00:35:07 +08:00
Peter Somogyi f10f8198af HBASE-16312 update jquery version
Upgrade jquery from 1.8.3 to 3.2.1 in hbase-server and hbase-thrift modules

Change-Id: I92d479e9802d954f607ba409077bc98581e9e5ca

Signed-off-by: Michael Stack <stack@apache.org>
2017-07-19 11:44:31 +01:00
Phil Yang 6b7ebc019c HBASE-18390 Sleep too long when finding region location failed 2017-07-19 11:34:57 +08:00
tedyu 0c2915b48e HBASE-18377 Error handling for FileNotFoundException should consider RemoteException in openReader() 2017-07-17 20:24:29 -07:00
Michael Stack a9352fe956 HBASE-18366 Fix flaky test TestServerCrashProcedure#testRecoveryAndDoubleExecutionOnRsWithMeta (Umesh Agashe)
Disabled for now. Will be back here when there is a more fundamental fix.
2017-07-14 22:41:36 +01:00
Yi Liang 353627b39d HBASE-18229: create new Async Split API to embrace AM v2
Signed-off-by: Michael Stack <stack@apache.org>
2017-07-14 22:25:14 +01:00
Mike Drob 9e0f450c0c HBASE-17922 Clean TestRegionServerHostname for hadoop3.
Change-Id: I6f1514b1bc301be553912539e6a4192c2ccc782b
Signed-off-by: Apekshit Sharma <appy@apache.org>
2017-07-13 11:44:18 -07:00
Jan Hentschel c0725ddff1 HBASE-18344 Introduce Append.addColumn as a replacement for Append.add
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2017-07-13 20:04:57 +08:00
Mike Drob cb5299ae9b HBASE-18177 FanOutOneBlockAsyncDFSOutputHelper fails to compile against Hadoop 3
Because ClientProtocol::create has API changes between Hadoop 2/3

Signed-off-by: zhangduo <zhangduo@apache.org>
2017-07-12 13:40:05 +08:00
Guanghao Zhang 22dce22e06 HBASE-18343 (addendum) Track the remaining unimplemented methods for async admin 2017-07-12 09:32:00 +08:00
tedyu c0f743e44f HBASE-18358 Backport HBASE-18099 'FlushSnapshotSubprocedure should wait for concurrent Region#flush() to finish' 2017-07-11 17:26:22 -07:00
Chia-Ping Tsai d215cb4950 HBASE-18295 The result contains the cells across different rows 2017-07-12 02:27:29 +08:00
Guanghao Zhang 1978b78cdf HBASE-18343 Track the remaining unimplemented methods for async admin 2017-07-11 14:01:56 +08:00
zhangduo f8e892d7aa HBASE-18348 The implementation of AsyncTableRegionLocator does not follow the javadoc 2017-07-11 11:43:35 +08:00
Chia-Ping Tsai 43492d2d3b HBASE-18267 The result from the postAppend is ignored 2017-07-11 10:30:06 +08:00
tedyu 7d007eac98 HBASE-17705 Procedure execution must fail fast if procedure is not registered (Vladimir Rodionov) 2017-07-10 09:04:56 -07:00
zhangduo 351703455a HBASE-18307 Share the same EventLoopGroup for NettyRpcServer, NettyRpcClient and AsyncFSWALProvider at RS side 2017-07-10 21:00:44 +08:00
Guanghao Zhang 1ddcc07d65 HBASE-18318 Implement updateConfiguration/stopMaster/stopRegionServer/shutdown methods 2017-07-10 13:41:19 +08:00
Guanghao Zhang c48bb67123 HBASE-18316 Implement async admin operations for draining region servers 2017-07-09 19:51:59 +08:00
Chia-Ping Tsai bc8ebc6f72 HBASE-18241 Change client.Table, client.Admin, Region, Store, and HBaseTestingUtility to not use HTableDescriptor or HColumnDescriptor 2017-07-08 16:54:25 +08:00
Guanghao Zhang 7f93729782 HBASE-18317 Implement async admin operations for Normalizer/CleanerChore/CatalogJanitor 2017-07-08 10:55:10 +08:00
Guanghao Zhang b0a5fa0c2a HBASE-18319 Implement getClusterStatus/getRegionLoad/getCompactionState/getLastMajorCompactionTimestamp methods 2017-07-07 16:21:45 +08:00
Yu Li 4fe7385767 HBASE-18083 Make large/small file clean thread number configurable in HFileCleaner 2017-07-07 14:07:23 +08:00
Michael Stack 6786b2b63e Revert "HBASE-17056 Remove checked in PB generated files Selective add of dependency on"
Revert for now. Build unstable and some interesting issues around
CLASSPATH

This reverts commit df93c13fd2.
2017-07-06 21:58:32 -07:00
Phil Yang 75d2eca8ac HBASE-17931 Assign system tables to servers with highest version 2017-07-06 17:35:54 +08:00
Ramkrishna 50bb045723 HBASE-18002 Investigate why bucket cache filling up in file mode in an
existing file is slower (Ram)
2017-07-06 11:20:00 +05:30
Michael Stack df93c13fd2 HBASE-17056 Remove checked in PB generated files Selective add of dependency on
hbase-thirdparty jars. Update to READMEs on how protobuf is done (and update to
refguide) Removed all checked in generated protobuf files. They are generated on
the fly now as part of mainline build.
2017-07-05 20:57:11 -07:00
Michael Stack c5abb6cabb Revert "HBASE-14070 - Core HLC"
Revert a push too-early

This reverts commit 9fe94c1169.
2017-07-05 20:11:05 -07:00
Michael Stack 172c662034 HBASE-18325 Disable flakey TestMasterProcedureWalLease 2017-07-05 20:10:43 -07:00
Amit Patel 9fe94c1169 HBASE-14070 - Core HLC
Signed-off-by: Michael Stack <stack@apache.org>
2017-07-05 16:51:02 -07:00
Michael Stack b71509151e HBASE-17201 Edit of HFileBlock comments and javadoc 2017-07-05 13:32:27 -07:00
Stephen Yuan Jiang 05e3f394e2 HBASE-18301 Enable TestSimpleRegionNormalizerOnCluster#testRegionNormalizationMergeOnCluster that was disabled by Proc-V2 AM in HBASE-14614 (Stephen Yuan Jiang) 2017-07-05 09:56:30 -07:00
tedyu 4453472282 HBASE-18312 Ineffective handling of FileNotFoundException in FileLink.tryOpen() 2017-07-05 08:24:37 -07:00
anastas 2843214857 HBASE-18010: CellChunkMap integration into CompactingMemStore. Continuation of the previous commit 2017-07-05 12:56:45 +03:00
anastas 8ac4308411 HBASE-18010: CellChunkMap integration into CompactingMemStore. CellChunkMap usage is currently switched off by default. New tests are included. Review comments addressed. 2017-07-05 12:35:21 +03:00
Guanghao Zhang e71e5ece88 HBASE-18297 Provide a AsyncAdminBuilder to create new AsyncAdmin instance 2017-07-05 09:18:02 +08:00
Samir Ahmic 63607800cd HBASE-18310 LoadTestTool unable to write data
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-07-04 13:41:32 -07:00
samirMop 193a980338 HBASE-15943 Add page displaying JVM process metrics
Signed-off-by: Michael Stack <stack@apache.org>
2017-07-03 21:35:27 -07:00
Peter Somogyi f2731fc241 HBASE-18264 Update pom plugins
Update plugins in main and subprojects
Unified versions to use variable instead of direct values

Affected plugins:
- apache-rat-plugin 0.11 -> 0.12
- asciidoctor-maven-plugin 1.5.2.1 -> 1.5.5
- asciidoctorj-pdf 1.5.0-alpha.6 -> 1.5.0-alpha.15
- build-helper-maven-plugin 1.9.1 -> 3.0.0
- buildnumber-maven-plugin 1.3 -> 1.4
- exec-maven-plugin 1.2.1/1.4.0 -> 1.6.0
- extra-enforcer-rules 1.0-beta-3 -> 1.0-beta-6
- findbugs-maven-plugin 3.0.0 -> 3.0.4
- jamon-maven-plugin 2.4.1 -> 2.4.2
- maven-bundle-plugin 2.5.3 -> 3.3.0
- maven-compiler-plugin 3.2/3.5.1 -> 3.6.1
- maven-eclipse-plugin 2.9 -> 2.10
- maven-shade-plugin 2.4.1 -> 3.0.0
- maven-surefire-plugin 2.18.1 -> 2.20
- maven-surefire-report-plugin 2.7.2 -> 2.20
- scala-maven-plugin 3.2.0 -> 3.2.2
- spotbugs 3.1.0-RC1 -> 3.1.0-RC3
- wagon-ssh 2.2 -> 2.12
- xml-maven-plugin 1.0 -> 1.0.1

- maven-assembly-plugin 2.4 -> 2.6 (inherited)
- maven-dependency-plugin 2.4 -> 2.10 (inherited)
- maven-enforcer-plugin 1.3.1 -> 1.4.1 (inherited)
- maven-javadoc-plugin 2.10.3 -> 2.10.4 (inherited)
- maven-resources-plugin 2.7 (inherited)
- maven-site-plugin 3.4 -> 3.5.1 (inherited)

Change-Id: I84539f555be498dff18caed1e3eea1e1aeb2143a

Signed-off-by: Michael Stack <stack@apache.org>
2017-07-03 19:42:46 -07:00
Guanghao Zhang 14f0423b58 HBASE-18283 Provide a construct method which accept a thread pool for AsyncAdmin 2017-07-04 09:51:41 +08:00
Sean Busbey fc973d0918 HBASE-17995 improve log messages during snapshot tests.
Signed-off-by: Michael Stack <stack@apache.org>
2017-06-30 09:42:14 -05:00
Sean Busbey 74c5742024 HBASE-18288 Declared dependency on specific javax.ws.rs.
Signed-off-by: Huaxiang Sun <huaxiangsun@apache.org>
2017-06-30 08:41:50 -05:00
zhangduo 21653c31d9 HBASE-16585 Rewrite the delegation token tests with Parameterized pattern 2017-06-30 20:40:23 +08:00
Michael Stack 92f33ad076 Revert "HBASE-18229: create new Async Split API to embrace AM v2"
TestShell is failing.

This reverts commit 5be05e90d4.
2017-06-30 03:30:01 -07:00
Michael Stack 73c225a071 HBASE-16192 Fix the potential problems in TestAcidGuarantees (Colin Ma) 2017-06-30 03:16:46 -07:00
Yi Liang 5be05e90d4 HBASE-18229: create new Async Split API to embrace AM v2
Signed-off-by: Michael Stack <stack@apache.org>
2017-06-29 16:20:18 -07:00
张世彬10204932 07c1e18a55 HBASE-17982 correct spelling error of 'occured'
Signed-off-by: Michael Stack <stack@apache.org>
2017-06-29 15:09:49 -07:00
Umesh Agashe 9189b88647 HBASE-18292 Fixed flaky test hbase.master.locking.TestLockProcedure#testLocalMasterLockRecovery
Signed-off-by: Michael Stack <stack@apache.org>
2017-06-28 21:32:46 -07:00
Umesh Agashe 038d7e8984 HBASE-18278 Enable and Fix for unit test hbase.master.procedure.TestServerCrashProcedure#testRecoveryAndDoubleExecutionOnRsWithMeta
Signed-off-by: Michael Stack <stack@apache.org>
2017-06-28 16:01:55 -07:00
Ben-Epstein aef674264e HBASE-18281 created private static pattern matcher for performance
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-06-28 11:32:10 -07:00
Kahlil Oppenheimer 8da6f069c3 HBASE-18164 Fast locality computation in balancer - addendum handles NaN
Signed-off-by: tedyu <yuzhihong@gmail.com>
Signed-off-by: Sean Busbey <busbey@apache.org>
2017-06-27 14:57:53 -05:00
Sean Busbey 141482512a Revert "HBASE-18164 Fast locality computation in balancer - addendum handles NaN"
This reverts commit 35693f0583.

early commit missed some review feedback.
2017-06-27 14:57:53 -05:00
tedyu 293cb87d52 HBASE-18161 Incremental Load support for Multiple-Table HFileOutputFormat (Densel Santhmayor) 2017-06-27 12:31:55 -07:00
张世彬10204932 389e142eae HBASE-18265 Correct the link unuseful in regionServer's region state UI
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2017-06-27 10:38:46 +08:00
Kahlil Oppenheimer 35693f0583 HBASE-18164 Fast locality computation in balancer - addendum handles NaN
-Added new LocalityCostFunction and LocalityCandidateGenerator that
cache localities of every region/rack combination and mappings of every
region to its most local server and to its most local rack.

-Made LocalityCostFunction incremental so that it only computes locality
based on most recent region moves/swaps, rather than recomputing the
locality of every region in the cluster at every iteration of the
balancer

-Changed locality cost function to reflect the ratio of:
(Current locality) / (Best locality possible given current cluster)

Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-06-26 12:38:43 -07:00
Guanghao Zhang 2d781aa15c HBASE-18234 Revisit the async admin api 2017-06-26 17:27:09 +08:00
David Harju 0e8e176ebd HBASE-18023 Log multi-* requests for more than threshold number of rows
Signed-off-by: Josh Elser <elserj@apache.org>
2017-06-24 15:23:51 -04:00
张世彬10204932 96aca6b153 HBASE-18263 Resolve NPE in backup Master UI when accessing procedures.jsp
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-06-24 05:20:05 -07:00
Ramkrishna d092008766 HBASE-18221 Switch from pread to stream should happen under HStore's
reentrant lock (Ram)
2017-06-23 10:32:29 +05:30
Umesh Agashe 7cc458e129 HBASE-18254 ServerCrashProcedure checks and waits for meta initialized, instead should check and wait for meta loaded
After enabling test hbase.master.procedure.TestServerCrashProcedure#testRecoveryAndDoubleExecutionOnRsWithMeta, this bug is found in ServerCrashProcedure

Signed-off-by: Michael Stack <stack@apache.org>
2017-06-21 21:57:46 -07:00
Andrew Purtell 3489a1b821 HBASE-18235 LoadBalancer.BOGUS_SERVER_NAME should not have a bogus hostname
We deliberately use 'localhost' instead of a bogus hostname for
LoadBalancer.BOGUS_SERVER_NAME so the operation will fail fast.
2017-06-21 14:37:47 -07:00
QilinCao 00f657fbeb HBASE-18252 Resolve BaseLoadBalancer bad practice warnings
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-06-21 10:12:23 -07:00
tedyu 83be50c2ab HBASE-18226 Disable reverse DNS lookup at HMaster and use the hostname provided by RegionServer (Duo Xu) 2017-06-20 21:07:45 -07:00
Josh Elser 5b485d14cd HBASE-17752 Shell command to list snapshot sizes WRT quotas 2017-06-20 14:17:00 -04:00
Ashish Singhi af466bf722 HBASE-18212 reduce log level for unbuffer warning.
In standalone mode with the local filesystem, HBase logs the warning message: Failed to invoke 'unbuffer' method in class org.apache.hadoop.fs.FSDataInputStream

Signed-off-by: Umesh Agashe <uagashe@cloudera.com>
Signed-off-by: Sean Busbey <busbey@apache.org>
2017-06-20 01:06:47 -05:00
Kahlil Oppenheimer 5224064d4d HBASE-18164 Fast locality computation in balancer
-Added new LocalityCostFunction and LocalityCandidateGenerator that
cache localities of every region/rack combination and mappings of every
region to its most local server and to its most local rack.

-Made LocalityCostFunction incremental so that it only computes locality
based on most recent region moves/swaps, rather than recomputing the
locality of every region in the cluster at every iteration of the
balancer

-Changed locality cost function to reflect the ratio of:
(Current locality) / (Best locality possible given current cluster)

Signed-off-by: Sean Busbey <busbey@apache.org>
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2017-06-20 01:06:47 -05:00
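
The ratio named in the HBASE-18164 message above, (current locality) / (best locality possible given current cluster), can be illustrated with a minimal sketch. This is not the balancer's code: the class name, the method name, the per-region array inputs, and the zero-denominator guard are all assumptions made for illustration.

```java
// Hypothetical sketch; not the HBase LocalityCostFunction.
final class LocalityRatioSketch {

    /**
     * current[i]: locality of region i on the server it is hosted on now.
     * best[i]:    locality of region i on its most local server.
     * Both arrays are hypothetical inputs with values in [0, 1].
     */
    static double localityRatio(double[] current, double[] best) {
        double currentSum = 0.0;
        for (double v : current) {
            currentSum += v;
        }
        double bestSum = 0.0;
        for (double v : best) {
            bestSum += v;
        }
        // Assumed guard: with no achievable locality the division would be
        // 0/0 = NaN, so report the cluster as already at its best.
        if (bestSum == 0.0) {
            return 1.0;
        }
        return currentSum / bestSum;
    }
}
```

Normalizing by the best achievable locality, rather than by a perfect score, means a cluster is only penalized for locality it could actually gain by moving regions.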
Pankaj Kumar ce1ce728c6 HBASE-18180 Possible connection leak while closing BufferedMutator in TableOutputFormat
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-06-18 19:46:47 -07:00
Guanghao Zhang c6e71f159c HBASE-18170 Refactor ReplicationSourceWALReaderThread 2017-06-19 09:26:45 +08:00
Umesh Agashe d49208b056 HBASE-18104 AMv2: Enabled aggregation of RPCs
Unit test (TestAssignmentManager) uses mock which always aggregates. So added trace level log message and verified manually on a single node cluster.

Signed-off-by: Michael Stack <stack@apache.org>
2017-06-16 23:53:39 -07:00
Umesh Agashe b02d3d9ed5 HBASE-18227 Fixed unit test hbase.coprocessor.TestCoprocessorMetrics#testRegionObserverAfterRegionClosed
Calling closeRegion() directly on a remote server is not supported post-AMv2. The test now calls unassign() on master instead.

Signed-off-by: Michael Stack <stack@apache.org>
2017-06-16 20:19:38 -07:00
Josh Elser c7a64a8313 HBASE-18225 Avoid toString() on an array 2017-06-16 08:45:31 -07:00
Ramkrishna c20d9cb1a2 HBASE-18220 - Addendum as per Duo suggestion for readability (Ram) 2017-06-16 12:34:08 +05:30
Ramkrishna 020f520d17 HBASE-18220 Compaction scanners need not reopen storefile scanners while
trying to switch over from pread to stream (Ram)
2017-06-16 11:03:04 +05:30
Michael Stack dd1d81ef5a HBASE-18004 getRegionLocations needs to be called once in
ScannerCallableWithReplicas#call() (Huaxiang Sun)
2017-06-15 13:41:01 -07:00
Michael Stack c2eebfdb61 HBASE-18166 [AMv2] We are splitting already-split files v2 Address Stephen Jiang review comments; ADDENDUM TO FIX COMPILE 2017-06-15 11:40:13 -07:00
Michael Stack f64512bee1 HBASE-18166 [AMv2] We are splitting already-split files v2 Address Stephen Jiang review comments 2017-06-15 10:26:03 -07:00
tedyu 8b36da1108 HBASE-18209 Include httpclient / httpcore jars in build artifacts 2017-06-14 20:09:42 -07:00
Andrew Purtell 50e28d62a6 HBASE-18219 Fix typo in constant HConstants.HBASE_CLIENT_MEAT_REPLICA_SCAN_TIMEOUT 2017-06-14 16:02:38 -07:00
Vincent 384e308e9f HBASE-18137 Replication gets stuck for empty WALs
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2017-06-10 10:30:40 -07:00
Ashu Pachauri eb2dc5d2a5 HBASE-18192: Replication drops recovered queues on region server shutdown
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-06-09 19:52:58 -07:00
Josh Elser e5ea457054 HBASE-17748 Include HBase snapshots in space quotas
Introduces a new Chore in the Master which computes the size
of the snapshots included in a cluster. The size of these
snapshots is included in the HDFS usage of the table from which
the snapshot was created.

Includes some test stabilization, trying to make the tests more
deterministic by ensuring we observe stable values as we know
that those values are mutable. This should help avoid problems
where size reports are delayed and we see an incomplete value.
2017-06-09 18:43:18 -04:00
Ashu Pachauri 7b40f4f3ec HBASE-18092: Removing a peer does not properly clean up the ReplicationSourceManager state and metrics
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-06-09 08:23:04 -07:00
Chia-Ping Tsai 30817b922e HBASE-18193 Master web UI presents the incorrect number of regions 2017-06-09 14:44:51 +08:00
Umesh Agashe 61839d7143 HBASE-18195 Removed redundant single quote from start message for HMaster and HRegionServer
Signed-off-by: Michael Stack <stack@apache.org>
2017-06-08 22:21:00 -07:00
Gary Helmling a558d6c57a HBASE-18141 Regionserver fails to shutdown when abort triggered during RPC call 2017-06-08 17:20:29 -07:00
Yi Liang 112bff4ba0 HBASE-18109: Assign system tables first
This issue adds comments and a sort so system tables are queued first
(which will ensure they go out first). This should be good enough
along w/ existing scheduling mechanisms to ensure system/meta get
assigned first.

Signed-off-by: Michael Stack <stack@apache.org>
2017-06-08 13:24:28 -07:00
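
The "sort so system tables are queued first" idea in HBASE-18109 above can be sketched as a stable sort on a system/user flag. This is an illustration only, not the patch itself: the class and method names are hypothetical, tables are represented as plain strings, and a table is assumed to be a system table when its name is in the `hbase` namespace (for example `hbase:meta`).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not the AssignmentManager queueing code.
final class SystemFirstSketch {

    // Assumption for illustration: system tables live in the "hbase" namespace.
    static boolean isSystemTable(String tableName) {
        return tableName.startsWith("hbase:");
    }

    /** Returns a copy of the queue with system tables moved to the front. */
    static List<String> systemTablesFirst(List<String> queue) {
        List<String> sorted = new ArrayList<>(queue);
        // Boolean.FALSE orders before Boolean.TRUE, so key on "is not system".
        // List.sort is stable, so the original order is kept within each group.
        sorted.sort(Comparator.comparing((String t) -> !isSystemTable(t)));
        return sorted;
    }
}
```

Because the sort is stable, entries keep their submission order within the system group and within the user group; only the grouping changes.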
Chia-Ping Tsai 72cb7d97cc HBASE-18008 Any HColumnDescriptor we give out should be immutable 2017-06-08 23:26:08 +08:00
Alex Araujo 3f891a66ca HBASE-18184 Add hbase-hadoop2-compat jar as MapReduce dependency
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2017-06-07 14:25:55 -07:00
Chia-Ping Tsai 4a1529c89b HBASE-18158 Two running in-memory compaction threads may lose data for flushing 2017-06-07 17:57:32 +08:00
Ramkrishna 1d3252eb59 HBASE-17849 PE tool random read is not totally random (Ram) 2017-06-07 11:28:09 +05:30
Michael Stack 929c9dab14 HBASE-18181 Move master branch to version 3.0.0-SNAPSHOT post creation of branch-2 2017-06-06 22:04:39 -07:00
huzheng 0d0c330401 HBASE-17678 FilterList with MUST_PASS_ONE may lead to redundant cells returned
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-06-06 21:08:12 -07:00
Phil Yang 2f1923a823 HBASE-15576 Scanning cursor to prevent blocking long time on ResultScanner.next() 2017-06-07 11:32:04 +08:00
tedyu 80e15aac21 HBASE-16392 Backup delete fault tolerance (Vladimir Rodionov) 2017-06-06 20:29:13 -07:00
Chia-Ping Tsai da3c023635 HBASE-18145 The flush may cause the corrupt data for reading
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2017-06-06 18:02:43 -07:00
Andrew Purtell 858bccfcb8 HBASE-18132 Low replication should be checked in period in case of datanode rolling upgrade (Allan Yang) 2017-06-06 17:21:21 -07:00
Ashish Singhi 1950acc67a HBASE-9393 Hbase does not closing a closed socket resulting in many CLOSE_WAIT
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2017-06-06 12:52:46 -07:00
zhangduo ee0f148c73 HBASE-18038 Rename StoreFile to HStoreFile and add a StoreFile interface for CP 2017-06-06 20:36:38 +08:00
Umesh Agashe 07c38e7165 HBASE-16549 Added new metrics for AMv2 procedures
The following AMv2 procedures are modified to override the onSubmit() and onFinish() hooks provided by HBASE-17888 to do
metrics calculations when procedures are submitted and finished:
* AssignProcedure
* UnassignProcedure
* MergeTableRegionProcedure
* SplitTableRegionProcedure
* ServerCrashProcedure

The following metrics are collected for each of the above procedures during the lifetime of a process:
* Total number of requests submitted for a type of procedure
* Histogram of runtime in milliseconds for successfully completed procedures
* Total number of failed procedures

As we are moving away from Hadoop's metrics2, the hbase-metrics-api module is used for newly added metrics.

Modified existing tests to verify count of procedures.

Signed-off-by: Michael Stack <stack@apache.org>
2017-06-05 17:14:14 -07:00
Michael Stack e65d8653e5 HBASE-18155 TestMasterProcedureWalLease is flakey 2017-06-03 12:55:18 -07:00
Enis Soztutar 118429cbac HBASE-15160 Put back HFile's HDFS op latency sampling code and add metrics for monitoring (Yu Li and Enis Soztutar) 2017-06-02 17:41:53 -07:00
tedyu ef46debde8 HBASE-18005 read replica: handle the case that region server hosting both primary replica and meta region is down (huaxiang sun) 2017-06-02 09:29:51 -07:00
Guanghao Zhang 549171465d HBASE-18111 Replication stuck when cluster connection is closed
Signed-off-by: Andrew Purtell <apurtell@apache.org>
2017-06-01 15:12:15 -07:00
Michael Stack e1f3c89b3b HBASE-18143 [AMv2] Backoff on failed report of region transition quickly goes to astronomical time scale
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
  Rather than compound the pause time, just have backoff multiply the
  original INIT_PAUSE_TIME_MS so we go 1, 2, 5, 10, ... etc. rather than
  1, 2, 30, 600... and so on.

  Minor fixup around logging so report of failed transition is no longer
  reported as trace-level.
2017-06-01 13:01:36 -07:00
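
The difference described in HBASE-18143 above, multiplying the original pause versus compounding it, can be shown with a small sketch. It is not the HRegionServer code: the class and method names are hypothetical, the multiplier table is taken only from the "1, 2, 5, 10" sequence in the message, and the value of INIT_PAUSE_TIME_MS is assumed.

```java
// Hypothetical sketch; not the HRegionServer retry code.
final class BackoffSketch {

    // Assumed multiplier table, copied from the "1, 2, 5, 10" sequence in the message.
    static final int[] MULTIPLIERS = {1, 2, 5, 10};

    // Assumed value; the commit names the constant but does not state its value.
    static final long INIT_PAUSE_TIME_MS = 1000L;

    /** Pause for the given attempt: always a multiple of the initial pause. */
    static long fixedBasePause(int tries) {
        int idx = Math.min(Math.max(tries, 0), MULTIPLIERS.length - 1);
        return INIT_PAUSE_TIME_MS * MULTIPLIERS[idx];
    }

    /** The problematic variant: each pause is a multiple of the previous pause. */
    static long compoundedPause(int tries) {
        long pause = INIT_PAUSE_TIME_MS;
        for (int i = 0; i <= tries; i++) {
            int idx = Math.min(i, MULTIPLIERS.length - 1);
            pause *= MULTIPLIERS[idx];
        }
        return pause;
    }
}
```

With these assumed values the fixed-base variant tops out at ten times the initial pause, while the compounded variant keeps multiplying on every retry and grows without bound.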
Jerry He c7a7f880dd HBASE-16261 MultiHFileOutputFormat Enhancement (Yi Liang) 2017-06-01 10:44:17 -07:00
Guanghao Zhang 123086edad HBASE-18130 Refactor ReplicationSource 2017-06-01 14:50:45 +08:00
Chinmay Kulkarni db8ce0566d HBASE-17959 Canary timeout should be configurable on a per-table basis
Added support for configuring read/write timeouts on a per-table basis
when in region mode.
Added unit test for per-table timeout checks.

Signed-off-by: Andrew Purtell <apurtell@apache.org>
2017-05-31 17:58:43 -07:00
Michael Stack 3975bbd008 HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi) Move to a new AssignmentManager, one that describes Assignment using a State Machine built on top of ProcedureV2 facility.
This doc. keeps state on where we are at w/ the new AM:
https://docs.google.com/document/d/1eVKa7FHdeoJ1-9o8yZcOTAQbv0u0bblBlCCzVSIn69g/edit#heading=h.vfdoxqut9lqn
Includes list of tests disabled by this patch with reasons why.

Based on patches from Matteo's repository and then fixed up to get it all to pass cluster
tests, filling in some missing functionality, fix of findbugs, fixing bugs, etc.,
including:

    1. HBASE-14616 Procedure v2 - Replace the old AM with the new AM.
    The basis comes from Matteo's repo here:
    689227fcbf

    Patch replaces old AM with the new under subpackage master.assignment.
    Mostly just updating classes to use new AM -- import changes -- rather
    than the old. It also removes old AM and supporting classes.
    See below for more detail.

    2. HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi)
    3622cba4e3

    Adds running of remote procedure. Adds batching of remote calls.
    Adds support for assign/unassign in procedures. Adds version info
    reporting in rpc. Adds start of an AMv2.

    3. Reporting of remote RS version is from here:
    ddb4df3964.patch

    4. And remote dispatch of procedures is from:
    186b9e7c4d

    5. The split merge patches from here are also melded in:
    9a3a95a2c2
    and d6289307a0

We add testing util for new AM and new sets of tests.

Does a bunch of fixup on logging so it's possible to follow a procedure's narrative by grepping
procedure id. We spewed loads of log too on big transitions such as master fail; fixed.

Fix CatalogTracker. Make it use Procedures doing clean up of Region data on split/merge.
Without these changes, ITBLL was failing at larger scale (3-4hours 5B rows) because we were
splitting split Regions among other things (CJ would run but wasn't
taking lock on Regions so havoc).

    Added a bunch of doc. on Procedure primitives.

    Added new region-based state machine base class. Moved region-based
    state machines on to it.

    Found bugs in the way procedure locking was done in a few of the
    region-based Procedures. Having them all have same subclass helps here.

    Added isSplittable and isMergeable to the Region Interface.

    Master would split/merge even though the Regions still had
    references. Fixed it so Master asks RegionServer if Region
    is splittable.

    Messing more w/ logging. Made all procedures log the same and report
    the state the same; helps when logging is regular.

    Rewrote TestCatalogTracker. Enabled TestMergeTableRegionProcedure.

    Added more functionality to MockMasterServices so can use it doing
    standalone testing of Procedures (made TestCatalogTracker use it
    instead of its own version).

    Add to MasterServices ability to wait on Master being up -- makes
    it so can Mock Master and start to implement standalone split testing.
    Start in on a Split region standalone test in TestAM.

    Fix bug where a Split can fail because it comes in in the middle of
    a Move (by holding lock for duration of a Move).

    Breaks CPs that were watching merge/split. These are run by Master now
    so you need to observe on Master, not on RegionServer.

    Details:

    M hbase-client/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
    Takes List of regionstates on construction rather than a Set.
    NOTE!!!!! This is a change in a public class.

    M hbase-client/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
    Add utility getShortNameToLog

    M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
    M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ShortCircuitMasterConnection.java
    Add support for dispatching assign, split and merge processes.

    M hbase-client/src/main/java/org/apache/hadoop/hbase/master/RegionState.java
    Purge old overlapping states: PENDING_OPEN, PENDING_CLOSE, etc.

    M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
    Lots of doc on its inner workings. Bug fixes.

    M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
    Log and doc on workings. Bug fixes.

    A hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/RemoteProcedureDispatcher.java
    Dispatch remote procedures every 150ms or 32 items -- whichever
    happens first (configurable). Runs a timeout thread. This facility is
    not on yet; will come in as part of a later fix. Currently works a
    region at a time. This class carries notion of a remote procedure and of a buffer full of these.
    "hbase.procedure.remote.dispatcher.threadpool.size" with default = 128
    "hbase.procedure.remote.dispatcher.delay.msec" with default = 150ms
    "hbase.procedure.remote.dispatcher.max.queue.size" with default = 32
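
The "150ms or 32 items, whichever happens first" rule can be sketched as a small buffer. This is a hedged illustration only: the class and method names are hypothetical, time is passed in explicitly to keep the example deterministic, and the real RemoteProcedureDispatcher uses a thread pool and a timeout thread rather than this single-threaded shape.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: BatchBufferSketch is a hypothetical name, not an HBase class.
final class BatchBufferSketch<T> {
    private final long delayMs;       // cf. ...dispatcher.delay.msec (default 150)
    private final int maxQueueSize;   // cf. ...dispatcher.max.queue.size (default 32)
    private final List<T> buffer = new ArrayList<>();
    private long firstItemAtMs;

    BatchBufferSketch(long delayMs, int maxQueueSize) {
        this.delayMs = delayMs;
        this.maxQueueSize = maxQueueSize;
    }

    /** Buffers an item; returns the batch to dispatch, or an empty list. */
    List<T> add(T item, long nowMs) {
        if (buffer.isEmpty()) {
            firstItemAtMs = nowMs;    // the delay is measured from the oldest item
        }
        buffer.add(item);
        return pollIfReady(nowMs);
    }

    /** Returns the buffered batch if either limit has been reached. */
    List<T> pollIfReady(long nowMs) {
        boolean full = buffer.size() >= maxQueueSize;
        boolean timedOut = !buffer.isEmpty() && nowMs - firstItemAtMs >= delayMs;
        if (!full && !timedOut) {
            return new ArrayList<>();
        }
        List<T> batch = new ArrayList<>(buffer);
        buffer.clear();
        return batch;
    }

    public static void main(String[] args) {
        BatchBufferSketch<String> b = new BatchBufferSketch<>(150, 32);
        b.add("assign region A", 0);
        System.out.println(b.pollIfReady(100).size()); // too early, nothing dispatched
        System.out.println(b.pollIfReady(150).size()); // delay elapsed, batch of one
    }
}
```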

    M hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/ProtobufUtil.java
    Add in support for merge. Remove no-longer used methods.

    M hbase-protocol-shaded/src/main/protobuf/Admin.proto
    Add execute procedures call ExecuteProcedures.

    M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
    Add assign and unassign state support for procedures.

    M hbase-server/src/main/java/org/apache/hadoop/hbase/client/VersionInfoUtil.java
    Adds getting RS version out of RPC
    Examples: (1.3.4 is 0x0103004, 2.1.0 is 0x0201000)
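
The two examples above are consistent with packing the version as major, minor and patch bit fields. The layout below (patch in the low 12 bits, minor in the next 8, major above that) is inferred from those two examples alone and is an assumption; the class and method names are hypothetical and this is not the VersionInfoUtil source.

```java
// Sketch only: bit layout inferred from "1.3.4 is 0x0103004" and
// "2.1.0 is 0x0201000"; not copied from VersionInfoUtil.
final class VersionNumberSketch {
    /** Packs major.minor.patch into one comparable int. */
    static int encode(int major, int minor, int patch) {
        return (major << 20) | (minor << 12) | patch;
    }

    static int major(int version) { return version >>> 20; }
    static int minor(int version) { return (version >>> 12) & 0xff; }
    static int patch(int version) { return version & 0xfff; }

    public static void main(String[] args) {
        System.out.println(String.format("0x%07x", encode(1, 3, 4))); // 0x0103004
        System.out.println(String.format("0x%07x", encode(2, 1, 0))); // 0x0201000
    }
}
```

Because the fields are ordered from most to least significant, two packed versions can be compared with a plain integer comparison, which is what makes a "target server is at least version X" check cheap.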

    M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
    Remove periodic metrics chore. This is done over in new AM now.
    Replace AM with the new. Host the procedures executor.

    M hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterMetaBootstrap.java
    Have AMv2 handle assigning meta.

    M hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
    Extract version number of the server making rpc.

    A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java
    Add new assign procedure. Runs assign via Procedure Dispatch.
    There can only be one RegionTransitionProcedure per region running at the time,
    since each procedure takes a lock on the region.

    D hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignCallable.java
    D hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
    D hbase-server/src/main/java/org/apache/hadoop/hbase/master/BulkAssigner.java
    D hbase-server/src/main/java/org/apache/hadoop/hbase/master/GeneralBulkAssigner.java
    Remove these hacky classes that were never supposed to live longer than
    a month or so to be replaced with real assigners.

    D hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStateStore.java
    D hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
    D hbase-server/src/main/java/org/apache/hadoop/hbase/master/UnAssignCallable.java

    A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
    A procedure-based AM (AMv2).

    TODO
     - handle region migration
     - handle meta assignment first
     - handle sys table assignment first (e.g. acl, namespace)
     - handle table priorities
      "hbase.assignment.bootstrap.thread.pool.size"; default size is 16.
      "hbase.assignment.dispatch.wait.msec"; default wait is 150
      "hbase.assignment.dispatch.wait.queue.max.size"; wait max default is 100
      "hbase.assignment.rit.chore.interval.msec"; default is 5 * 1000;
      "hbase.assignment.maximum.attempts"; default is 10;

     A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
     Procedure that runs subprocedure to unassign and then assign to new location

     A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateStore.java
     Manage store of region state (in hbase:meta by default).

     A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStates.java
     In-memory state of all regions. Used by AMv2.

     A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
     Base RIT procedure for Assign and Unassign.

     A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
     Unassign procedure.

     A hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RSProcedureDispatcher.java
     Run region assignment in a manner that pays attention to target server version.
     Adds "hbase.regionserver.rpc.startup.waittime"; defaults 60 seconds.
2017-05-31 17:49:11 -07:00
Phil Yang 9cf1a08c53 HBASE-18122 Scanner id should include ServerName of region server 2017-05-31 13:57:05 +08:00
Andrew Purtell d547feac6b HBASE-18027 HBaseInterClusterReplicationEndpoint should respect RPC limits when batching edits 2017-05-30 14:24:51 -07:00
zhangduo 6846b03944 HBASE-18042 Client Compatibility breaks between versions 1.2 and 1.3 2017-05-27 17:55:49 +08:00
zhangduo efc7edc81a HBASE-18115 Move SaslServer creation to HBaseSaslRpcServer 2017-05-27 11:38:41 +08:00
Guanghao Zhang 97484f2aaf HBASE-18114 Update the config of TestAsync*AdminApi to make test stable 2017-05-27 11:11:40 +08:00
Ramkrishna 8b5c161cbf HBASE-17777 TestMemstoreLAB#testLABThreading runs too long for a small test (Ram)
2017-05-26 17:15:04 +05:30
huzheng b076b8e794 HBASE-18120 (addendum) Fix TestAsyncRegionAdminApi
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2017-05-26 16:39:42 +08:00
huzheng 712beb305e HBASE-18120 Fix TestAsyncRegionAdminApi
Signed-off-by: Michael Stack <stack@apache.org>
2017-05-25 23:10:16 -07:00
huzheng f441ca0458 HBASE-16011 TableSnapshotScanner and TableSnapshotInputFormat can produce duplicate rows if split table.
Signed-off-by: tedyu <yuzhihong@gmail.com>
2017-05-25 12:47:43 -07:00
tedyu 3e426b2f85 HBASE-18099 FlushSnapshotSubprocedure should wait for concurrent Region#flush() to finish 2017-05-25 04:41:29 -07:00
Michael Stack a3c5a74487 Revert "HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi)"
Revert a mistaken commit!!!

This reverts commit dc1065a85d.
2017-05-24 23:31:36 -07:00
Michael Stack dc1065a85d HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi)
Move to a new AssignmentManager, one that describes Assignment using
a State Machine built on top of ProcedureV2 facility.

This doc. keeps state on where we are at w/ the new AM:
https://docs.google.com/document/d/1eVKa7FHdeoJ1-9o8yZcOTAQbv0u0bblBlCCzVSIn69g/edit#heading=h.vfdoxqut9lqn
Includes list of tests disabled by this patch with reasons why.

Based on patches from Matteo's repository and then fixed up to get it all to pass cluster
tests, filling in some missing functionality, fix of findbugs, fixing bugs, etc.,
including:

1. HBASE-14616 Procedure v2 - Replace the old AM with the new AM.
The basis comes from Matteo's repo here:
689227fcbf

Patch replaces old AM with the new under subpackage master.assignment.
Mostly just updating classes to use new AM -- import changes -- rather
than the old. It also removes old AM and supporting classes.
See below for more detail.

2. HBASE-14614 Procedure v2 - Core Assignment Manager (Matteo Bertozzi)
3622cba4e3

Adds running of remote procedure. Adds batching of remote calls.
Adds support for assign/unassign in procedures. Adds version info
reporting in rpc. Adds start of an AMv2.

3. Reporting of remote RS version is from here:
ddb4df3964.patch

4. And remote dispatch of procedures is from:
186b9e7c4d

5. The split merge patches from here are also melded in:
9a3a95a2c2
and d6289307a0

We add testing util for new AM and new sets of tests.

Does a bunch of fixup on logging so it's possible to follow a procedure's narrative by grepping
procedure id. We spewed loads of log too on big transitions such as master fail; fixed.

Fix CatalogTracker. Make it use Procedures doing clean up of Region data on split/merge.
Without these changes, ITBLL was failing at larger scale (3-4hours 5B rows) because we were
splitting split Regions among other things (CJ would run but wasn't
taking lock on Regions so havoc).

Added a bunch of doc. on Procedure primitives.

Added new region-based state machine base class. Moved region-based
state machines on to it.

Found bugs in the way procedure locking was done in a few of the
region-based Procedures. Having them all have same subclass helps here.

Added isSplittable and isMergeable to the Region Interface.

Master would split/merge even though the Regions still had
references. Fixed it so Master asks RegionServer if Region
is splittable.

Messing more w/ logging. Made all procedures log the same and report
the state the same; helps when logging is regular.

Rewrote TestCatalogTracker. Enabled TestMergeTableRegionProcedure.

Added more functionality to MockMasterServices so can use it doing
standalone testing of Procedures (made TestCatalogTracker use it
instead of its own version).

Add to MasterServices ability to wait on Master being up -- makes
it so can Mock Master and start to implement standalone split testing.
Start in on a Split region standalone test in TestAM.

Fix bug where a Split can fail because it comes in in the middle of
a Move (by holding lock for duration of a Move).

Breaks CPs that were watching merge/split. These are run by Master now
so you need to observe on Master, not on RegionServer.

Details:

M hbase-client/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
Takes List of regionstates on construction rather than a Set.
NOTE!!!!! This is a change in a public class.

M hbase-client/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
Add utility getShortNameToLog

M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ShortCircuitMasterConnection.java
Add support for dispatching assign, split and merge processes.

M hbase-client/src/main/java/org/apache/hadoop/hbase/master/RegionState.java
Purge old overlapping states: PENDING_OPEN, PENDING_CLOSE, etc.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
Lots of doc on its inner workings. Bug fixes.

M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
Log and doc on workings. Bug fixes.

A hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/RemoteProcedureDispatcher.java
Dispatch remote procedures every 150ms or 32 items -- whichever
happens first (configurable). Runs a timeout thread. This facility is
not on yet; will come in as part of a later fix. Currently works a
region at a time. This class carries notion of a remote procedure and of a buffer full of these.
"hbase.procedure.remote.dispatcher.threadpool.size" with default = 128
"hbase.procedure.remote.dispatcher.delay.msec" with default = 150ms
"hbase.procedure.remote.dispatcher.max.queue.size" with default = 32

M hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/ProtobufUtil.java
Add in support for merge. Remove no-longer used methods.

M hbase-protocol-shaded/src/main/protobuf/Admin.proto
Add execute procedures call ExecuteProcedures.

M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
Add assign and unassign state support for procedures.

M hbase-server/src/main/java/org/apache/hadoop/hbase/client/VersionInfoUtil.java
Adds getting RS version out of RPC
Examples: (1.3.4 is 0x0103004, 2.1.0 is 0x0201000)

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Remove periodic metrics chore. This is done over in new AM now.
Replace AM with the new. Host the procedures executor.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterMetaBootstrap.java
Have AMv2 handle assigning meta.

M hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
Extract version number of the server making rpc.

A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java
Add new assign procedure. Runs assign via Procedure Dispatch.
There can only be one RegionTransitionProcedure per region running at the time,
since each procedure takes a lock on the region.

D hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignCallable.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/BulkAssigner.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/GeneralBulkAssigner.java
Remove these hacky classes that were never supposed to live longer than
a month or so to be replaced with real assigners.

D hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStateStore.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/master/UnAssignCallable.java

A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
A procedure-based AM (AMv2).

TODO
 - handle region migration
 - handle meta assignment first
 - handle sys table assignment first (e.g. acl, namespace)
 - handle table priorities
  "hbase.assignment.bootstrap.thread.pool.size"; default size is 16.
  "hbase.assignment.dispatch.wait.msec"; default wait is 150
  "hbase.assignment.dispatch.wait.queue.max.size"; wait max default is 100
  "hbase.assignment.rit.chore.interval.msec"; default is 5 * 1000;
  "hbase.assignment.maximum.attempts"; default is 10;

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
 Procedure that runs subprocedure to unassign and then assign to new location

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateStore.java
 Manage store of region state (in hbase:meta by default).

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStates.java
 In-memory state of all regions. Used by AMv2.

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
 Base RIT procedure for Assign and Unassign.

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
 Unassign procedure.

 A hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RSProcedureDispatcher.java
 Run region assignment in a manner that pays attention to target server version.
 Adds "hbase.regionserver.rpc.startup.waittime"; defaults 60 seconds.
2017-05-24 20:47:25 -07:00
Amit Patel 8b75e9ed91 HBASE-18101 Fix type mismatch on container access
Signed-off-by: Michael Stack <stack@apache.org>
2017-05-24 19:59:16 -07:00
Umesh Agashe 837bb9ece7 HBASE-18091 Added API for getting who currently holds a lock on namespace/ table/ region/ server and log messages when procedure needs to wait to acquire lock
Signed-off-by: Michael Stack <stack@apache.org>
2017-05-24 14:56:22 -07:00
Yu Li 998bd5f90e HBASE-18084 Improve CleanerChore to clean from directory which consumes more disk space 2017-05-24 16:41:04 +08:00
Balazs Meszaros 80dd8bf51b HBASE-18096 Limit HFileUtil visibility and add missing annotations
Signed-off-by: Chia-Ping Tsai <chia7712@gmail.com>
2017-05-24 16:34:59 +08:00
Stephen Yuan Jiang 1d0295f4e2 HBASE-18093 Overloading the meaning of 'enabled' in Quota Manager to indicate either quota disabled or quota manager not ready is not good (Stephen Yuan Jiang) 2017-05-23 06:40:33 -07:00
zhangduo 3f75ba195c HBASE-18013 Write response directly instead of creating a fake call when setup connection 2017-05-23 15:09:08 +08:00
tedyu 28d619b22b HBASE-17850 Backup system repair utility (Vladimir Rodionov) 2017-05-22 16:25:59 -07:00
Josh Elser f1a9990328 HBASE-17977 Enable the MasterSpaceQuotaObserver by default
It should be the normal case that HBase automatically deletes
quotas for deleted tables. Switch the Observer to be on by
default and add an option to instead prevent it from being added.
2017-05-22 13:41:36 -04:00
Josh Elser b971b449e8 HBASE-17978 Ensure superusers can circumvent actions restricted by space quota violations 2017-05-22 13:41:36 -04:00
Josh Elser ed618da906 HBASE-17981 Consolidate the space quota shell commands 2017-05-22 13:41:36 -04:00
Josh Elser d671a1dbc6 HBASE-17955 Various reviewboard improvements to space quota work
Most notable change is to cache SpaceViolationPolicyEnforcement objects
in the write path. When a table has no quota or there is no SpaceQuotaSnapshot
for that table (yet), we want to avoid creating lots of
SpaceViolationPolicyEnforcement instances, caching one instance
instead. This will help reduce GC pressure.
2017-05-22 13:41:36 -04:00
Josh Elser 98ace3d586 HBASE-17447 Implement a MasterObserver for automatically deleting space quotas
When a table or namespace is deleted, it would be nice to automatically
delete the quota on said table/NS. It's possible that not all people
would want this functionality so we can leave it up to the user to
configure this Observer.
2017-05-22 13:41:35 -04:00
Josh Elser a8460b8bad HBASE-17794 Swap "violation" for "snapshot" where appropriate
A couple of variables and comments in which violation is incorrectly
used to describe what the code is doing. This was a hold over from early
implementation -- need to scrub these out for clarity.
2017-05-22 13:41:35 -04:00
Josh Elser 13af7f8ac6 HBASE-17002 JMX metrics and some UI additions for space quotas 2017-05-22 13:41:35 -04:00
Josh Elser 91b4d2e827 HBASE-17568 Better handle stale/missing region size reports
* Expire region reports in the master after a timeout.
* Move regions in violation out of violation when insufficient
    region size reports are observed.
2017-05-22 13:41:35 -04:00
Josh Elser 8159eae781 HBASE-17602 Reduce some quota chore periods/delays 2017-05-22 13:41:35 -04:00
Josh Elser f031b69969 HBASE-17516 Correctly handle case where table and NS quotas both apply
The logic surrounding when a table and namespace quota both apply
to a table was incorrect, leading to a case where a table quota
violation which should have fired did not because of the less-strict
namespace quota.
2017-05-22 13:41:35 -04:00
Josh Elser 80a1f8fa2a HBASE-17428 Implement informational RPCs for space quotas
Create some RPCs that can expose the in-memory state that the
RegionServers and Master hold to drive the space quota "state machine".
Then, create some hbase shell commands to interact with those.
2017-05-22 13:41:35 -04:00
Josh Elser 4ad49bc3ac HBASE-17478 Avoid reporting FS use when quotas are disabled
Also, gracefully produce responses when quotas are disabled.
2017-05-22 13:41:35 -04:00
Josh Elser 6c9082fe16 HBASE-17259 API to remove space quotas on a table/namespace 2017-05-22 13:41:35 -04:00
Josh Elser 34ba143fc8 HBASE-17001 Enforce quota violation policies in the RegionServer
The nuts-and-bolts of filesystem quotas. The Master must inform
RegionServers of the violation of a quota by a table. The RegionServer
must apply the violation policy as configured. Need to ensure
that the proper interfaces exist to satisfy all necessary policies.

This required a massive rewrite of the internal tracking by
the general space quota feature. Instead of tracking "violations",
we need to start tracking "usage". This allows us to make the decision
at the RegionServer level as to when the files in a bulk load request
should be accepted or rejected which ultimately lets us avoid bulk loads
dramatically exceeding a configured space quota.
2017-05-22 13:41:35 -04:00
Josh Elser 98b4181f43 HBASE-16999 Implement master and regionserver synchronization of quota state
* Implement the RegionServer reading violation from the quota table
* Implement the Master reporting violations to the quota table
* RegionServers need to track their enforced policies
2017-05-22 13:41:35 -04:00
Josh Elser 533470f8c8 HBASE-16998 Implement Master-side analysis of region space reports
Adds a new Chore to the Master that analyzes the reports that are
sent by RegionServers. The Master must then, for all tables with
quotas, determine the tables that are violating quotas and move
those tables into violation. Similarly, tables no longer violating
the quota can be moved out of violation.

The Chore is the "stateful" bit, managing which tables are and
are not in violation. Everything else is just performing
computation and informing the Chore on the updated state.

Added InterfaceAudience annotations and cleaned up the QuotaObserverChore
constructor. Cleaned up some javadoc and QuotaObserverChore. Reuse
the QuotaViolationStore impl objects.
2017-05-22 13:41:35 -04:00
tedyu 7fb0ac26e3 HBASE-17557 HRegionServer#reportRegionSizesForQuotas() should respond to UnsupportedOperationException 2017-05-22 13:41:35 -04:00
Josh Elser 6b334cd817 HBASE-17000 Implement computation of online region sizes and report to the Master
Includes a trivial implementation of the Master-side collection to
avoid. Only enough to write a test to verify RS collection.
2017-05-22 13:41:35 -04:00
tedyu f74e051bce HBASE-16996 Implement storage/retrieval of filesystem-use quotas into quota table (Josh Elser) 2017-05-22 13:41:35 -04:00
Apekshit Sharma 23ea2c36f5 HBASE-18068 Fix flaky test TestAsyncSnapshotAdminApi
- internalRestoreSnapshot() returns future which completes by just getting proc_id from master. Changed it to wait for the procedure to complete.
- Refactor TestAsyncSnapshotAdminApi: Add cleanup() which deletes all tables and snapshots after every test run. Simplifies individual tests.

Change-Id: Idc30fb699db32d58fd0f60da220953a430f1d3cc
2017-05-22 09:20:37 -07:00
Guanghao Zhang 3aac047a4f HBASE-18069 Fix flaky test TestReplicationAdminWithClusters#testDisableAndEnableReplication 2017-05-22 17:17:25 +08:00
Josh Elser 709f5a1980 HBASE-18075 Support non-latin table names and namespaces 2017-05-21 22:24:12 -04:00
zhangduo 1ceb25cf09 HBASE-18081 The way we process connection preamble in SimpleRpcServer is broken 2017-05-21 20:36:33 +08:00
anastas 1520c8fd4d HBASE-18056 Make the default behavior of CompactionPipeline to merge its segments into one, due to better read performance in this case 2017-05-21 12:27:57 +03:00
Umesh Agashe 8b70d043e4 HBASE-18071 Fix flaky test TestStochasticLoadBalancer#testBalanceCluster
Test was failing on clusters with a large number of servers or regions. Using commonly used config settings, as some other tests do, seems to work.

Signed-off-by: Michael Stack <stack@apache.org>
2017-05-19 11:09:28 -07:00
Guanghao Zhang 3fe4b28bb0 HBASE-15616 Allow null qualifier for all table operations 2017-05-19 17:47:08 +08:00