* HBASE-23304: RPCs needed for client meta information lookup
This patch implements the RPCs needed for the meta information
lookup during connection init. New tests added to cover the RPC
code paths. HBASE-23305 builds on this to implement the client
side logic.
Fixed a bunch of checkstyle nits around the places the patch
touches.
Signed-off-by: Andrew Purtell <apurtell@apache.org>
(cherry picked from commit 4f8fbba0c0)
* HBASE-23257: Track clusterID in stand by masters (#798)
This patch implements a simple cache that all the masters
can lookup to serve cluster ID to clients. Active HMaster
is still responsible for creating it but all the masters
will read it from fs to serve clients.
RPCs exposing it will come in a separate patch as a part of
HBASE-18095.
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Signed-off-by: Guangxu Cheng <guangxucheng@gmail.com>
(cherry picked from commit c2e01f2398)
* HBASE-23275: Track active master's address in ActiveMasterManager (#812)
Currently we just track whether an active master exists.
It helps to also track the address of the active master in
all the masters to help serve the client RPC requests to
know which master is active.
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Signed-off-by: Andrew Purtell <apurtell@apache.org>
(cherry picked from commit efebb843af)
* HBASE-23281: Track meta region locations in masters (#830)
* HBASE-23281: Track meta region changes on masters
This patch adds a simple cache that tracks the meta region replica
locations. It keeps an eye on the region movements so that the
cached locations are not stale.
This information is used for servicing client RPCs for connections
that use master based registry (HBASE-18095). The RPC end points
will be added in a separate patch.
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
(cherry picked from commit 8571d389cf)
* HBASE-23304: RPCs needed for client meta information lookup (#904)
* HBASE-23304: RPCs needed for client meta information lookup
This patch implements the RPCs needed for the meta information
lookup during connection init. New tests added to cover the RPC
code paths. HBASE-23305 builds on this to implement the client
side logic.
Fixed a bunch of checkstyle nits around the places the patch
touches.
Signed-off-by: Andrew Purtell <apurtell@apache.org>
(cherry picked from commit 4f8fbba0c0)
Includes the following, incorporating HBASE-20439 and HBASE-20440, too.
1)
HBASE-18133 Decrease quota reaction latency by HBase
Certain operations in HBase are known to directly affect
the utilization of tables on HDFS. When these actions
occur, we can circumvent the normal path and notify the
Master directly. This results in a much faster response to
changes in HDFS usage.
This requires FS scanning by the RS to be decoupled from
the reporting of sizes to the Master. An API inside each
RS is made so that any operation can hook into this call
in the face of other operations (e.g. compaction, flush,
bulk load).
2)
HBASE-18135 Implement mechanism for RegionServers to report file archival for space quotas
This de-couples the snapshot size calculation from the
SpaceQuotaObserverChore into another API which both the periodically
invoked Master chore and the Master service endpoint can invoke. This
allows for multiple sources of snapshot size to reported (from the
multiple sources we have in HBase).
When a file is archived, snapshot sizes can be more quickly realized and
the Master can still perform periodical computations of the total
snapshot size to account for any delayed/missing/lost file archival RPCs.
3)
HBASE-20531 RS may throw NPE when close meta regions in shutdown procedure.
Makes MergeTableRegionsProcedure do more than just two regions at a
time. Compatible as MTRP was done considering one day it'd do more than
two at a time.
Changes hardcoded assumption that merge parent regions are named
mergeA and mergeB in a column on the resultant region. Instead
can have N columns on the merged region, one for each parent
merged. Column qualifiers all being with 'merge'.
Most of code below is undoing the assumption that there are two
parents on a merge only.
This is a first cut at this patch. Implements hold fixing only
currently.
Add a fixMeta method to Hbck Interface.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
Bug fix. If hole is on end of last table, I wasn't seeing it.
A hbase-server/src/main/java/org/apache/hadoop/hbase/master/MetaFixer.java
Add a general meta fixer class. Explains up top why this stuff doesn't
belong inside MetaTableAccessor.
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
Break out the filesystem messing so don't have to copy it nor do more
than is needed doing fixup for Region holes.
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionFileSystem.java
Change behavious slightly. If directory exists, don't fail as we did
but try and keep going and create .regioninfo file if missing (or
overwrite if in place). This should make it idempotent. Can rerun
command. Lets see if any repercussions in test suite.
A hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMetaFixer.java
Add test.
Signed-off-by: Zheng Hu <openinx@gmail.com>
Signed-off-by: Guanghao Zhang <zghao@apache.org>
Rest Server throws NoClassDefFoundError : javax/annotation/Priority after buiding with JDK8 and running on JDK8
Signed-off-by: Sean Busbey <busbey@apache.org>
(cherry picked from commit 68f14c19ff)
Add switching on/off of compactions.
Switching off compactions will also interrupt any currently ongoing compactions.
Adds a "compaction_switch" to hbase shell. Switching off compactions will
interrupt any currently ongoing compactions. State set from shell will be
lost on restart. To persist the changes across region servers modify
hbase.regionserver.compaction.enabled in hbase-site.xml and restart.
Signed-off-by: Umesh Agashe <uagashe@cloudera.com>
Signed-off-by: Michael Stack <stack@apache.org>
Add backoff when stuck in RegionTransitionProcedure, the subclass of
AssignProcedure and UnassignProcedure. Can happen when we go to
transition but the current Region state is not what we expect.
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
Add doc on being able to suspend and wait on a timeout.
M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
Add 'attempt' counter so we can do backoff when we get stuck.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
Add persistence of new 'attempt' counter
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
Doc data members that are persisted by subclasses given this is 'odd'.
Add a counter for 'attempts' used when 'stuck' to implement backoff.
Add suspend with timeout when 'stuck'. Add callback when timeout is
exhausted which does wakeup of this procedure.
A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestUnexpectedStateException.java
Test of backoff.
Remove commons-cli and commons-collections4 use. Account
for the newer internal protobuf version of 3.5.1.
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Mike Drob <mdrob@apache.org>
Adds a prepare step to RecoverMetaProcedure in which we test for
cluster up and master being up. If not up, we fail the run.
Modified hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileCleaner.java
Modified hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChunkCreator.java
Minor log cleanup.
Modified hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RecoverMetaProcedure.java
Add pepare step.
Modified hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerMetrics.java
Debug for the failing test....
Added hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestRecoverMetaProcedure.java
Test the prepare step goes down if master or cluster are down.
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/DoNotRetryRegionException.java
M hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/MergeRegionException.java
Allow passing cause to Constructor.
M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
Add prepare step to move procedure.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MergeTableRegionsProcedure.java
Add check that regions to merge are actually online to the Constructor
so we can fail fast if they are offline
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
Add prepare step. Check regions and context and skip move if not right.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/SplitTableRegionProcedure.java
Add check parent region is online to constructor.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AbstractStateMachineTableProcedure.java
Add generic check region is online utility function for use by subclasses.
M hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMove.java
Add test that we fail if we try to move an offlined region.
Allow that DisableTableProcedue can grab a region lock before
ServerCrashProcedure can. Cater to this cricumstance where SCP
was not unable to make progress by running the search for RIT
against the crashed server a second time, post creation of all
crashed-server assignemnts. The second run will uncover such as
the above DisableTableProcedure unassign and will interrupt its
suspend allowing both procedures to make progress.
M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
Add new procedure step post-assigns that reruns the RIT finder method.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
Make this important log more specific as to what is going on.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
Better explanation as to what is going on.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
Add extra step and run handleRIT a second time after we've queued up
all SCP assigns. Also fix a but. SCP was adding an assign of a RIT
that was actually trying to unassign (made the deadlock more likely).