HBase ITs require junit which requires hamcrest. Hadoop recently
stopped including hamcrest in their installation (and thus our inherited
classpath), which means that we need to ship it.
Signed-off-by: Sean Busbey <busbey@apache.org>
Return 401 sooner when AUTHORIZATION header is missing
HBase Thrift server was checking for the AUTHORIZATION header and assuming it was always present
even when it was the first request. Many clients will not send the AUTHORIZATION header until
a 401 is received. HBase Thrift in the case of no header was throwing multiple exceptions and
filling the logs with exceptions. This was fixed by checking that if the AUTHORIZATION header is
empty then return a 401 immediately if security is enabled.
Signed-off-by: Josh Elser <elserj@apache.org>
A reattempt at fixing HBASE-20173 [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
The scenario is a SCP after processing WALs, goes to assign regions that
were on the crashed server but a concurrent Procedure gets in there
first and tries to unassign a region that was on the crashed server
(could be part of a move procedure or a disable table, etc.). The
unassign happens to run AFTER SCP has released all RPCs that
were going against the crashed server. The unassign fails because the
server is crashed. The unassign used to suspend itself only it would
never be woken up because the server it was going against had already
been processed. Worse, the SCP could not make progress because the
unassign was suspended with the lock on a region that it wanted to
assign held making it so it could make no progress.
In here, we add to the unassign recognition of the state where it is
running post SCP cleanup of RPCs. If present, unassign moves to finish
instead of suspending itself.
Includes a nice unit test made by Duo Zhang that reproduces nicely the
hung scenario.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/FailedRemoteDispatchException.java
Moved this class back to hbase-procedure where it belongs.
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/NoNodeDispatchException.java
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/NoServerDispatchException.java
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/NullTargetServerDispatchException.java
Specializiations on FRDE so we can be more particular when we say there
was a problem.
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/RemoteProcedureDispatcher.java
Change addOperationToNode so we throw exceptions that give more detail
on issue rather than a mysterious true/false
M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
Undo SERVER_CRASH_HANDLE_RIT2. Bad idea (from HBASE-20173)
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
Have expireServer return true if it actually queued an expiration. Used
later in this patch.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
Hide methods that shouldn't be public. Add a particular check used out
in unassign procedure failure processing.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
Check that server we're to move from is actually online (might
catch a few silly move requests early).
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStates.java
Add doc on ServerState. Wasn't being used really. Now we actually stamp
a Server OFFLINE after its WAL has been split. Means its safe to assign
since all WALs have been processed. Add methods to update SPLITTING
and to set it to OFFLINE after splitting done.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
Change logging to be new-style and less repetitive of info.
Cater to new way in which .addOperationToNode returns info (exceptions
rather than true/false).
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
Add looking for the case where we failed assign AND we should not
suspend because we will never be woken up because SCP is beyond
doing this for all stuck RPCs.
Some cleanup of the failure processing grouping where we can proceed.
TODOs have been handled in this refactor including the TODO that
wonders if it possible that there are concurrent fails coming in
(Yes).
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
Doc and removing the old HBASE-20173 'fix'.
Also updating ServerStateNode post WAL splitting so it gets marked
OFFLINE.
A hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestServerCrashProcedureStuck.java
Nice test by Duo Zhang.
Signed-off-by: Umesh Agashe <uagashe@cloudera.com>
Signed-off-by: Duo Zhang <palomino219@gmail.com>
Signed-off-by: Mike Drob <mdrob@apache.org>