For RIT Duration, do better than print ms/seconds. Remove redundant UI
column dedicated to duration when we log it in the status field too.
Make bypass log at INFO level.
Make it so on complete of subprocedure, we note count of outstanding
siblings so we have a clue how much further the parent has to go before
it is done (Helpful when hundreds of servers doing SCP).
Have the SCP run the AP preflight check before creating an AP; saves
creation of thousands of APs during fixup.
Don't log tablename three times when reporting remote call failed.
If lock is held already, note who has it. Also log after we get lock
or if we have to wait rather than log on entrance though we may
later have to wait (or we may have just picked up the lock).
Signed-off-by: Mike Drob <mdrob@apache.org>
Adds override to assigns and unassigns. Changes bypass 'force'
to align calling the param 'override' instead.
Adds recursive to 'bypass', a means of calling bypass on
parent and its subprocedures (usually bypass works on
leaf nodes rippling the bypass up to parent -- recursive
has us work in the opposite direction): EXPERIMENTAL.
bypass on an assign/unassign leaves region in RIT and the
RegionStateNode loaded with the bypassed procedure. First
implementation had assign/unassign cleanup leftover state.
Second implementation, on feedback, keeps the state in place
as a fence against other Procedures assuming the region entity,
and instead adds an 'override' function that hbck2 can set on
assigns/unassigns to override the fencing.
Note that the below also converts ProcedureExceptions that
come out of the Pv2 system into DoNotRetryIOEs. It is a
little awkward because DNRIOE is in client-module, not
in procedure module. Previous, we'd just keep retrying
the bypass, etc.
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
Have bypass take an environment like all other methods so subclasses.
Fix javadoc issues.
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
Javadoc issues. Pass environment when we invoke bypass.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Rename waitUntilNamespace... etc. to align with how these method types
are named elsehwere .. i.e. waitFor rather than waitUntil..
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
Cleanup message we emit when we find an exisitng procedure working
against this entity.
Add support for a force function which allows Assigns/Unassigns force
ownership of the Region entity.
A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestRegionBypass.java
Test bypass and force.
M hbase-shell/src/main/ruby/shell/commands/list_procedures.rb
Minor cleanup of the json output... do iso8601 timestamps.
This reverts commit b96905d1df.
i.e. a revert of a revert so a reapplication!
Revert so I can add signed-off-by....
Signed-off-by: Allan Yang <allan163@apache.org>
bypass on an assign/unassign leaves region in RIT and the
RegionStateNode loaded with the bypassed procedure. First
implementation had assign/unassign cleanup leftover state.
Second implementation, on feedback, keeps the state in place
as a fence against other Procedures assuming the region entity,
and instead adds an 'override' function that hbck2 can set on
assigns/unassigns to override the fencing.
Note that the below also converts ProcedureExceptions that
come out of the Pv2 system into DoNotRetryIOEs. It is a
little awkward because DNRIOE is in client-module, not
in procedure module. Previous, we'd just keep retrying
the bypass, etc.
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
Have bypass take an environment like all other methods so subclasses.
Fix javadoc issues.
M hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
Javadoc issues. Pass environment when we invoke bypass.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Rename waitUntilNamespace... etc. to align with how these method types
are named elsehwere .. i.e. waitFor rather than waitUntil..
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
Cleanup message we emit when we find an exisitng procedure working
against this entity.
Add support for a force function which allows Assigns/Unassigns force
ownership of the Region entity.
A hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestRegionBypass.java
Test bypass and force.
M hbase-shell/src/main/ruby/shell/commands/list_procedures.rb
Minor cleanup of the json output... do iso8601 timestamps.
Remove unused methods from Sleeper (its ok, its @Private).
Remove notion of startTime from Sleeper handling (it is is unused).
Allow passing in how long to sleep so can maintain externally.
In HRS, use a RetryCounter to calculate backoff sleep time for when
reportForDuty is failing against a struggling Master.
Add a check for hbase:meta being online before we go to read it.
If not online, move into a holding-pattern until rectified, probably
by external operator.
Incorporates bulk of patch made by Allan Yang over on HBASE-21035.
M hbase-common/src/main/java/org/apache/hadoop/hbase/util/RetryCounterFactory.java
Add a Constructor for case where retries are for ever.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Move stuff around so that the first hbase:meta read is the AM#loadMeta.
Previously, checking table state and/or favored nodes could end up
trying to read a meta that was not onlined holding up master startup.
Do similar for the namespace table. Adds new methods isMeta and
isNamespace which check that the regions/tables are online.. if not,
we wait logging with a back-off that assigns need to be run.
Signed-off-by: Allan Yang <allan163@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
HBASE-21113 Apply the branch-2 version of HBASE-21095, The timeout retry
logic for several procedures are broken after master restarts
I applied the patch HBASE-21095 and then reverted it so could apply the
patch as HBASE-21113 (by reverting the HBASE-21095 revert but pushing
with this message!).
This reverts commit 4978db8102.
Write the hbase-1.x hbck1 lock file to block out hbck1 instances writing
state to an hbase-2.x cluster (could do damage).
Set hbase.write.hbck1.lock.file to false to disable this writing.
Look for the particular case where RS does the close of region w/o
involving Master and log special message in this case. Dodgy. But
until we have Master run shutdown of all regions, better than
the message we currently show.
With this change if hbase.wal.meta_provider is not explicitly set,
it uses whatever set with hbase.wal.provider. this change avoids a use
case of unexpectedly using two different providers when only
hbase.wal.provider is set to non-default but not hbase.wal.meta_provider.
This change also include document (architecture.adoc) update
Also, this is a port from master to branch-2
Signed-off-by: Zach York <zyork@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Sean Busbey <busbey@apache.org>
Signed-off-by: Duo Zhang <Apache9@apache.org>
This patch switches to the builder pattern by adding a helper method.
It also checks to ensure that the pattern is available (i.e. that
HBase is running on a hadoop version that supports it).
Amending-Author: Mike Drob <mdrob@apache.org>
Signed-off-by: tedyu <yuzhihong@gmail.com>
Signed-off-by: zhangduo <zhangduo@apache.org>
Make the #copyCellInto method smaller so it inlines; we do it by
checking for the common type early and then taking a code path
that presumes ByteBufferExtendedCell -- avoids checks.
- -jar parameter now accepts multiple jar files and directories of jar files.
- observer classes can be verified by -class option.
- -table parameter was added to check table level coprocessors.
- -config parameter was added to obtain the coprocessor classes from
HBase cofiguration.
- -scan option was removed.
Signed-off-by: Mike Drob <mdrob@apache.org>
With prefetch-on-open enabled, the task doing the prefetching was using
non-positional (i.e. streaming) reads. If the main (non-prefetch) thread
was also using non-positional reads, these two would conflict, because
inputstreams are not thread-safe for non-positional reads.
In the case of an encrypted filesystem, this could cause JVM crashes,
etc, as underlying cipher buffers were freed underneath the racing
threads. In the case of a non-encrypted filesystem, less severe errors
would be thrown. The included unit test reproduces the latter case.
(cherry picked from commit 025ddce868)
Signed-off-by: Todd Lipcon <todd@cloudera.com>
Also changes modify table operations to help the case where a MTP spans
two master, avoiding the sanity-checks propagating back to the client
unnecessarily.
Signed-off-by: Josh Elser <elserj@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
ModifyTableProcedure is using MoveRegionProcedure in a way
that was unintended from the original implementation. As such,
we have to guard against certain usages of it. We know we can
re-open OPEN regions, but regions in OPENING will similarly
soon be OPEN (thus, we want to reopen those regions too).
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: zhangduo <zhangduo@apache.org>