* Add a bit of javadoc around SerialReplicationChecker.
* Minuscule edit to the profiler jsp page and then a bit of doc on how to make it work that might help.
* Add some detail if NPE getting BitSetNode to help w/ debug.
* Change HbckChore to log region names instead of encoded names; helps doing diagnostics; can take region name and query in shell to find out all about the region according to hbase:meta.
* Add some fix-it help inline in the HBCK Report page – how to fix.
* Add counts in procedures page so can see if making progress; move listing of WALs to end of the page.
Have the existing scheduleRecoveries launch a new HBCKSCP
instead of SCP. It gets regions to recover from Master
in-memory context AND from a scan of hbase:meta. This
new HBCKSCP is for processing 'Unknown Servers', servers that
are 'dead' and purged but still have references in
hbase:meta. Rare occurrence but needs tooling to address.
Later have catalogjanitor take care of these deviations
between Master in-memory and hbase:meta content (usually
because of overdriven cluster with failed RPCs to hbase:meta,
etc)
Changed expireServers in ServerManager so could pass in
custom reaction to expired server. This is how we
run our custom HBCKSCP while keeping all other aspects
of expiring services (rather than try to replicate it
externally).
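
A minimal sketch of the "custom reaction" idea, not the actual ServerManager code. The class, fields and method signatures below are hypothetical; the only point is that the expiry bookkeeping stays in one place while the caller chooses which procedure is scheduled.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical stand-in for ServerManager; every name here is illustrative.
final class ExpireServerSketch {
  private final Set<String> onlineServers = new HashSet<>();
  private final Set<String> deadServers = new HashSet<>();
  final List<String> scheduled = new ArrayList<>();

  ExpireServerSketch(Set<String> online) {
    onlineServers.addAll(online);
  }

  // Default path: the ordinary crash procedure is scheduled.
  boolean expireServer(String serverName) {
    return expireServer(serverName, s -> scheduled.add("SCP:" + s));
  }

  // Same bookkeeping, but the caller decides which procedure gets scheduled.
  boolean expireServer(String serverName, Consumer<String> reaction) {
    if (!deadServers.add(serverName)) {
      return false; // already being processed, nothing to do
    }
    onlineServers.remove(serverName);
    reaction.accept(serverName);
    return true;
  }

  public static void main(String[] args) {
    ExpireServerSketch manager = new ExpireServerSketch(Set.of("rs1", "rs2"));
    manager.expireServer("rs1");
    manager.expireServer("rs2", s -> manager.scheduled.add("HBCKSCP:" + s));
    System.out.println(manager.scheduled);
  }
}
```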
* Clean up a bunch of private variable leakage into other
classes. Reduces visibility as much as possible, providing getters
where access remains necessary or making use of getters that
already exist. There remains an insidious relationship between
`HRegionServer` and `RSRpcServices`.
* Rename `fs` to `dataFs`, `rootDir` as `dataRootDir` so as to
distinguish from the new `walFs`, `walRootDir` (and make it easier
to spot bugs).
* Cleanup or delete a bunch of lack-luster javadoc comments.
* Delete a handful of methods that are unused according to static
analysis.
* Reduces the warning count as reported by IntelliJ from 100 to 7.
Signed-off-by: stack <stack@apache.org>
Includes the following, incorporating HBASE-20439 and HBASE-20440, too.
1)
HBASE-18133 Decrease quota reaction latency by HBase
Certain operations in HBase are known to directly affect
the utilization of tables on HDFS. When these actions
occur, we can circumvent the normal path and notify the
Master directly. This results in a much faster response to
changes in HDFS usage.
This requires FS scanning by the RS to be decoupled from
the reporting of sizes to the Master. An API inside each
RS is made so that any operation can hook into this call
in the face of other operations (e.g. compaction, flush,
bulk load).
2)
HBASE-18135 Implement mechanism for RegionServers to report file archival for space quotas
This de-couples the snapshot size calculation from the
SpaceQuotaObserverChore into another API which both the periodically
invoked Master chore and the Master service endpoint can invoke. This
allows for multiple sources of snapshot size to be reported (from the
multiple sources we have in HBase).
When a file is archived, snapshot sizes can be more quickly realized and
the Master can still perform periodical computations of the total
snapshot size to account for any delayed/missing/lost file archival RPCs.
3)
HBASE-20531 RS may throw NPE when close meta regions in shutdown procedure.
Removes the closeRegion flag added by HBASE-23181 and instead
relies on reading meta WALEdit content. Modified how qualifier is
written when the meta WALEdit is for a RegionEventDescriptor
so the 'type' is added to the qualifier so can figure type
w/o having to deserialize protobuf value content: e.g.
HBASE::REGION_EVENT::REGION_CLOSE
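
A rough sketch of how a reader could pick the type off such a qualifier without touching the protobuf value. The prefix and separator are taken from the example qualifier above; the class and method names are made up for illustration and are not the WALEdit API.

```java
// Hypothetical helper; not the actual WALEdit code.
final class MetaQualifierSketch {
  // Assumed from the example qualifier shown in the text above.
  static final String REGION_EVENT_PREFIX = "HBASE::REGION_EVENT";
  static final String SEPARATOR = "::";

  /** Returns the event type suffix, or null if the qualifier carries none. */
  static String eventType(String qualifier) {
    String prefix = REGION_EVENT_PREFIX + SEPARATOR;
    if (!qualifier.startsWith(prefix) || qualifier.length() == prefix.length()) {
      return null;
    }
    return qualifier.substring(prefix.length());
  }

  /** True only when the qualifier names a region close event. */
  static boolean isRegionClose(String qualifier) {
    return "REGION_CLOSE".equals(eventType(qualifier));
  }

  public static void main(String[] args) {
    System.out.println(eventType("HBASE::REGION_EVENT::REGION_CLOSE"));
    System.out.println(isRegionClose("HBASE::REGION_EVENT"));
  }
}
```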
Added doc on WALEdit and tried to formalize the 'meta' WALEdit
type and how it works. Needs complete redo in part as suggested
by HBASE-8457. Meantime, some doc and cleanup.
Also changed the LogRoller constructor to remove redundant param.
Because of constructor change, need to change also
TestFailedAppendAndSync, TestWALLockup, TestAsyncFSWAL &
WALPerformanceEvaluation.java
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Lijin Bin <binlijin@apache.org>
Adds logging of row and complaint if consistency check fails during CJ
checking. Adds a few more null checks. Does edit on the 'HBCK Report'
top line.
Signed-off-by: Reid Chan <reidchan@apache.org>
* better logging on MOB compaction process
* HFileCleanerDelegate to optionally halt removal of mob hfiles
* use archiving when removing committed mob file after bulkload ref failure
Closes #763
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Signed-off-by: Balazs Meszaros <meszibalu@apache.org>
Introduces the property hbase.regionserver.user.metrics.enabled (default: true)
so user metrics can be disabled if they cause performance issues.
Closes #661
Signed-off-by: Josh Elser <elserj@apache.org>
This commit adds table name to the logging context when
StochasticLoadBalancer is configured "per table". Added some
test coverage with per-table balancer enabled and manually
verified the logs to make sure the table name is formatted
correctly.
Signed-off-by: Viraj Jasani <virajjasani007@gmail.com>
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.com>
(cherry picked from commit 06ff478674)
Make it so hbase:meta can be altered. TableState for hbase:meta
is kept in Master. State is in-memory transient so if Master
fails, hbase:meta is ENABLED again. hbase:meta schema will be
bootstrapped from the filesystem. Changes to filesystem schema
are atomic so we should be ok if Master fails mid-edit (TBD)
Undoes a bunch of guards that prevented our being able to edit
hbase:meta. At minimum, need to add in a bunch of WARNINGs.
TODO: Tests, more clarity around hbase:meta table state, and undoing
references to hard-coded hbase:meta regioninfo.
M hbase-client/src/main/java/org/apache/hadoop/hbase/MetaTableAccessor.java
Throw illegal access exception if you try to use MetaTableAccessor
getting state of the hbase:meta table.
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
For table state, go to master rather than go to meta direct. Going
to meta won't work for hbase:meta state. Puts load on Master.
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
Change isTableDisabled/Enabled implementation to ask the Master instead.
This will give the Master's TableStateManager's opinion rather than
client figuring it for themselves reading meta table direct.
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/RawAsyncHBaseAdmin.java
TODO: Cleanup in here. Go to master for state, not to meta.
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ZKAsyncRegistry.java
Logging cleanup.
M hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZNodePaths.java
Shut down access.
M hbase-server/src/main/java/org/apache/hadoop/hbase/TableDescriptors.java
Just cleanup.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/TableStateManager.java
Add state holder for hbase:meta.
Removed unused methods.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateStore.java
Shut down access.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DisableTableProcedure.java
Allow hbase:meta to be disabled.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/EnableTableProcedure.java
Allow hbase:meta to be enabled.
Signed-off-by: Ramkrishna <ramkrishna.s.vasudevan@intel.com>
Space quotas has a feature which intends to avoid enacting a space quota
violation policy when only a subset of the Regions for that Table have
reported their space usage (under the assumption that we cannot make an
informed decision if we do not include all regions in our calculations).
This had the unintended side-effect, when a table is disabled as a part
of a violation policy, of causing the regions for that table to not be
reported which disables the violation policy and enables the table.
Need to make sure that when a table is disabled because of a violation
policy that the code does not automatically move that table out of
violation because region sizes are not being reported (because those
regions are not open).
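
The intended decision can be sketched roughly as below. This is an illustration of the reasoning only; the method name, its parameters and the exact comparison against the limit are assumptions, not the quota code itself.

```java
// Hypothetical decision helper; names and thresholds are illustrative.
final class QuotaViolationSketch {
  /** Returns true if the table should be treated as violating its space quota. */
  static boolean inViolation(boolean disabledByViolationPolicy,
      int regionsReported, int regionsTotal, long usedBytes, long limitBytes) {
    if (disabledByViolationPolicy) {
      // Closed regions send no size reports, so missing reports are expected
      // here and must not move the table out of violation.
      return true;
    }
    if (regionsReported < regionsTotal) {
      // Only a subset of regions reported: not enough information to act.
      return false;
    }
    return usedBytes > limitBytes;
  }

  public static void main(String[] args) {
    // Table disabled by policy, zero regions reporting: stays in violation.
    System.out.println(inViolation(true, 0, 10, 0L, 100L));
  }
}
```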
Closes #572
Signed-off-by: Josh Elser <elserj@apache.org>
* Add chaos monkey action for suspend/resume region servers
* Add chaos monkey action for graceful rolling restart
* Add these to relevant chaos monkeys
Signed-off-by: Balazs Meszaros <meszibalu@apache.org>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
There was a bug in which we would not drop the RegionSizes
for a table in a namespace, where the namespace had a quota
on it. This allowed a scenario in which recreation of a table
inside of a namespace would unintentionally move into violation
despite the table being empty. Need to make sure the RegionSizes
are dropped on table deletion if there is _any_ quota applying
to that table.
Signed-off-by: Josh Elser <elserj@apache.org>
During startup, it's possible that quotas are enabled but the Master has
not yet created the hbase:quotas table.
Closes #559
Signed-off-by: stack <stack@apache.org>
Signed-off-by: Josh Elser <elserj@apache.org>
* HBASE-22941 merge operation returns parent regions in random order
store and return the merge parent regions in ascending order
remove left over check for exactly two merged regions
add unit test
* use SortedMap type to emphasise that the Map is sorted.
* use regionCount consistently and checkstyle fixes
* Delete tests that expect multiregion merges to fail.
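
For illustration, ordering parents by start key with a SortedMap might look like the sketch below. It assumes plain unsigned lexicographic ordering of the key bytes and uses made-up names; it is not the code from this change.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch; each parent is given as {startKey, regionName}.
final class MergeParentOrderSketch {
  static List<String> namesInKeyOrder(String[][] parents) {
    // Unsigned byte-wise comparison, so the map iterates in ascending key order.
    Comparator<byte[]> byKey = Arrays::compareUnsigned;
    SortedMap<byte[], String> byStartKey = new TreeMap<>(byKey);
    for (String[] parent : parents) {
      byStartKey.put(parent[0].getBytes(StandardCharsets.UTF_8), parent[1]);
    }
    return new ArrayList<>(byStartKey.values());
  }

  public static void main(String[] args) {
    String[][] parents = { {"m", "regionB"}, {"", "regionA"}, {"t", "regionC"} };
    System.out.println(namesInKeyOrder(parents));
  }
}
```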
Signed-off-by: stack <stack@apache.org>
Backport of above; only the usage message was changed
in the backport; nothing else. Usage points at refguide.
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
Signed-off-by: Guanghao Zhang <zghao@apache.org>
Check if overlap is split parent.
Cleaned up the HBCK Report page too with some notes that it is made of
two reports; have the two sections display the same.
Adds deprecations on HBaseFsck and on supporting classes such as
the reporting Interface. Provides alternatives in FSUtils for
progress reporting and deprecates methods that use hbck1 facility.
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
* HBASE-22922 Lock all regions to be merged in case of multi region merge
* HBASE-22922 Lock all regions to be merged in case of multi region merge
(addendum)
fix off-by-one error in patch
Signed-off-by: stack <stack@apache.org>
These functions make it possible to use
`org.apache.hadoop.hbase.client.Table.CheckAndMutateBuilder#timeRange`
with more interesting ranges, without being forced to use the
deprecated constructors.
Signed-off-by: huzheng <openinx@gmail.com>
* HBASE-22833: MultiRowRangeFilter should provide a method for creating a filter which is functionally equivalent to multiple prefix filters
* Delete superfluous comments
* Add description for MultiRowRangeFilter constructor
* Add null check for rowKeyPrefixes
* Fix checkstyle
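
The equivalence between a prefix filter and a row range rests on computing an exclusive stop row for each prefix. A minimal sketch of that computation follows; the class and method names are hypothetical and this is not the MultiRowRangeFilter code.

```java
import java.util.Arrays;

// Hypothetical helper: the range [prefix, stopRowForPrefix(prefix)) covers
// exactly the rows that start with prefix.
final class PrefixRangeSketch {
  /**
   * Smallest key that sorts after every key starting with prefix.
   * An empty result means "no upper bound".
   */
  static byte[] stopRowForPrefix(byte[] prefix) {
    int end = prefix.length;
    // Trailing 0xFF bytes cannot be incremented, so drop them.
    while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
      end--;
    }
    if (end == 0) {
      return new byte[0];
    }
    byte[] stop = Arrays.copyOf(prefix, end);
    stop[end - 1]++;
    return stop;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(stopRowForPrefix(new byte[] {'a', 'b'})));
  }
}
```

Several prefixes then become several such ranges, which a multi-range filter can evaluate together.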
Signed-off-by: huzheng <openinx@gmail.com>
Makes MergeTableRegionsProcedure do more than just two regions at a
time. Compatible as MTRP was done considering one day it'd do more than
two at a time.
Changes hardcoded assumption that merge parent regions are named
mergeA and mergeB in a column on the resultant region. Instead
can have N columns on the merged region, one for each parent
merged. Column qualifiers all begin with 'merge'.
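
As a sketch of the naming scheme: only the 'merge' prefix comes from the text above. The zero-padded index suffix and the helper names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; the suffix format is an assumption.
final class MergeQualifierSketch {
  static final String PREFIX = "merge";

  /** One qualifier per parent region, all sharing the same prefix. */
  static List<String> qualifiersFor(int parentCount) {
    List<String> qualifiers = new ArrayList<>();
    for (int i = 0; i < parentCount; i++) {
      qualifiers.add(PREFIX + String.format(Locale.ROOT, "%04d", i));
    }
    return qualifiers;
  }

  /** Readers match on the prefix rather than on fixed column names. */
  static boolean isMergeQualifier(String qualifier) {
    return qualifier.startsWith(PREFIX);
  }

  public static void main(String[] args) {
    System.out.println(qualifiersFor(3));
  }
}
```

A prefix match like this would also accept the older mergeA and mergeB names.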
Most of code below is undoing the assumption that there are two
parents on a merge only.
This is a first cut at this patch. Implements hole fixing only
currently.
Add a fixMeta method to Hbck Interface.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
Bug fix. If hole is on end of last table, I wasn't seeing it.
A hbase-server/src/main/java/org/apache/hadoop/hbase/master/MetaFixer.java
Add a general meta fixer class. Explains up top why this stuff doesn't
belong inside MetaTableAccessor.
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
Break out the filesystem messing so don't have to copy it nor do more
than is needed doing fixup for Region holes.
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionFileSystem.java
Change behavior slightly. If directory exists, don't fail as we did
but try and keep going and create .regioninfo file if missing (or
overwrite if in place). This should make it idempotent. Can rerun
command. Let's see if any repercussions in test suite.
A hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMetaFixer.java
Add test.
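
The idempotent directory handling described for HRegionFileSystem can be sketched against the local filesystem as follows. This uses java.nio rather than the Hadoop FileSystem API, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical local-filesystem analogue; safe to run repeatedly.
final class IdempotentRegionDirSketch {
  static Path createRegionDir(Path tableDir, String encodedName, String regionInfo)
      throws IOException {
    Path regionDir = tableDir.resolve(encodedName);
    // Does not fail when the directory is already there.
    Files.createDirectories(regionDir);
    // Creates the file if missing, overwrites it if in place.
    Files.write(regionDir.resolve(".regioninfo"),
        regionInfo.getBytes(StandardCharsets.UTF_8));
    return regionDir;
  }

  public static void main(String[] args) throws IOException {
    Path tableDir = Files.createTempDirectory("table");
    createRegionDir(tableDir, "abc123", "first");
    createRegionDir(tableDir, "abc123", "second"); // rerun does not fail
    System.out.println(Files.exists(tableDir.resolve("abc123").resolve(".regioninfo")));
  }
}
```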
Signed-off-by: Zheng Hu <openinx@gmail.com>
Signed-off-by: Guanghao Zhang <zghao@apache.org>
Refactor of CatalogJanitor so it generates a
Report on the state of hbase:meta when it runs. Also
refactor so CJ runs even if RIT (previously it would
punt on running if RIT) so it can generate a 'Report'
on the interval regardless. If RIT, it just doesn't
go on to do the merge/split GC as it used to.
If report finds an issue, dump as a WARN message
to the master log.
Follow-on is to make the Report actionable/available
for the Master to pull when it goes to draw the hbck
UI page (could also consider shipping the Report as
part of ClusterMetrics?)
Adds new, fatter Visitor to CJ, one that generates
Report on each run keeping around more findings as
it runs.
Moved some methods around so class reads better;
previous methods were randomly ordered in the class.
M hbase-client/src/main/java/org/apache/hadoop/hbase/MetaTableAccessor.java
Make a few handy methods public.
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/RegionInfo.java
Add utility as defaults on the Interface; i.e. is this the first region
in table, is it last, does a passed region come next, or does passed
region overlap this region (added tests for this new stuff).
M hbase-common/src/main/java/org/apache/hadoop/hbase/util/Bytes.java
Bugfix... handle case where buffer passed is null.
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
Lots of change, reorg., but mostly adding consistency checking
to the visitor used scanning hbase:meta on a period and the
generation of a Report on what the scan has found traversing
hbase:meta. Added a main so could try the CatalogJanitor against
a running cluster.
A hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitorCluster.java
Fat ugly test for CatalogJanitor consistency checking.
M hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java
Add tests for new functionality in RI.
M hbase-shell/src/main/ruby/hbase/table.rb
Bug fix for case where meta has a null regioninfo; scan was aborting.
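
The region checks listed for RegionInfo above (first, last, next, overlap) can be sketched over plain key ranges. This assumes both ranges belong to the same table, that an empty key means "unbounded", and that ranges are half-open [start, end). The class and method names are illustrative and do not claim to match the RegionInfo signatures.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a region's key range within one table.
final class KeyRangeSketch {
  final byte[] start; // empty means this is the first region
  final byte[] end;   // empty means this is the last region

  KeyRangeSketch(String start, String end) {
    this.start = start.getBytes(StandardCharsets.UTF_8);
    this.end = end.getBytes(StandardCharsets.UTF_8);
  }

  boolean isFirst() {
    return start.length == 0;
  }

  boolean isLast() {
    return end.length == 0;
  }

  /** True if 'after' starts exactly where this range ends. */
  boolean isNext(KeyRangeSketch after) {
    return !isLast() && Arrays.equals(end, after.start);
  }

  /** True if the two half-open ranges share at least one key. */
  boolean isOverlap(KeyRangeSketch other) {
    boolean startsBeforeOtherEnds =
        other.isLast() || Arrays.compareUnsigned(start, other.end) < 0;
    boolean otherStartsBeforeEnd =
        isLast() || Arrays.compareUnsigned(other.start, end) < 0;
    return startsBeforeOtherEnds && otherStartsBeforeEnd;
  }

  public static void main(String[] args) {
    KeyRangeSketch a = new KeyRangeSketch("", "m");
    KeyRangeSketch b = new KeyRangeSketch("m", "");
    System.out.println(a.isNext(b) + " " + a.isOverlap(b));
  }
}
```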
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
* Deprecated old attribute and introduced a new one
* Removed unnecessary import
* Added two import configs and removed one
Signed-off-by: Reid Chan <reidchan@apache.org>