HBASE-21383 Change refguide to point at hbck2 instead of hbck1

Removes the hbck1-in-depth index. Adds small section on hbck2 and
has all hbck talk defer to it.

Signed-off-by: Duo Zhang <zhangduo@apache.org>
This commit is contained in:
Michael Stack 2018-10-24 16:08:12 -07:00
parent cd943419b6
commit 3a7412d0af
No known key found for this signature in database
GPG Key ID: 9816C7FC8ACC93D2
4 changed files with 32 additions and 232 deletions

View File

@ -1,212 +0,0 @@
////
/**
*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
////
[appendix]
[[hbck.in.depth]]
== hbck In Depth
:doctype: book
:numbered:
:toc: left
:icons: font
:experimental:
HBaseFsck (hbck) is a tool for checking for region consistency and table integrity problems and repairing a corrupted HBase.
It works in two basic modes -- a read-only inconsistency identifying mode and a multi-phase read-write repair mode.
=== Running hbck to identify inconsistencies
To check to see if your HBase cluster has corruptions, run hbck against your HBase cluster:
[source,bourne]
----
$ ./bin/hbase hbck
----
At the end of the commands output it prints OK or tells you the number of INCONSISTENCIES present.
You may also want to run hbck a few times because some inconsistencies can be transient (e.g.
cluster is starting up or a region is splitting). Operationally you may want to run hbck regularly and setup alert (e.g.
via nagios) if it repeatedly reports inconsistencies . A run of hbck will report a list of inconsistencies along with a brief description of the regions and tables affected.
The using the `-details` option will report more details including a representative listing of all the splits present in all the tables.
[source,bourne]
----
$ ./bin/hbase hbck -details
----
If you just want to know if some tables are corrupted, you can limit hbck to identify inconsistencies in only specific tables.
For example the following command would only attempt to check table TableFoo and TableBar.
The benefit is that hbck will run in less time.
[source,bourne]
----
$ ./bin/hbase hbck TableFoo TableBar
----
=== Inconsistencies
If after several runs, inconsistencies continue to be reported, you may have encountered a corruption.
These should be rare, but in the event they occur newer versions of HBase include the hbck tool enabled with automatic repair options.
There are two invariants that when violated create inconsistencies in HBase:
* HBase's region consistency invariant is satisfied if every region is assigned and deployed on exactly one region server, and all places where this state kept is in accordance.
* HBase's table integrity invariant is satisfied if for each table, every possible row key resolves to exactly one region.
Repairs generally work in three phases -- a read-only information gathering phase that identifies inconsistencies, a table integrity repair phase that restores the table integrity invariant, and then finally a region consistency repair phase that restores the region consistency invariant.
Starting from version 0.90.0, hbck could detect region consistency problems report on a subset of possible table integrity problems.
It also included the ability to automatically fix the most common inconsistency, region assignment and deployment consistency problems.
This repair could be done by using the `-fix` command line option.
These problems close regions if they are open on the wrong server or on multiple region servers and also assigns regions to region servers if they are not open.
Starting from HBase versions 0.90.7, 0.92.2 and 0.94.0, several new command line options are introduced to aid repairing a corrupted HBase.
This hbck sometimes goes by the nickname ``uberhbck''. Each particular version of uber hbck is compatible with the HBase's of the same major version (0.90.7 uberhbck can repair a 0.90.4). However, versions <=0.90.6 and versions <=0.92.1 may require restarting the master or failing over to a backup master.
=== Localized repairs
When repairing a corrupted HBase, it is best to repair the lowest risk inconsistencies first.
These are generally region consistency repairs -- localized single region repairs, that only modify in-memory data, ephemeral zookeeper data, or patch holes in the META table.
Region consistency requires that the HBase instance has the state of the region's data in HDFS (.regioninfo files), the region's row in the hbase:meta table., and region's deployment/assignments on region servers and the master in accordance.
Options for repairing region consistency include:
* `-fixAssignments` (equivalent to the 0.90 `-fix` option) repairs unassigned, incorrectly assigned or multiply assigned regions.
* `-fixMeta` which removes meta rows when corresponding regions are not present in HDFS and adds new meta rows if they regions are present in HDFS while not in META. To fix deployment and assignment problems you can run this command:
[source,bourne]
----
$ ./bin/hbase hbck -fixAssignments
----
To fix deployment and assignment problems as well as repairing incorrect meta rows you can run this command:
[source,bourne]
----
$ ./bin/hbase hbck -fixAssignments -fixMeta
----
There are a few classes of table integrity problems that are low risk repairs.
The first two are degenerate (startkey == endkey) regions and backwards regions (startkey > endkey). These are automatically handled by sidelining the data to a temporary directory (/hbck/xxxx). The third low-risk class is hdfs region holes.
This can be repaired by using the:
* `-fixHdfsHoles` option for fabricating new empty regions on the file system.
If holes are detected you can use -fixHdfsHoles and should include -fixMeta and -fixAssignments to make the new region consistent.
[source,bourne]
----
$ ./bin/hbase hbck -fixAssignments -fixMeta -fixHdfsHoles
----
Since this is a common operation, we've added a the `-repairHoles` flag that is equivalent to the previous command:
[source,bourne]
----
$ ./bin/hbase hbck -repairHoles
----
If inconsistencies still remain after these steps, you most likely have table integrity problems related to orphaned or overlapping regions.
=== Region Overlap Repairs
Table integrity problems can require repairs that deal with overlaps.
This is a riskier operation because it requires modifications to the file system, requires some decision making, and may require some manual steps.
For these repairs it is best to analyze the output of a `hbck -details` run so that you isolate repairs attempts only upon problems the checks identify.
Because this is riskier, there are safeguard that should be used to limit the scope of the repairs.
WARNING: This is a relatively new and have only been tested on online but idle HBase instances (no reads/writes). Use at your own risk in an active production environment! The options for repairing table integrity violations include:
* `-fixHdfsOrphans` option for ``adopting'' a region directory that is missing a region metadata file (the .regioninfo file).
* `-fixHdfsOverlaps` ability for fixing overlapping regions
When repairing overlapping regions, a region's data can be modified on the file system in two ways: 1) by merging regions into a larger region or 2) by sidelining regions by moving data to ``sideline'' directory where data could be restored later.
Merging a large number of regions is technically correct but could result in an extremely large region that requires series of costly compactions and splitting operations.
In these cases, it is probably better to sideline the regions that overlap with the most other regions (likely the largest ranges) so that merges can happen on a more reasonable scale.
Since these sidelined regions are already laid out in HBase's native directory and HFile format, they can be restored by using HBase's bulk load mechanism.
The default safeguard thresholds are conservative.
These options let you override the default thresholds and to enable the large region sidelining feature.
* `-maxMerge <n>` maximum number of overlapping regions to merge
* `-sidelineBigOverlaps` if more than maxMerge regions are overlapping, sideline attempt to sideline the regions overlapping with the most other regions.
* `-maxOverlapsToSideline <n>` if sidelining large overlapping regions, sideline at most n regions.
Since often times you would just want to get the tables repaired, you can use this option to turn on all repair options:
* `-repair` includes all the region consistency options and only the hole repairing table integrity options.
Finally, there are safeguards to limit repairs to only specific tables.
For example the following command would only attempt to check and repair table TableFoo and TableBar.
----
$ ./bin/hbase hbck -repair TableFoo TableBar
----
==== Special cases: Meta is not properly assigned
There are a few special cases that hbck can handle as well.
Sometimes the meta table's only region is inconsistently assigned or deployed.
In this case there is a special `-fixMetaOnly` option that can try to fix meta assignments.
----
$ ./bin/hbase hbck -fixMetaOnly -fixAssignments
----
==== Special cases: HBase version file is missing
HBase's data on the file system requires a version file in order to start.
If this file is missing, you can use the `-fixVersionFile` option to fabricating a new HBase version file.
This assumes that the version of hbck you are running is the appropriate version for the HBase cluster.
==== Special case: Root and META are corrupt.
The most drastic corruption scenario is the case where the ROOT or META is corrupted and HBase will not start.
In this case you can use the OfflineMetaRepair tool create new ROOT and META regions and tables.
This tool assumes that HBase is offline.
It then marches through the existing HBase home directory, loads as much information from region metadata files (.regioninfo files) as possible from the file system.
If the region metadata has proper table integrity, it sidelines the original root and meta table directories, and builds new ones with pointers to the region directories and their data.
----
$ ./bin/hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
----
NOTE: This tool is not as clever as uberhbck but can be used to bootstrap repairs that uberhbck can complete.
If the tool succeeds you should be able to start hbase and run online repairs if necessary.
==== Special cases: Offline split parent
Once a region is split, the offline parent will be cleaned up automatically.
Sometimes, daughter regions are split again before their parents are cleaned up.
HBase can clean up parents in the right order.
However, there could be some lingering offline split parents sometimes.
They are in META, in HDFS, and not deployed.
But HBase can't clean them up.
In this case, you can use the `-fixSplitParents` option to reset them in META to be online and not split.
Therefore, hbck can merge them with other regions if fixing overlapping regions option is used.
This option should not normally be used, and it is not in `-fixAll`.
:numbered:

View File

@ -51,7 +51,8 @@ Options:
Commands:
Some commands take arguments. Pass no args or -h for usage.
shell Run the HBase shell
hbck Run the hbase 'fsck' tool
hbck Run the HBase 'fsck' tool. Defaults read-only hbck1.
Pass '-j /path/to/HBCK2.jar' to run hbase-2.x HBCK2.
snapshot Tool for managing snapshots
wal Write-ahead-log analyzer
hfile Store file analyzer
@ -386,12 +387,33 @@ Each command except `RowCounter` and `CellCounter` accept a single `--help` argu
[[hbck]]
=== HBase `hbck`
To run `hbck` against your HBase cluster run `$./bin/hbase hbck`. At the end of the command's output it prints `OK` or `INCONSISTENCY`.
If your cluster reports inconsistencies, pass `-details` to see more detail emitted.
If inconsistencies, run `hbck` a few times because the inconsistency may be transient (e.g. cluster is starting up or a region is splitting).
Passing `-fix` may correct the inconsistency (This is an experimental feature).
The `hbck` tool that shipped with hbase-1.x has been made read-only in hbase-2.x. It is not able to repair
hbase-2.x clusters as hbase internals have changed. Nor should its assessments in read-only mode be
trusted as it does not understand hbase-2.x operation.
For more information, see <<hbck.in.depth>>.
A new tool, <<HBCK2>>, described in the next section, replaces `hbck`.
[[HBCK2]]
=== HBase `HBCK2`
`HBCK2` is the successor to <<hbck>>, the hbase-1.x fix tool (A.K.A `hbck1`). Use it in place of `hbck1`
making repairs against hbase-2.x installs.
`HBCK2` does not ship as part of hbase. It can be found as a subproject of the companion
link:https://github.com/apache/hbase-operator-tools[hbase-operator-tools] repository at
link:https://github.com/apache/hbase-operator-tools/tree/master/hbase-hbck2[Apache HBase HBCK2 Tool].
`HBCK2` was moved out of hbase so it could evolve at a cadence apart from that of hbase core.
See the [https://github.com/apache/hbase-operator-tools/tree/master/hbase-hbck2](HBCK2) Home Page
for how `HBCK2` differs from `hbck1`, and for how to build and use it.
Once built, you can run `HBCK2` as follows:
```
$ hbase hbck -j /path/to/HBCK2.jar
```
This will generate `HBCK2` usage describing commands and options.
[[hfile_tool2]]
=== HFile Tool
@ -1272,15 +1294,6 @@ Monitor the output of the _/tmp/log.txt_ file to follow the progress of the scri
Use the following guidelines if you want to create your own rolling restart script.
. Extract the new release, verify its configuration, and synchronize it to all nodes of your cluster using `rsync`, `scp`, or another secure synchronization mechanism.
. Use the hbck utility to ensure that the cluster is consistent.
+
----
$ ./bin/hbck
----
+
Perform repairs if required.
See <<hbck,hbck>> for details.
. Restart the master first.
You may need to modify these commands if your new HBase directory is different from the old one, such as for an upgrade.
@ -1312,7 +1325,6 @@ $ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --
----
. Restart the Master again, to clear out the dead servers list and re-enable the load balancer.
. Run the `hbck` utility again, to be sure the cluster is consistent.
[[adding.new.node]]
=== Adding a New Node
@ -2704,8 +2716,6 @@ HDFS replication factor only affects your disk usage and is invisible to most HB
You can view the current number of regions for a given table using the HMaster UI.
In the [label]#Tables# section, the number of online regions for each table is listed in the [label]#Online Regions# column.
This total only includes the in-memory state and does not include disabled or offline regions.
If you do not want to use the HMaster UI, you can determine the number of regions by counting the number of subdirectories of the /hbase/<table>/ subdirectories in HDFS, or by running the `bin/hbase hbck` command.
Each of these methods may return a slightly different number, depending on the status of each region.
[[ops.capacity.regions.count]]
==== Number of regions per RS - upper bound

View File

@ -331,7 +331,10 @@ As noted in the section <<basic.prerequisites>>, HBase 2.0+ requires a minimum o
.HBCK must match HBase server version
You *must not* use an HBase 1.x version of HBCK against an HBase 2.0+ cluster. HBCK is strongly tied to the HBase server version. Using the HBCK tool from an earlier release against an HBase 2.0+ cluster will destructively alter said cluster in unrecoverable ways.
As of HBase 2.0, HBCK is a read-only tool that can report the status of some non-public system internals. You should not rely on the format nor content of these internals to remain consistent across HBase releases.
As of HBase 2.0, HBCK (A.K.A _HBCK1_ or _hbck1_) is a read-only tool that can report the status of some non-public system internals. You should not rely on the format nor content of these internals to remain consistent across HBase releases.
To read about HBCK's replacement, see <<HBCK2>> in <<ops_mgt>>.
////
Link to a ref guide section on HBCK in 2.0 that explains use and calls out the inability of clients and server sides to detect version of each other.

View File

@ -86,7 +86,6 @@ include::_chapters/community.adoc[]
include::_chapters/appendix_contributing_to_documentation.adoc[]
include::_chapters/faq.adoc[]
include::_chapters/hbck_in_depth.adoc[]
include::_chapters/appendix_acl_matrix.adoc[]
include::_chapters/compression.adoc[]
include::_chapters/sql.adoc[]