HBASE-15612: Minor improvements to CellCounter and RowCounter documentation

Signed-off-by: stack <stack@apache.org>
Esteban Gutierrez 2016-04-07 10:04:07 -07:00 committed by stack
parent 2eced6f039
commit a6e29676db
2 changed files with 11 additions and 4 deletions


@@ -64,10 +64,11 @@ import com.google.common.base.Preconditions;
  * 6. Total number of versions of each qualifier.
  * </pre>
  *
- * The cellcounter takes two optional parameters one to use a user
+ * The cellcounter can take optional parameters to use a user
  * supplied row/family/qualifier string to use in the report and
  * second a regex based or prefix based row filter to restrict the
- * count operation to a limited subset of rows from the table.
+ * count operation to a limited subset of rows from the table or a
+ * start time and/or end time to limit the count to a time range.
  */
 @InterfaceAudience.Public
 @InterfaceStability.Stable


@@ -304,12 +304,15 @@ The following utilities are available:
 `RowCounter`::
 Count rows in an HBase table.
+`CellCounter`::
+Count cells in an HBase table.
 `replication.VerifyReplication`::
 Compare the data from tables in two different clusters.
 WARNING: It doesn't work for incrementColumnValues'd cells since the timestamp is changed.
 Note that this command is in a different package than the others.
-Each command except `RowCounter` accepts a single `--help` argument to print usage instructions.
+Each command except `RowCounter` and `CellCounter` accept a single `--help` argument to print usage instructions.
 [[hbck]]
 === HBase `hbck`
@@ -619,7 +622,8 @@ To NOT run WALPlayer as a mapreduce job on your cluster, force it to run all in
 link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html[RowCounter] is a mapreduce job to count all the rows of a table.
 This is a good utility to use as a sanity check to ensure that HBase can read all the blocks of a table if there are any concerns of metadata inconsistency.
-It will run the mapreduce all in a single process but it will run faster if you have a MapReduce cluster in place for it to exploit.
+It will run the mapreduce all in a single process but it will run faster if you have a MapReduce cluster in place for it to exploit. It is also possible to limit
+the time range of data to be scanned by using the `--starttime=[starttime]` and `--endtime=[endtime]` flags.
 ----
 $ bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter <tablename> [<column1> <column2>...]
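
As a rough illustration of the time-range flags documented in the hunk above, the following sketch builds a RowCounter command limited to the last 24 hours. The table name `my_table` is hypothetical, and the sketch assumes the flags take epoch timestamps in milliseconds (the default unit of HBase cell timestamps). Check the tool's usage output before relying on it.

```shell
#!/bin/sh
# Hypothetical table name, used only for illustration.
TABLE="my_table"

# Assumption: --starttime/--endtime take epoch timestamps in milliseconds.
END_MS=$(( $(date +%s) * 1000 ))
START_MS=$(( END_MS - 24 * 60 * 60 * 1000 ))

# Build the command and print it for inspection rather than running it.
CMD="bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter $TABLE --starttime=$START_MS --endtime=$END_MS"
echo "$CMD"
```

Printing the command first makes it easy to check the computed range before launching the job.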
@@ -642,6 +646,8 @@ The statistics gathered by RowCounter are more fine-grained and include:
 The program allows you to limit the scope of the run.
 Provide a row regex or prefix to limit the rows to analyze.
+Specify a time range to scan the table by using the `--starttime=[starttime]` and `--endtime=[endtime]` flags.
 Use `hbase.mapreduce.scan.column.family` to specify scanning a single column family.
 ----