diff --git a/src/docbkx/book.xml b/src/docbkx/book.xml
index 3c1216961c1..7f077d29b5a 100644
--- a/src/docbkx/book.xml
+++ b/src/docbkx/book.xml
@@ -271,6 +271,12 @@ for(Result result : htable.getScanner(scan)) {
HTable.delete.
+ HBase does not modify data in place, and so deletes are handled by creating new markers called tombstones.
+ These tombstones, along with the dead values, are cleaned up on major compactions.
+
+ See for more information on deleting versions of columns.
+
+
@@ -428,28 +434,20 @@ htable.put(put);
-
+ Delete
- When performing a delete operation in HBase, there are two
- ways to specify the versions to be deleted
-
-
-
- Delete all versions older than a certain timestamp
+ There are three different types of internal delete markers:
+
+ Delete: for a specific version of a column.
-
-
- Delete the version at a specific timestamp
+ Delete column: for all versions of a column.
+
+ Delete family: for all columns of a particular ColumnFamily.
-
- A delete can apply to a complete row, a complete column
- family, or to just one column. It is only in the last case that you
- can delete explicit versions. For the deletion of a row or all the
- columns within a family, it always works by deleting all cells older
- than a certain version.
-
+ When deleting an entire row, HBase will internally create a tombstone for each ColumnFamily (i.e., not each individual column).
+ Deletes work by creating tombstone
markers. For example, let's suppose we want to delete a row. For
this you can specify a version, or else by default the
@@ -466,8 +464,10 @@ htable.put(put);
. If the version you specified when deleting a row is
larger than the version of any value in the row, then you can
consider the complete row to be deleted.
+ Also see for more information on the internal KeyValue format.
+
-
+
Current Limitations
@@ -1113,6 +1113,20 @@ if (!b) {
}
+
+ HBase MapReduce Summary Without Reducer
+ It is also possible to perform summaries without a reducer - if you use HBase as the reducer.
+
+ There would need to exist an HTable target table for the job summary. The HTable method incrementColumnValue
+ would be used to atomically increment values. From a performance perspective, it might make sense to keep a Map
+ of keys with their values to be incremented for each map-task, and make one update per key during the
+ cleanup method of the mapper. However, your mileage may vary depending on the number of rows to be processed and
+ unique keys.
+
+ In the end, the summary results are in HBase.
+
+
+
Accessing Other HBase Tables in a MapReduce Job
diff --git a/src/docbkx/ops_mgt.xml b/src/docbkx/ops_mgt.xml
index fa534842564..d86d2f6fa6e 100644
--- a/src/docbkx/ops_mgt.xml
+++ b/src/docbkx/ops_mgt.xml
@@ -132,6 +132,30 @@
+
+
+ Region Management
+
+ Major Compaction
+ Major compactions can be requested via the HBase shell or HBaseAdmin.majorCompact.
+
+ Note: major compactions do NOT do region merges. See for more information about compactions.
+
+
+
+
+ Merge
+ Merge is a utility that can merge adjoining regions in the same table (see org.apache.hadoop.hbase.util.Merge).
+$ bin/hbase org.apache.hadoop.hbase.util.Merge <tablename> <region1> <region2>
+
+ If you feel you have too many regions and want to consolidate them, Merge is the utility you need. Merge must
+ be run when the cluster is down.
+ See the O'Reilly HBase Book for
+ an example of usage.
+
+
+
+ Node Management: Node Decommission
@@ -340,7 +364,6 @@ false
See Cluster Replication.
-
HBase BackupThere are two broad strategies for performing HBase backups: backing up with a full cluster shutdown, and backing up on a live cluster.