From a80c7a7e95b1037b632e40699945895d84d7b100 Mon Sep 17 00:00:00 2001
From: mbertozzi
Date: Wed, 17 Apr 2013 15:33:21 +0000
Subject: [PATCH] HBASE-7410 add snapshot/clone/restore/export docs to ref guide

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1468972 13f79535-47bb-0310-9956-ffa450edef68
---
 src/main/docbkx/ops_mgt.xml | 100 ++++++++++++++++++++++++++++++++++++
 1 file changed, 100 insertions(+)

diff --git a/src/main/docbkx/ops_mgt.xml b/src/main/docbkx/ops_mgt.xml
index 245cd89e2e1..09c149c7d92 100644
--- a/src/main/docbkx/ops_mgt.xml
+++ b/src/main/docbkx/ops_mgt.xml
@@ -713,6 +713,106 @@
 false
+
+
+ HBase Snapshots
+ HBase snapshots let you capture the state of a table with minimal impact on the Region Servers.
+ Snapshot, clone, and restore operations do not copy data.
+ Exporting a snapshot to another cluster also has no impact on the Region Servers.
+
+ Prior to version 0.94.6, the only ways to back up or clone a table were to use CopyTable/ExportTable,
+ or to copy all the HFiles in HDFS after disabling the table.
+ These methods either degrade Region Server performance (CopyTable/ExportTable)
+ or require you to disable the table, which means no reads or writes;
+ this is usually unacceptable.
+
+
Configuration
+ To turn on snapshot support, set the
+ hbase.snapshot.enabled property to true in hbase-site.xml.
+ (Snapshots are enabled by default in 0.95+ and disabled by default in 0.94.6+.)
+
+ <property>
+   <name>hbase.snapshot.enabled</name>
+   <value>true</value>
+ </property>
+
+
+
+
Take a Snapshot
+ You can take a snapshot of a table regardless of whether it is enabled or disabled.
+ The snapshot operation does not copy any data.
+
+ $ ./bin/hbase shell
+ hbase> snapshot 'myTable', 'myTableSnapshot-122112'
+
+
+
+
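For scheduled backups, the snapshot command can also be issued non-interactively by piping it into the shell. The following is a minimal sketch and not part of the patch: `snapshot_command` is a hypothetical helper and the suffix-based naming scheme is an assumption. Only the `snapshot` shell command itself comes from the text above.

```shell
#!/bin/sh
# Build the HBase shell command that snapshots a table.
# snapshot_command is a hypothetical helper; the "<table>Snapshot-<suffix>"
# naming scheme is an assumption modeled on the example name in the text.
snapshot_command() {
  table="$1"
  suffix="$2"   # for example, the output of: date +%m%d%y
  printf "snapshot '%s', '%sSnapshot-%s'\n" "$table" "$table" "$suffix"
}

# Print the command. To run it, pipe the output into the shell:
#   snapshot_command myTable "$(date +%m%d%y)" | ./bin/hbase shell
snapshot_command myTable 122112
```

Printing the command first makes it easy to review what a cron job would send to the shell before it is actually executed.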
Listing Snapshots
+ List all the snapshots taken, printing each snapshot's name and related information.
+
+ $ ./bin/hbase shell
+ hbase> list_snapshots
+
+
+
+
Deleting Snapshots
+ When you delete a snapshot, the files retained for that snapshot are removed
+ if they are no longer needed.
+
+ $ ./bin/hbase shell
+ hbase> delete_snapshot 'myTableSnapshot-122112'
+
+
+
+
Clone a table from snapshot
+ From a snapshot you can create a new table (clone operation) with the same data
+ that the original table had when the snapshot was taken.
+ The clone operation does not copy data, and a change to the cloned table
+ does not affect the snapshot or the original table.
+
+ $ ./bin/hbase shell
+ hbase> clone_snapshot 'myTableSnapshot-122112', 'myNewTestTable'
+
+
+
+
Restore a snapshot
+ The restore operation requires the table to be disabled. The table is then
+ restored to its state at the time the snapshot was taken,
+ changing both data and schema if required.
+
+ $ ./bin/hbase shell
+ hbase> disable 'myTable'
+ hbase> restore_snapshot 'myTableSnapshot-122112'
+
+
+
+ Because replication works at the log level and snapshots work at the file-system level,
+ after a restore the replicas will be in a different state from the master.
+ If you want to use restore, you need to stop replication and redo the bootstrap.
+
+
+ In case of partial data loss due to a misbehaving client, instead of a full restore,
+ which requires the table to be disabled, you can clone the table from the snapshot
+ and use a MapReduce job to copy the data that you need from the clone to the main table.
+
+
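Because the table is disabled before a restore, it has to be enabled again before clients can use it. The sketch below only prints the shell commands in order. `restore_commands` is a hypothetical helper, and the final `enable` step is an addition to the two commands shown in the restore section, not something the patch itself documents.

```shell
#!/bin/sh
# Print the HBase shell commands for a full restore, one per line.
# restore_commands is a hypothetical helper; the trailing "enable" is an
# assumed follow-up step that brings the table back online.
restore_commands() {
  table="$1"
  snapshot="$2"
  printf "disable '%s'\n" "$table"
  printf "restore_snapshot '%s'\n" "$snapshot"
  printf "enable '%s'\n" "$table"
}

# To run the sequence, pipe the output into the shell:
#   restore_commands myTable myTableSnapshot-122112 | ./bin/hbase shell
restore_commands myTable myTableSnapshot-122112
```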
Snapshot operations and ACLs
+ If you are using security with the AccessController Coprocessor (See ),
+ only a global administrator can take, clone, or restore a snapshot, and these actions do not capture the ACL rights.
+ This means that restoring a table preserves the ACL rights of the existing table,
+ while cloning a table creates a new table that has no ACL rights until the administrator adds them.
+
Export to another cluster
+ The ExportSnapshot tool copies all the data related to a snapshot (HFiles, logs, snapshot metadata) to another cluster.
+ The tool executes a MapReduce job, similar to distcp, to copy files between the two clusters,
+ and since it works at the file-system level, the HBase cluster does not have to be online.
+ To copy a snapshot called MySnapshot to an HBase cluster srv2 (hdfs://srv2:8082/hbase) using 16 mappers:
+$ bin/hbase class org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot -snapshot MySnapshot -copy-to hdfs://srv2:8082/hbase -mappers 16
+
+
+
+
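When exports are scripted, it can help to assemble the ExportSnapshot command line in one place and check the mapper count before launching the job. This is a minimal sketch under stated assumptions: `export_snapshot_command` is a hypothetical helper, while the class name and the `-snapshot`, `-copy-to` and `-mappers` flags are taken from the command in the export section. It prints the command rather than running it.

```shell
#!/bin/sh
# Print the ExportSnapshot command line for a snapshot, destination and
# mapper count. export_snapshot_command is a hypothetical helper; the class
# name and flags mirror the example in the export section.
export_snapshot_command() {
  snapshot="$1"
  dest="$2"
  mappers="$3"
  # Reject an empty or non-numeric mapper count.
  case "$mappers" in
    ''|*[!0-9]*)
      echo "mappers must be a number" >&2
      return 1
      ;;
  esac
  printf 'bin/hbase class org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot -snapshot %s -copy-to %s -mappers %s\n' \
    "$snapshot" "$dest" "$mappers"
}

export_snapshot_command MySnapshot hdfs://srv2:8082/hbase 16
```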
+
Capacity Planning
Storage A common question for HBase administrators is estimating how much storage will be required for an HBase cluster.