diff --git a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt index 52e635d6b7e..edbbf199954 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt +++ b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt @@ -258,3 +258,5 @@ Branch-2802 Snapshot (Unreleased) HDFS-4717. Change the path parameter type of the snapshot methods in HdfsAdmin from String to Path. (szetszwo) + + HDFS-4708. Add snapshot user documentation. (szetszwo) diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml b/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml new file mode 100644 index 00000000000..3809143da0b --- /dev/null +++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml @@ -0,0 +1,262 @@ + + + + + + HDFS Snapshots + + + + +

HDFS Snapshots

+ + + + + + +
+

+ HDFS Snapshots are read-only point-in-time copies of the file system. + Snapshots can be taken on a subtree of the file system or the entire file system. + Some common use cases of snapshots are data backup, protection against user errors + and disaster recovery. +

+ +

+ The implementation of HDFS Snapshots is efficient: +

+ + + +

+ Snapshots can be taken on any directory once the directory has been set as + snapshottable. + A snapshottable directory can hold up to 65,536 simultaneous snapshots. + There is no limit on the number of snapshottable directories. + Administrators may set any directory to be snapshottable. + If a snapshottable directory contains snapshots, + the directory can be neither deleted nor renamed + until all the snapshots are deleted. +
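The rules in the paragraph above can be summarized with a small in-memory model. This is only a sketch for illustration, not HDFS code: the class `SnapshottableDirModel` and all of its methods are hypothetical names, and rejecting duplicate snapshot names is an assumption that the text above does not state.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy in-memory model of one directory; not HDFS code, all names are hypothetical.
final class SnapshottableDirModel {
    // Limit on simultaneous snapshots per snapshottable directory, as stated above.
    static final int MAX_SNAPSHOTS = 65536;

    private boolean snapshottable = false;
    private final Set<String> snapshots = new LinkedHashSet<>();

    // Models an administrator setting the directory as snapshottable.
    void allowSnapshot() {
        snapshottable = true;
    }

    // A snapshot can be taken only once the directory is snapshottable.
    void createSnapshot(String name) {
        if (!snapshottable) {
            throw new IllegalStateException("directory is not snapshottable");
        }
        if (snapshots.size() >= MAX_SNAPSHOTS) {
            throw new IllegalStateException("snapshot limit reached");
        }
        // Assumption: snapshot names are unique within a directory.
        if (!snapshots.add(name)) {
            throw new IllegalArgumentException("snapshot already exists: " + name);
        }
    }

    void deleteSnapshot(String name) {
        if (!snapshots.remove(name)) {
            throw new IllegalArgumentException("no such snapshot: " + name);
        }
    }

    // The directory can be deleted or renamed only when it has no snapshots.
    boolean canDeleteOrRename() {
        return snapshots.isEmpty();
    }

    public static void main(String[] args) {
        SnapshottableDirModel dir = new SnapshottableDirModel();
        dir.allowSnapshot();
        dir.createSnapshot("s0");
        System.out.println(dir.canDeleteOrRename()); // false while s0 exists
        dir.deleteSnapshot("s0");
        System.out.println(dir.canDeleteOrRename()); // true once all snapshots are gone
    }
}
```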

+ +
+ + +

+ For a snapshottable directory, + the path component ".snapshot" is used for accessing its snapshots. + Suppose /foo is a snapshottable directory, + /foo/bar is a file/directory in /foo, + and /foo has a snapshot s0. + Then, the path /foo/.snapshot/s0/bar + refers to the snapshot copy of /foo/bar. + The usual API and CLI can work with the ".snapshot" paths. + The following are some examples. +

+
    +
  • Listing all the snapshots under a snapshottable directory: + hdfs dfs -ls /foo/.snapshot
  • +
  • Listing the files in snapshot s0: + hdfs dfs -ls /foo/.snapshot/s0
  • +
  • Copying a file from snapshot s0: + hdfs dfs -cp /foo/.snapshot/s0/bar /tmp
  • +
+
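As an illustration of the path layout described above, the following sketch builds a ".snapshot" path from plain strings. The class `SnapshotPaths` and the method `snapshotPath` are hypothetical helpers written for this example; they are not part of the HDFS API, and the handling of trailing and leading slashes is an assumption.

```java
// Sketch only: SnapshotPaths and snapshotPath are hypothetical names, not HDFS API.
final class SnapshotPaths {
    // The path component used for accessing snapshots, as described above.
    static final String DOT_SNAPSHOT = ".snapshot";

    // Builds <snapshottableDir>/.snapshot/<snapshotName>/<relativePath>.
    static String snapshotPath(String snapshottableDir, String snapshotName, String relativePath) {
        // Drop a trailing slash on the directory, unless it is the root itself.
        String dir = snapshottableDir.endsWith("/") && snapshottableDir.length() > 1
                ? snapshottableDir.substring(0, snapshottableDir.length() - 1)
                : snapshottableDir;
        // For the root directory, avoid producing a double slash.
        String prefix = dir.equals("/") ? "" : dir;
        String base = prefix + "/" + DOT_SNAPSHOT + "/" + snapshotName;
        if (relativePath == null || relativePath.isEmpty()) {
            return base; // path of the snapshot root itself
        }
        String rel = relativePath.startsWith("/") ? relativePath.substring(1) : relativePath;
        return base + "/" + rel;
    }

    public static void main(String[] args) {
        // Prints /foo/.snapshot/s0/bar, matching the example in the text.
        System.out.println(snapshotPath("/foo", "s0", "bar"));
    }
}
```

The resulting string can then be passed to the usual commands, for example as the argument of `hdfs dfs -ls`.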

+ The name ".snapshot" is now a reserved file name in HDFS, + so users cannot create a file/directory with ".snapshot" as the name. + If a file/directory named ".snapshot" exists in a previous version of HDFS, it must be renamed before the upgrade; + otherwise, the upgrade will fail. +
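Before an upgrade, it may help to scan existing paths for the reserved name. The sketch below assumes the paths are already available as plain strings obtained by some other means; `ReservedNameCheck` and its methods are hypothetical helpers, not HDFS code, and HDFS performs its own check during the upgrade.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: a hypothetical pre-upgrade check on path strings, not HDFS code.
final class ReservedNameCheck {
    static final String RESERVED = ".snapshot";

    // True if any component of the path is exactly ".snapshot".
    static boolean usesReservedName(String path) {
        for (String component : path.split("/")) {
            if (component.equals(RESERVED)) {
                return true;
            }
        }
        return false;
    }

    // Returns the paths that would need to be renamed before the upgrade.
    static List<String> findConflicts(List<String> paths) {
        List<String> conflicts = new ArrayList<>();
        for (String p : paths) {
            if (usesReservedName(p)) {
                conflicts.add(p);
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("/data/.snapshot/x", "/data/my.snapshot", "/logs");
        // Only the first path has a component that is exactly ".snapshot".
        System.out.println(findConflicts(paths));
    }
}
```

Only an exact component match counts: a name such as "my.snapshot" merely contains the reserved string and is not flagged.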

+
+
+ +
+ +

+ The operations described in this section require superuser privilege. +

+ +

Allow Snapshots

+

+ Allow snapshots of a directory to be created. + If the operation completes successfully, the directory becomes snapshottable. +

+
    +
  • Command: + hdfs dfsadmin -allowSnapshot <path>
  • +
  • Arguments: + +
    path    The path of the snapshottable directory.
  • +
+

+ See also the corresponding Java API + void allowSnapshot(Path path) in HdfsAdmin. +

+ +

Disallow Snapshots

+

+ Disallow snapshots of a directory to be created. + All snapshots of the directory must be deleted before disallowing snapshots. +

+
    +
  • Command: + hdfs dfsadmin -disallowSnapshot <path>
  • +
  • Arguments: + +
    path    The path of the snapshottable directory.
  • +
+

+ See also the corresponding Java API + void disallowSnapshot(Path path) in HdfsAdmin. +

+
+ + +

+ This section describes user operations. + Note that the HDFS superuser can perform all the operations + without satisfying the permission requirements of the individual operations. +

+ +

Create Snapshots

+

+ Create a snapshot of a snapshottable directory. + This operation requires owner privilege of the snapshottable directory. +

+
    +
  • Command: + hdfs dfs -createSnapshot <path> [<snapshotName>]
  • +
  • Arguments: + + +
    path    The path of the snapshottable directory.
    snapshotName + The snapshot name, which is an optional argument. + When it is omitted, a default name is generated using a timestamp with the format + "'s'yyyyMMdd-HHmmss.SSS", e.g. "s20130412-151029.033". +
  • +
+

+ See also the corresponding Java API + Path createSnapshot(Path path) and + Path createSnapshot(Path path, String snapshotName) + in FileSystem. + The snapshot path is returned in these methods. +
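The default name format quoted above is written in the pattern syntax of `java.text.SimpleDateFormat`. The following sketch shows how such a pattern yields a name like the example. It is an illustration only: the class `DefaultSnapshotName` is hypothetical, and the UTC time zone is an assumption made so that the output is reproducible; the text does not say which time zone HDFS uses.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Sketch only: DefaultSnapshotName is a hypothetical helper, not HDFS code.
final class DefaultSnapshotName {
    // Pattern quoted in the documentation: a literal 's' followed by a timestamp.
    static final String PATTERN = "'s'yyyyMMdd-HHmmss.SSS";

    // Formats the given instant in the given time zone using the pattern.
    static String format(Date time, TimeZone zone) {
        SimpleDateFormat formatter = new SimpleDateFormat(PATTERN, Locale.ROOT);
        formatter.setTimeZone(zone);
        return formatter.format(time);
    }

    public static void main(String[] args) {
        // Assumption: UTC is used here only to make the example deterministic.
        TimeZone utc = TimeZone.getTimeZone("UTC");
        GregorianCalendar calendar = new GregorianCalendar(utc);
        calendar.clear();
        calendar.set(2013, Calendar.APRIL, 12, 15, 10, 29);
        calendar.set(Calendar.MILLISECOND, 33);
        // Prints s20130412-151029.033
        System.out.println(format(calendar.getTime(), utc));
    }
}
```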

+ +

Delete Snapshots

+

+ Delete a snapshot from a snapshottable directory. + This operation requires owner privilege of the snapshottable directory. +

+
    +
  • Command: + hdfs dfs -deleteSnapshot <path> <snapshotName>
  • +
  • Arguments: + + +
    path            The path of the snapshottable directory.
    snapshotName    The snapshot name.
  • +
+

+ See also the corresponding Java API + void deleteSnapshot(Path path, String snapshotName) + in FileSystem. +

+ +

Rename Snapshots

+

+ Rename a snapshot. + This operation requires owner privilege of the snapshottable directory. +

+
    +
  • Command: + hdfs dfs -renameSnapshot <path> <oldName> <newName>
  • +
  • Arguments: + + + +
    path       The path of the snapshottable directory.
    oldName    The old snapshot name.
    newName    The new snapshot name.
  • +
+

+ See also the corresponding Java API + void renameSnapshot(Path path, String oldName, String newName) + in FileSystem. +

+ +

Get Snapshottable Directory Listing

+

+ Get all the snapshottable directories where the current user has permission to take snapshots. +

+
    +
  • Command: + hdfs lsSnapshottableDir
  • +
  • Arguments: none
  • +
+

+ See also the corresponding Java API + SnapshottableDirectoryStatus[] getSnapshottableDirectoryListing() + in DistributedFileSystem. +

+ +

Get Snapshots Difference Report

+

+ Get the differences between two snapshots. + This operation requires read access privilege for all files/directories in both snapshots. +

+
    +
  • Command: + hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>
  • +
  • Arguments: + + + +
    path            The path of the snapshottable directory.
    fromSnapshot    The name of the starting snapshot.
    toSnapshot      The name of the ending snapshot.
  • +
+

+ See also the corresponding Java API + SnapshotDiffReport getSnapshotDiffReport(Path path, String fromSnapshot, String toSnapshot) + in DistributedFileSystem. +

+ +
+
+ + +
diff --git a/hadoop-project/src/site/site.xml b/hadoop-project/src/site/site.xml index 7251aacac3a..63ba5f4749b 100644 --- a/hadoop-project/src/site/site.xml +++ b/hadoop-project/src/site/site.xml @@ -64,6 +64,7 @@ +