diff --git a/CHANGES.txt b/CHANGES.txt
index 051b3101279..88ca9c8f0d2 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -472,6 +472,9 @@ Trunk (unreleased changes)
    HADOOP-6099. The RPC module can be configured to not send periodic pings.
    The default behaviour of sending periodic pings remains unchanged. (dhruba)
+ HADOOP-6142. Update documentation and use of harchives for relative paths
+ added in MAPREDUCE-739. (Mahadev Konar via cdouglas)
+
OPTIMIZATIONS
HADOOP-5595. NameNode does not need to run a replicator to choose a
diff --git a/bin/hadoop b/bin/hadoop
index 4c680212155..7618e1a0819 100755
--- a/bin/hadoop
+++ b/bin/hadoop
@@ -29,7 +29,7 @@ function print_usage(){
echo " version print the version"
echo " jar
- Usage: hadoop archive -archiveName name <src>* <dest>
+ Usage: hadoop archive -archiveName name -p <parent> <src>* <dest>
 -archiveName is the name of the archive you would like to create. An
 example would be foo.har. The name should have a *.har extension.
-The inputs are file system pathnames which work as usual with regular
-expressions. The destination directory would contain the archive.
+The parent argument specifies the relative path to which the files
+should be archived. An example would be:
+
+-p /foo/bar a/b/c e/f/g
+
+Here /foo/bar is the parent path and a/b/c, e/f/g are relative paths to parent.
 Note that this is a Map/Reduce job that creates the archives. You would
-need a map reduce cluster to run this. The following is an example:
-
- hadoop archive -archiveName foo.har /user/hadoop/dir1 /user/hadoop/dir2 /user/zoo/
-
-In the above example /user/hadoop/dir1 and /user/hadoop/dir2 will be
-archived in the following file system directory -- /user/zoo/foo.har.
-The sources are not changed or removed when an archive is created.
-
+need a map reduce cluster to run this. For a detailed example, see the
+later sections.
+
+If you just want to archive a single directory /foo/bar then you can just use
+ hadoop archive -archiveName zoo.har -p /foo/bar /outputdir
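The semantics of the new -p flag added in this hunk can be sketched outside the diff: each relative source is resolved under the parent directory before archiving. This is a plain-shell illustration using the hypothetical paths from the usage text above; it does not invoke the hadoop binary, and a real run would need a Map/Reduce cluster.

```shell
#!/bin/sh
# Sketch of how `hadoop archive -p <parent> <src>*` resolves its sources.
# /foo/bar, a/b/c, and e/f/g are the example paths from the usage text;
# nothing here touches a real cluster.
parent="/foo/bar"
resolved=""
for rel in a/b/c e/f/g; do
  # Each relative source is taken under the parent directory.
  resolved="$resolved ${parent}/${rel}"
done
# Trim the leading separator for display.
resolved="${resolved# }"
echo "$resolved"   # prints /foo/bar/a/b/c /foo/bar/e/f/g
```

Under the pre-change syntax the same sources would have been given as full filesystem pathnames with no shared parent.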
@@ -61,20 +60,58 @@ an error. URI for Hadoop Archives is
har://scheme-hostname:port/archivepath/fileinarchive
 If no scheme is provided it assumes the underlying filesystem.
-In that case the URI would look like
-
-har:///archivepath/fileinarchive
-
-Here is an example of archive. The input to the archives is /dir. The directory dir contains
-files filea, fileb. To archive /dir to /user/hadoop/foo.har, the command is
-
-hadoop archive -archiveName foo.har /dir /user/hadoop
-
-To get file listing for files in the created archive
-
-hadoop dfs -lsr har:///user/hadoop/foo.har
-To cat filea in archive
-
-hadoop dfs -cat har:///user/hadoop/foo.har/dir/filea
+
+har:///archivepath/fileinarchive
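The har URI layout this hunk documents can be illustrated with a small sketch. The archive path and inner file name are the hypothetical ones from the removed example; the composed URI is what would be handed to a command such as `hadoop dfs -cat`.

```shell
#!/bin/sh
# Sketch: building a har:// URI for a file inside an archive.
# With no scheme-hostname:port authority, the URI has three slashes and
# the underlying filesystem is assumed.
archive="/user/hadoop/foo.har"   # hypothetical archive path
inner="dir/filea"                # hypothetical file inside the archive
uri="har://${archive}/${inner}"
echo "$uri"   # prints har:///user/hadoop/foo.har/dir/filea
```

Listing the archive with `hadoop dfs -lsr har:///user/hadoop/foo.har` would use the same scheme with the archive path alone.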