diff --git a/hadoop-mapreduce-project/CHANGES.txt b/hadoop-mapreduce-project/CHANGES.txt
index 6d1594cad88..f2a99b6ef07 100644
--- a/hadoop-mapreduce-project/CHANGES.txt
+++ b/hadoop-mapreduce-project/CHANGES.txt
@@ -171,6 +171,9 @@ Trunk (Unreleased)
     MAPREDUCE-5597. Missing alternatives in javadocs for deprecated constructors
     in mapreduce.Job (Akira AJISAKA via aw)
 
+    MAPREDUCE-5950. incorrect description in distcp2 document (Akira AJISAKA
+    via aw)
+
 Release 2.6.0 - UNRELEASED
 
   INCOMPATIBLE CHANGES
diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/DistCp.md.vm b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/DistCp.md.vm
index 6271a927a47..41b381ab6ec 100644
--- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/DistCp.md.vm
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/DistCp.md.vm
@@ -118,9 +118,9 @@ $H3 Basic Usage
 
 $H3 Update and Overwrite
 
-  `-update` is used to copy files from source that don't exist at the target,
-  or have different contents. `-overwrite` overwrites target-files even if they
-  exist at the source, or have the same contents.
+  `-update` is used to copy files from source that don't exist at the target
+  or differ than the target version. `-overwrite` overwrites target-files that
+  exist at the target.
 
   Update and Overwrite options warrant special attention, since their handling
   of source-paths varies from the defaults in a very subtle manner. Consider a
@@ -201,7 +201,7 @@ Flag | Description | Notes
 `-log <logdir>` | Write logs to \<logdir\> | DistCp keeps logs of each file it attempts to copy as map output. If a map fails, the log output will not be retained if it is re-executed.
 `-m <num_maps>` | Maximum number of simultaneous copies | Specify the number of maps to copy data. Note that more maps may not necessarily improve throughput.
 `-overwrite` | Overwrite destination | If a map fails and `-i` is not specified, all the files in the split, not only those that failed, will be recopied. As discussed in the Usage documentation, it also changes the semantics for generating destination paths, so users should use this carefully.
-`-update` | Overwrite if src size different from dst size | As noted in the preceding, this is not a "sync" operation. The only criterion examined is the source and destination file sizes; if they differ, the source file replaces the destination file. As discussed in the Usage documentation, it also changes the semantics for generating destination paths, so users should use this carefully.
+`-update` | Overwrite if source and destination differ in size, blocksize, or checksum | As noted in the preceding, this is not a "sync" operation. The criteria examined are the source and destination file sizes, blocksizes, and checksums; if they differ, the source file replaces the destination file. As discussed in the Usage documentation, it also changes the semantics for generating destination paths, so users should use this carefully.
 `-f <urilist_uri>` | Use list at \<urilist_uri\> as src list | This is equivalent to listing each source on the command line. The `urilist_uri` list should be a fully qualified URI.
 `-filelimit <n>` | Limit the total number of files to be <= n | **Deprecated!** Ignored in the new DistCp.
 `-sizelimit <n>` | Limit the total size to be <= n bytes | **Deprecated!** Ignored in the new DistCp.