HDFS-16556. Fix typos in distcp (#4217)
parent f84b88dd6b
commit 214f369073
@@ -49,7 +49,7 @@ Overview
 
 [The erstwhile implementation of DistCp]
 (http://hadoop.apache.org/docs/r1.2.1/distcp.html) has its share of quirks
-and drawbacks, both in its usage, as well as its extensibility and
+and drawbacks, both in its usage and its extensibility and
 performance. The purpose of the DistCp refactor was to fix these
 shortcomings, enabling it to be used and extended programmatically. New
 paradigms have been introduced to improve runtime and setup performance,
@@ -179,7 +179,7 @@ $H3 Update and Overwrite
 hdfs://nn2:8020/target/10 32
 hdfs://nn2:8020/target/20 64
 
-Will effect:
+The result will be:
 
 hdfs://nn2:8020/target/1 32
 hdfs://nn2:8020/target/2 32
@@ -190,7 +190,7 @@ $H3 Update and Overwrite
 because it doesn't exist at the target. `10` and `20` are overwritten since
 the contents don't match the source.
 
-If `-update` is used, `1` is skipped because the file-length and contents match. `2` is copied because it doesn’t exist at the target. `10` and `20` are overwritten since the contents don’t match the source. However, if `-append` is additionally used, then only `10` is overwritten (source length less than destination) and `20` is appended with the change in file (if the files match up to the destination's original length).
+If `-update` is used, `1` is skipped because the file-length and contents match. `2` is copied because it doesn't exist at the target. `10` and `20` are overwritten since the contents don’t match the source. However, if `-append` is additionally used, then only `10` is overwritten (source length less than destination) and `20` is appended with the change in file (if the files match up to the destination's original length).
 
 If `-overwrite` is used, `1` is overwritten as well.
 
@@ -269,7 +269,7 @@ $H4 Experiment 1: Syncing diff of two adjacent snapshots
 
 $H4 Experiment 2: syncing diff of two non-adjacent snapshots
 
-First do a clean up from Experiment 1.
+First do a cleanup from Experiment 1.
 
 hdfs dfs -rm -skipTrash /dst/1.txt
 
@@ -514,7 +514,7 @@ $H3 InputFormats and MapReduce Components
 * A file with the same name exists at target, but `-overwrite` is
 specified.
 * A file with the same name exists at target, but differs in block-size
-(and block-size needs to be preserved.
+and block-size needs to be preserved.
 
 * **CopyCommitter:** This class is responsible for the commit-phase of the
 DistCp job, including:
@@ -576,7 +576,7 @@ $H3 MapReduce and other side-effects
 map on a re-execution will be marked as "skipped".
 * If a map fails `mapreduce.map.maxattempts` times, the remaining map tasks
 will be killed (unless `-i` is set).
-* If `mapreduce.map.speculative` is set set final and true, the result of the
+* If `mapreduce.map.speculative` is set to be true, the result of the
 copy is undefined.
 
 $H3 DistCp and Object Stores
@@ -691,7 +691,7 @@ Frequently Asked Questions
 directory is copied over, rather than the source-directory itself. This
 behaviour is consistent with the legacy DistCp implementation as well.
 
-2. **How does the new DistCp differ in semantics from the Legacy DistCp?**
+2. **How does the new DistCp differs in semantics from the Legacy DistCp?**
 
 * Files that are skipped during copy used to also have their
 file-attributes (permissions, owner/group info, etc.) unchanged, when