HDFS-16556. Fix typos in distcp (#4217)
parent f84b88dd6b
commit 214f369073
@@ -49,7 +49,7 @@ Overview
 
 [The erstwhile implementation of DistCp]
 (http://hadoop.apache.org/docs/r1.2.1/distcp.html) has its share of quirks
-and drawbacks, both in its usage, as well as its extensibility and
+and drawbacks, both in its usage and its extensibility and
 performance. The purpose of the DistCp refactor was to fix these
 shortcomings, enabling it to be used and extended programmatically. New
 paradigms have been introduced to improve runtime and setup performance,
@@ -179,7 +179,7 @@ $H3 Update and Overwrite
 hdfs://nn2:8020/target/10 32
 hdfs://nn2:8020/target/20 64
 
-Will effect:
+The result will be:
 
 hdfs://nn2:8020/target/1 32
 hdfs://nn2:8020/target/2 32
@@ -190,7 +190,7 @@ $H3 Update and Overwrite
 because it doesn't exist at the target. `10` and `20` are overwritten since
 the contents don't match the source.
 
-If `-update` is used, `1` is skipped because the file-length and contents match. `2` is copied because it doesn’t exist at the target. `10` and `20` are overwritten since the contents don’t match the source. However, if `-append` is additionally used, then only `10` is overwritten (source length less than destination) and `20` is appended with the change in file (if the files match up to the destination's original length).
+If `-update` is used, `1` is skipped because the file-length and contents match. `2` is copied because it doesn't exist at the target. `10` and `20` are overwritten since the contents don’t match the source. However, if `-append` is additionally used, then only `10` is overwritten (source length less than destination) and `20` is appended with the change in file (if the files match up to the destination's original length).
 
 If `-overwrite` is used, `1` is overwritten as well.
 
@@ -269,7 +269,7 @@ $H4 Experiment 1: Syncing diff of two adjacent snapshots
 
 $H4 Experiment 2: syncing diff of two non-adjacent snapshots
 
-First do a clean up from Experiment 1.
+First do a cleanup from Experiment 1.
 
 hdfs dfs -rm -skipTrash /dst/1.txt
 
@@ -514,7 +514,7 @@ $H3 InputFormats and MapReduce Components
 * A file with the same name exists at target, but `-overwrite` is
 specified.
 * A file with the same name exists at target, but differs in block-size
-(and block-size needs to be preserved.
+and block-size needs to be preserved.
 
 * **CopyCommitter:** This class is responsible for the commit-phase of the
 DistCp job, including:
@@ -576,7 +576,7 @@ $H3 MapReduce and other side-effects
 map on a re-execution will be marked as "skipped".
 * If a map fails `mapreduce.map.maxattempts` times, the remaining map tasks
 will be killed (unless `-i` is set).
-* If `mapreduce.map.speculative` is set set final and true, the result of the
+* If `mapreduce.map.speculative` is set to be true, the result of the
 copy is undefined.
 
 $H3 DistCp and Object Stores
@@ -691,7 +691,7 @@ Frequently Asked Questions
 directory is copied over, rather than the source-directory itself. This
 behaviour is consistent with the legacy DistCp implementation as well.
 
-2. **How does the new DistCp differ in semantics from the Legacy DistCp?**
+2. **How does the new DistCp differs in semantics from the Legacy DistCp?**
 
 * Files that are skipped during copy used to also have their
 file-attributes (permissions, owner/group info, etc.) unchanged, when