hadoop/hadoop-tools/hadoop-distcp
Steve Loughran caec6a1945 HADOOP-16775. DistCp reuses the same temp file within the task for different files.
Contributed by Amir Shenavandeh.

This avoids overwrite consistency issues with S3 and other stores

Change-Id: Ic4d05ef3397e963ba28fd9f775bb362b0da36ad9
2020-03-13 19:34:50 +00:00
src HADOOP-16775. DistCp reuses the same temp file within the task for different files. 2020-03-13 19:34:50 +00:00
README HADOOP-11437. Remove the version and author information from distcp's README file (Brahma Reddy Battula via aw) 2015-02-11 15:47:36 -08:00
pom.xml HADOOP-16808. Use forkCount and reuseForks parameters instead of forkMode in the config of maven surefire plugin. Contributed by Xieming Li. 2020-01-21 18:05:13 +09:00

README

DistCp (distributed copy) is a tool for large inter-cluster and intra-cluster
copying. It uses MapReduce for distribution, error handling and recovery, and
reporting. It expands a list of files and directories into input for map tasks,
each of which copies a partition of the files specified in the source list.
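
The partitioning idea described above can be sketched roughly as follows. This
is a simplified illustration only, not DistCp's actual input format: the names
CopyPartitioner, SourceFile and partition are hypothetical, and the sketch
assumes that the expanded file list is split into contiguous groups of roughly
equal total size, one group per map task.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: split an expanded source listing across map tasks.
// None of these names come from the DistCp code base.
class CopyPartitioner {

    // One entry of the expanded source list (path plus length in bytes).
    static final class SourceFile {
        final String path;
        final long sizeBytes;

        SourceFile(String path, long sizeBytes) {
            this.path = path;
            this.sizeBytes = sizeBytes;
        }

        @Override
        public String toString() {
            return path;
        }
    }

    // Walks the list in order and closes a partition once it holds at least
    // total/numMaps bytes. The last partition takes whatever remains.
    static List<List<SourceFile>> partition(List<SourceFile> files, int numMaps) {
        if (numMaps < 1) {
            throw new IllegalArgumentException("numMaps must be at least 1");
        }
        long total = 0;
        for (SourceFile f : files) {
            total += f.sizeBytes;
        }
        // Ceiling division; at least 1 so an all-empty listing still works.
        long target = Math.max(1, (total + numMaps - 1) / numMaps);

        List<List<SourceFile>> parts = new ArrayList<>();
        List<SourceFile> current = new ArrayList<>();
        long currentBytes = 0;
        for (SourceFile f : files) {
            current.add(f);
            currentBytes += f.sizeBytes;
            if (currentBytes >= target && parts.size() < numMaps - 1) {
                parts.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
        }
        if (!current.isEmpty()) {
            parts.add(current);
        }
        return parts;
    }

    public static void main(String[] args) {
        List<SourceFile> listing = Arrays.asList(
                new SourceFile("/data/a", 100L),
                new SourceFile("/data/b", 100L),
                new SourceFile("/data/c", 50L),
                new SourceFile("/data/d", 50L));
        List<List<SourceFile>> parts = partition(listing, 2);
        for (int i = 0; i < parts.size(); i++) {
            System.out.println("map task " + i + ": " + parts.get(i));
        }
    }
}
```

Each inner list stands for the work handed to one map task. In a real job the
map task would then copy every file in its partition to the target, and the
framework would retry failed tasks.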