hadoop/hadoop-tools/hadoop-distcp
Steve Loughran ee466d4b40
HADOOP-17628. Distcp contract test is really slow with ABFS and S3A; timing out. (#3240)
This patch cuts down the size of directory trees used for
distcp contract tests against object stores, making
them much faster against distant/slow stores.

On abfs, the test only runs with -Dscale (as was the case for s3a already),
and has the larger scale test timeout.

After every test case, the FileSystem IOStatistics are logged,
to provide information about what IO is taking place and
what its performance is.

There are some test cases which upload files of 1+ MiB; you can
increase the size of the upload with the option
"scale.test.distcp.file.size.kb".
Set it to zero and the large file tests are skipped.

Contributed by Steve Loughran.
2021-08-02 11:36:43 +01:00
src HADOOP-17628. Distcp contract test is really slow with ABFS and S3A; timing out. (#3240) 2021-08-02 11:36:43 +01:00
README HADOOP-11437. Remove the version and author information from distcp's README file (Brahma Reddy Battula via aw) 2015-02-11 15:47:36 -08:00
pom.xml HADOOP-17753. Keep restrict-imports-enforcer-rule for Guava Lists in top level hadoop-main pom (#3087) 2021-06-11 12:15:52 +09:00

README

DistCp (distributed copy) is a tool used for large inter/intra-cluster copying. 
It uses Map/Reduce to effect its distribution, error handling and recovery, 
and reporting. It expands a list of files and directories into input to map tasks, 
each of which will copy a partition of the files specified in the source list.
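
The idea of splitting a copy listing into per-map partitions can be sketched in 
plain Java. This is an illustration only and is not DistCp's actual code: the 
class CopyListingPartitioner, the FileEntry type and the greedy "least loaded 
map gets the next file" rule are all assumptions made for the example, and no 
Hadoop classes are used.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: divide an expanded file listing among N map tasks.
public class CopyListingPartitioner {

    /** One entry of the expanded copy listing: a path and its length in bytes. */
    public static final class FileEntry {
        public final String path;
        public final long sizeBytes;

        public FileEntry(String path, long sizeBytes) {
            this.path = path;
            this.sizeBytes = sizeBytes;
        }
    }

    /**
     * Assign each file to the partition that currently holds the fewest bytes.
     * Ties go to the lowest-numbered partition.
     */
    public static List<List<FileEntry>> partition(List<FileEntry> listing, int numMaps) {
        if (numMaps <= 0) {
            throw new IllegalArgumentException("numMaps must be positive");
        }
        List<List<FileEntry>> splits = new ArrayList<>();
        long[] load = new long[numMaps];   // bytes assigned to each map so far
        for (int i = 0; i < numMaps; i++) {
            splits.add(new ArrayList<>());
        }
        for (FileEntry entry : listing) {
            int target = 0;
            for (int i = 1; i < numMaps; i++) {
                if (load[i] < load[target]) {
                    target = i;
                }
            }
            splits.get(target).add(entry);
            load[target] += entry.sizeBytes;
        }
        return splits;
    }

    public static void main(String[] args) {
        List<FileEntry> listing = List.of(
            new FileEntry("/src/a", 400),
            new FileEntry("/src/b", 100),
            new FileEntry("/src/c", 300),
            new FileEntry("/src/d", 200));
        List<List<FileEntry>> splits = partition(listing, 2);
        for (int i = 0; i < splits.size(); i++) {
            StringBuilder line = new StringBuilder("map " + i + ":");
            for (FileEntry e : splits.get(i)) {
                line.append(' ').append(e.path);
            }
            System.out.println(line);
        }
    }
}
```

Each inner list stands for the work handed to one map task. Balancing by bytes 
rather than by file count is a design choice of this sketch, made so that one 
very large file does not leave a single map with most of the copying.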