Mukund Thakur
3937abddbd
HDFS-13660. DistCp job fails when new data is appended in the file while the DistCp copy job is running
This uses the length of the file known at the start of the copy to determine the amount of data to copy.
* If a file is appended to during the copy, the original bytes are copied.
* If a file is truncated during a copy, or the attempt to read the data fails with a truncated stream,
distcp will now fail. Until now these failures were not detected.
Contributed by Mukund Thakur.
Change-Id: I576a49d951fa48d37a45a7e4c82c47488aa8e884
(cherry picked from commit 51c64b357d)
2020-02-27 16:37:03 -08:00
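The behaviour described in the HDFS-13660 entry above can be sketched as follows. This is a minimal illustration in Python, not the DistCp implementation: the names `copy_known_length` and `TruncatedSourceError`, and the buffer size, are assumptions made for the example.

```python
import io


class TruncatedSourceError(IOError):
    """Hypothetical error: the source ended before the expected length."""


def copy_known_length(src, dst, expected_len, buf_size=8192):
    """Copy exactly expected_len bytes from src to dst.

    expected_len stands in for the file length recorded when the copy
    was planned. Bytes appended to the source after that point are
    ignored; a source that ends early raises TruncatedSourceError.
    """
    remaining = expected_len
    while remaining > 0:
        # Never read past the length known at the start of the copy.
        chunk = src.read(min(buf_size, remaining))
        if not chunk:
            # The stream ended early: treat it as truncation and fail.
            raise TruncatedSourceError(
                f"source ended {remaining} bytes short of {expected_len}")
        dst.write(chunk)
        remaining -= len(chunk)
    return expected_len


# Usage: the source has grown to 11 bytes, but only the 5 bytes
# known at planning time are copied.
source = io.BytesIO(b"hello world")
target = io.BytesIO()
copy_known_length(source, target, 5)
```

Bounding every read by the remaining byte count is what makes an append harmless, while the empty-read check is what turns a silent short copy into a failure.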
Akira Ajisaka
d4f75e2798
HADOOP-16808. Use forkCount and reuseForks parameters instead of forkMode in the config of maven surefire plugin. Contributed by Xieming Li.
(cherry picked from commit f6d20daf40)
2020-01-21 18:03:56 +09:00
Steve Loughran
5410732cff
HADOOP-16775. DistCp reuses the same temp file within the task for different files.
Contributed by Amir Shenavandeh.
This avoids overwrite consistency issues with S3 and other stores, though
given that S3's copy operation is O(data), you are still best off using -direct
when distcp-ing to it.
Change-Id: I8dc9f048ad0cc57ff01543b849da1ce4eaadf8c3
2020-01-02 15:37:55 +00:00
Rohith Sharma K S
7d5bb2ebb7
Preparing for 3.2.2-SNAPSHOT development.
2019-09-07 08:52:08 +05:30
KAI XIE
b3c14d4132
HADOOP-16158. DistCp to support checksum validation when copy blocks in parallel (#919)
* DistCp to support checksum validation when copy blocks in parallel
* address review comments
* add checksums comparison test for combine mode
(cherry picked from commit c765584eb2)
2019-08-18 18:48:21 -07:00
Ayush Saxena
35ff1ce42c
HADOOP-16440. Distcp can not preserve timestamp with -delete option. Contributed by ludun.
2019-07-20 13:29:45 +05:30
Takanobu Asanuma
6dffad028e
HDFS-12564. Add the documents of swebhdfs configurations on the client side. Contributed by Takanobu Asanuma.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit 98d2065643)
2019-06-20 20:17:45 -07:00
Takanobu Asanuma
a9a3450560
HADOOP-16331. Fix ASF License check in pom.xml. Contributed by Akira Ajisaka.
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2019-05-29 17:34:16 +09:00
Akira Ajisaka
855dc997d6
HADOOP-16323. https everywhere in Maven settings.
2019-05-27 15:27:33 +09:00
Andrew Olson
55603529d0
HADOOP-16294: Enable access to input options by DistCp subclasses.
Adding a protected-scope getter for the DistCpOptions, so that a subclass does
not need to save its own copy of the inputOptions supplied to its constructor,
if it wishes to override the createInputFileListing method with logic similar
to the original implementation, i.e. calling CopyListing#buildListing with a path and input options.
Author: Andrew Olson
(cherry picked from commit c15b3bca86)
2019-05-16 16:13:12 +02:00
Masatake Iwasaki
03079be707
HADOOP-14544. DistCp documentation for command line options is misaligned. Contributed by Masatake Iwasaki.
(cherry picked from commit bbdbc7a9a1)
2019-04-12 11:59:14 +09:00
Siyao Meng
52cfbc39cc
HADOOP-16037. DistCp: Document usage of Sync (-diff option) in detail.
Contributed by Siyao Meng
(cherry picked from commit ce4bafdf44)
2019-03-26 18:43:43 +00:00
Andrew Olson
ade3af6ef2
HADOOP-16147. Allow CopyListing sequence file keys and values to be more easily customized.
Author: Andrew Olson
(cherry picked from commit faba3591d3)
2019-03-22 10:36:34 +00:00
Ranith Sardar
c5eca3f7ee
HADOOP-16032. DistCp should clear sub directory ACL before applying new ACL.
Contributed by Ranith Sardar.
(cherry picked from commit 546c5d70ef)
2019-02-07 21:49:18 +00:00
Andrew Olson
36f3e775d4
HADOOP-15281. Distcp to add no-rename copy option.
Contributed by Andrew Olson.
(cherry picked from commit de804e53b9)
2019-02-07 10:09:13 +00:00
Kai Xie
5dce9d75e6
HADOOP-16018. DistCp won't reassemble chunks when blocks per chunk > 0.
Contributed by Kai Xie.
(cherry picked from commit 188bebbe7e)
2019-01-08 13:34:51 +00:00
Arpit Agarwal
351bfa1bcf
HADOOP-12558. distcp documentation is woefully out of date. Contributed by Dinesh Chitlangia.
(cherry picked from commit 914b0cf15f)
2018-11-15 13:58:29 -08:00
Ted Yu
a7dd244a49
HADOOP-15850. CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0. Contributed by Ted Yu.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit e2cecb681e)
2018-10-19 13:22:01 -07:00
Sunil G
bde4fd5ed9
Preparing for 3.2.0 release
2018-10-18 17:07:45 +05:30
Surendra Singh Lilhore
96c4575d73
HDFS-13805. Journal Nodes should allow to format non-empty directories with -force option. Contributed by Surendra Singh Lilhore.
2018-08-24 08:14:57 +05:30
Akira Ajisaka
3e3963b035
HADOOP-15552. Move logging APIs over to slf4j in hadoop-tools - Part2. Contributed by Ian Pickering.
2018-08-16 00:31:59 +09:00
Steve Loughran
ca8b80bf59
HADOOP-15384. distcp numListstatusThreads option doesn't get to -delete scan.
Contributed by Steve Loughran.
2018-07-10 10:43:59 +01:00
Akira Ajisaka
2b2399d623
HADOOP-15495. Upgrade commons-lang version to 3.7 in hadoop-common-project and hadoop-tools. Contributed by Takanobu Asanuma.
2018-06-28 14:37:22 +09:00
Xiao Chen
7c9cdad6d0
HDFS-13056. Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts. Contributed by Dennis Huo.
2018-04-10 21:31:48 -07:00
Steve Loughran
1976e0066e
HADOOP-15209. DistCp to eliminate needless deletion of files under already-deleted directories.
Contributed by Steve Loughran.
2018-03-15 18:05:14 +00:00
Chris Douglas
45cccadd2e
HDFS-12780. Fix spelling mistake in DistCpUtils.java. Contributed by Jianfei Jiang
2018-03-13 11:08:11 -07:00
Steve Loughran
7ef4d942dd
HADOOP-15273. distcp can't handle remote stores with different checksum algorithms.
Contributed by Steve Loughran.
2018-03-08 11:24:06 +00:00
Steve Loughran
3bd6b1fd85
HADOOP-15292. Distcp's use of pread is slowing it down.
Contributed by Virajith Jalaparti.
2018-03-08 11:15:46 +00:00
fang zhenyi
4d4dde5112
HADOOP-15223. Replace Collections.EMPTY* with empty* when available
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2018-02-18 22:19:39 +09:00
Wangda Tan
60f9e60b3b
Preparing for 3.2.0 development
Change-Id: I6d0e01f3d665d26573ef2b957add1cf0cddf7938
2018-02-11 11:17:38 +08:00
Anu Engineer
4304fcd5bd
HDFS-12990. Change default NameNode RPC port back to 8020. Contributed by Xiao Chen.
2018-02-06 13:43:45 -08:00
Arpit Agarwal
d4e13a4647
HADOOP-15198. Correct the spelling in CopyFilter.java. Contributed by Mukul Kumar Singh.
2018-02-02 11:37:51 -08:00
Surendra Singh Lilhore
00129c5314
HDFS-12833. Distcp : Update the usage of delete option for dependency with update and overwrite option. Contributed by usharani.
2017-12-12 00:28:02 +05:30
Akira Ajisaka
cc3f3eca40
MAPREDUCE-6999. Fix typo onf in DynamicInputChunk.java. Contributed by fang zhenyi.
2017-11-02 18:32:24 +09:00
Steve Loughran
f36cbc8475
HADOOP-14942. DistCp#cleanup() should check whether jobFS is null.
...
Contributed by Andras Bokor.
2017-10-20 22:27:04 +01:00
ChenSammi
e0b3c644e1
HDFS-12414. Ensure to use CLI command to enable/disable erasure coding policy. Contributed by Sammi Chen
2017-09-14 09:15:29 +08:00
Xiaoyu Yao
63720ef574
HADOOP-14839. DistCp log output should contain copied and deleted files and directories. Contributed by Yiqun Lin.
2017-09-05 23:34:55 -07:00
Andrew Wang
0d419c984f
Preparing for 3.1.0 development
2017-09-01 11:53:48 -07:00
Andrew Wang
f29a0fc288
HDFS-12303. Change default EC cell size to 1MB for better performance. Contributed by Wei Zhou.
2017-08-25 14:14:23 -07:00
Andrew Wang
dd7916d3cd
HDFS-12250. Reduce usage of FsPermissionExtension in unit tests. Contributed by Chris Douglas.
2017-08-17 09:35:36 -07:00
Sean Mackrory
1a1bf6b7d0
HADOOP-13595. Rework hadoop_usage to be broken up by clients/daemons/etc. Contributed by Allen Wittenauer.
2017-08-02 12:25:05 -06:00
Wei-Chiu Chuang
44350fdf49
HADOOP-14557. Document HADOOP-8143 (Change distcp to have -pb on by default). Contributed by Bharat Viswanadham.
2017-07-20 18:23:13 -07:00
Andrew Wang
af2773f609
Updating version for 3.0.0-beta1 development
2017-06-29 17:57:40 -07:00
Jason Lowe
dd65eea74b
HADOOP-8143. Change distcp to have -pb on by default. Contributed by Mithun Radhakrishnan
2017-06-20 09:53:47 -05:00
Andrew Wang
16ad896d5c
Update maven version for 3.0.0-alpha4 development
2017-05-26 14:09:44 -07:00
Sunil G
b6f66b0da1
YARN-6584. Correct license headers in hadoop-common, hdfs, yarn and mapreduce. Contributed by Yeliang Cang.
2017-05-22 14:10:06 +05:30
Yongjun Zhang
b4adc8392c
HADOOP-14407. DistCp - Introduce a configurable copy buffer size. (Omkar Aradhya K S via Yongjun Zhang)
2017-05-18 15:35:22 -07:00
Mingliang Liu
26172a94d6
HADOOP-14267. Make DistCpOptions immutable. Contributed by Mingliang Liu
2017-03-31 20:04:26 -07:00
Yongjun Zhang
bf3fb585aa
HADOOP-11794. Enable distcp to copy blocks in parallel. Contributed by Yongjun Zhang, Wei-Chiu Chuang, Xiao Chen, Rosie Li.
2017-03-30 17:38:56 -07:00
Yongjun Zhang
144f1cf765
Revert "HADOOP-11794. Enable distcp to copy blocks in parallel. Contributed by Yongjun Zhang, Wei-Chiu Chuang, Xiao Chen."
This reverts commit 064c8b25ec.
2017-03-30 17:38:18 -07:00