1111 Commits

Author SHA1 Message Date
Robert Levas ce23e971b4 HADOOP-16340. ABFS driver continues to retry on IOException responses from REST operations.
Contributed by Robert Levas.

This makes the HttpException constructor protected rather than public, so it is possible
to implement custom subclasses of this exception: exceptions which will not be retried.

Change-Id: Ie8aaa23a707233c2db35948784908b6778ff3a8f
2019-08-27 19:08:29 +00:00
Da Zhou a6d50a9054 HADOOP-16376. ABFS: Override access() to no-op.
Contributed by Da Zhou.

Change-Id: Ia0024bba32250189a87eb6247808b2473c331ed0
2019-08-27 19:04:16 +00:00
Da Zhou dd636127e9 HADOOP-16269. ABFS: add listFileStatus with StartFrom.
Author:    Da Zhou
2019-08-27 19:01:21 +00:00
Da Zhou 006ae258b3 HADOOP-16163. NPE in setup/teardown of ITestAbfsDelegationTokens.
Contributed by Da Zhou.

Signed-off-by: Steve Loughran <stevel@apache.org>
2019-08-27 19:01:21 +00:00
Akira Ajisaka afb3f329fd
YARN-9774. Fix order of arguments for assertEquals in TestSLSUtils. Contributed by Nikhil Navadiya.
(cherry picked from commit 84b1982060)
2019-08-23 14:40:15 +09:00
bibinchundatt 69255fa1b9 YARN-9765. SLS runner crashes when run with metrics turned off. Contributed by Abhishek Modi.
(cherry picked from commit 10ec31d20e)
2019-08-21 13:57:53 +05:30
KAI XIE b3c14d4132 HADOOP-16158. DistCp to support checksum validation when copy blocks in parallel (#919)
* DistCp to support checksum validation when copy blocks in parallel

* address review comments

* add checksums comparison test for combine mode

(cherry picked from commit c765584eb2)
2019-08-18 18:48:21 -07:00
Da Zhou 330e450397
HADOOP-16315. ABFS: transform full UPN for named user in AclStatus
Contributed by Da Zhou

Change-Id: Ibc78322415fcbeff89c06c8586c53f5695550290
2019-08-12 09:41:52 +08:00
Ayush Saxena 35ff1ce42c HADOOP-16440. Distcp can not preserve timestamp with -delete option. Contributed by ludun.
2019-07-20 13:29:45 +05:30
Arun Singh 5f2d07af1b
HADOOP-16404. ABFS default blocksize change(256MB from 512MB)
Contributed by: Arun Singh
2019-07-19 20:34:28 -07:00
Masatake Iwasaki b6718c754a HADOOP-16401. ABFS: port Azure doc to 3.2 branch.
Signed-off-by: Masatake Iwasaki <iwasakims@apache.org>
2019-07-10 17:16:43 +09:00
Takanobu Asanuma 6dffad028e HDFS-12564. Add the documents of swebhdfs configurations on the client side. Contributed by Takanobu Asanuma.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit 98d2065643)
2019-06-20 20:17:45 -07:00
DadanielZ 9c8e40fbdb
HADOOP-16251. ABFS: add FSMainOperationsBaseTest. Re-commit to fix git metadata.
Author: Da Zhou
(cherry picked from commit ff27e8eabd)
2019-06-07 18:09:38 +01:00
Da Zhou bf0bb2470f
HADOOP-16242. ABFS: add bufferpool to AbfsOutputStream.
Contributed by Da Zhou.

(cherry picked from commit 1cef194a28)
2019-06-07 18:09:38 +01:00
Vishwajeet Dusane 907a016142
HADOOP-16182. Update abfs storage back-end with "close" flag when application is done writing to a file.
Contributed by Vishwajeet Dusane.

(cherry picked from commit 1edf1914ac)
2019-06-07 18:09:37 +01:00
Shweta Yakkali 6b115966bc
HADOOP-16157. [Clean-up] Remove NULL check before instanceof in AzureNativeFileSystemStore
(Contributed by Shweta Yakkali via Daniel Templeton)

Change-Id: I6269ae66378e46eed440a76f847ae1af1fa95450
(cherry picked from commit bb8ad096e7)
2019-06-07 18:09:37 +01:00
Shweta Yakkali 57c6060c3a
HADOOP-15860. ABFS: Throw exception when directory / file name ends with a period (.).
Contributed by Shweta Yakkali.

(cherry picked from commit 13f0ee21f2)

Change-Id: Ibd010d2e6adc15f53a9c5357482e57313bf84d2e
2019-06-07 18:09:37 +01:00
Da Zhou 3593b66693
HADOOP-15823. ABFS: Stop requiring client ID and tenant ID for MSI
(Contributed by Da Zhou via Daniel Templeton)

Change-Id: I546ab3a1df1efec635c08c388148e718dc4a9843
(cherry picked from commit e374584479)
2019-06-07 18:09:37 +01:00
Denes Gerencser ede5cbd707
HADOOP-16174. Disable wildfly logs to the console.
Follow-on to HADOOP-15851.

Author:    Denes Gerencser <dgerencser@cloudera.com>
(cherry picked from commit ddede7ae6f)
2019-06-07 18:09:37 +01:00
Steve Loughran 96489069b0
HADOOP-15851. Disable wildfly logs to the console.
Contributed by Vishwajeet Dusane.

(cherry picked from commit ef9dc6c44c)
2019-06-07 18:09:37 +01:00
Steve Loughran baa8670256
HADOOP-15825. ABFS: Enable some tests for namespace not enabled account using OAuth.
Contributed by Da Zhou.

(cherry picked from commit bd50fa956b)
2019-06-07 18:09:37 +01:00
Takanobu Asanuma a9a3450560 HADOOP-16331. Fix ASF License check in pom.xml. Contributed by Akira Ajisaka.
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2019-05-29 17:34:16 +09:00
Akira Ajisaka 855dc997d6
HADOOP-16323. https everywhere in Maven settings.
2019-05-27 15:27:33 +09:00
Andrew Olson 55603529d0
HADOOP-16294: Enable access to input options by DistCp subclasses.
Adding a protected-scope getter for the DistCpOptions, so that a subclass does
not need to save its own copy of the inputOptions supplied to its constructor,
if it wishes to override the createInputFileListing method with logic similar
to the original implementation, i.e. calling CopyListing#buildListing with a path and input options.

Author:    Andrew Olson
(cherry picked from commit c15b3bca86)
2019-05-16 16:13:12 +02:00
Weiwei Yang 26eb9f52fb HADOOP-16306. AliyunOSS: Remove temporary files when upload small files to OSS. Contributed by wujinhu.
(cherry picked from commit 2d8282bb82)
2019-05-14 14:06:42 -07:00
Rajat Khandelwal 12e0053932
HADOOP-16278. With S3A Filesystem, Long Running services End up Doing lot of GC and eventually die.
Contributed by Rajat Khandelwal

(cherry picked from commit 591ca69823)
2019-05-09 21:14:37 +01:00
Akira Ajisaka df5d8f05d9
HADOOP-16227. Upgrade checkstyle to 8.19
(cherry picked from commit 4b4fef2f0e)
2019-04-15 10:47:02 +09:00
Masatake Iwasaki 03079be707 HADOOP-14544. DistCp documentation for command line options is misaligned. Contributed by Masatake Iwasaki.
(cherry picked from commit bbdbc7a9a1)
2019-04-12 11:59:14 +09:00
Steve Loughran b6ebe74526
HADOOP-16233. S3AFileStatus to declare that isEncrypted() is always true (#685)
This is needed to fix up some confusion about caching of job.addCache() handling of S3A paths; all parent dirs - the files are downloaded by the NM without using the DTs of the user submitting the job. This means that when you submit jobs to an EC2 cluster with lower IAM permissions than the user, cached resources don't get downloaded and the job doesn't start.

Production code changes:
* S3AFileStatus Adds "true" to the superclass's encrypted flag during construction.

Tests
* Base AbstractContractOpenTest can control whether zero byte files created in tests are encrypted. Not done via an XML attribute, just a subclass point. Thoughts?
* Verify that the filecache considers paths to not have the permissions which trigger reduce-privilege downloads
* And extend ITestDelegatedMRJob to test a completely different bucket (open street map), to verify that cached resources do get their tokens picked up

Docs:
* Advise FS developers to say all files are encrypted. It's otherwise harmless and it'll stop other people seeing impossible to debug error messages on app launch.

Contributed by Steve Loughran.

Change-Id: Ifaae4c9d735ccc5eafeebd2584b65daf2d4e5da3
(cherry picked from commit 366186d999)
2019-04-03 21:35:19 +01:00
Akira Ajisaka 80a8d3310e
HADOOP-16232. Fix errors in the checkstyle configuration xmls. Contributed by Wanqiang Ji.
(cherry picked from commit 8b6deebb1d)
2019-04-03 19:36:17 +09:00
Steve Loughran 60c9042286
HADOOP-16058. S3A tests to include Terasort.
Contributed by Steve Loughran.

This includes
 - HADOOP-15890. Some S3A committer tests don't match ITest* pattern; don't run in maven
 - MAPREDUCE-7090. BigMapOutput example doesn't work with paths off cluster fs
 - MAPREDUCE-7091. Terasort on S3A to switch to new committers
 - MAPREDUCE-7092. MR examples to work better against cloud stores
2019-03-29 15:25:45 +00:00
Siyao Meng 52cfbc39cc
HADOOP-16037. DistCp: Document usage of Sync (-diff option) in detail.
Contributed by Siyao Meng

(cherry picked from commit ce4bafdf44)
2019-03-26 18:43:43 +00:00
Andrew Olson ade3af6ef2
HADOOP-16147. Allow CopyListing sequence file keys and values to be more easily customized.
Author:    Andrew Olson
(cherry picked from commit faba3591d3)
2019-03-22 10:36:34 +00:00
Weiwei Yang 39f60faa60 HADOOP-16191. AliyunOSS: improvements for copyFile/copyDirectory and logging. Contributed by wujinhu.
(cherry picked from commit 568d3ab8b6)
2019-03-19 10:08:11 +08:00
Adam Antal 81a6ba1825
HADOOP-16124. Extend documentation in testing.md about S3 endpoint constants.
Contributed by Adam Antal.

(cherry picked from commit c0427c84dd)
2019-03-18 19:14:43 +00:00
Ben Roling 43e8ac6097
HADOOP-15625. S3A input stream to use etags/version number to detect changed source files.
Author: Ben Roling <ben.roling@gmail.com>

Initial patch from Brahma Reddy Battula.
2019-03-14 19:46:34 +00:00
Steve Loughran b6f6c34223
HADOOP-16109. Parquet reading S3AFileSystem causes EOF
Nobody gets seek right. No matter how many times they think they have.

Reproducible test from: Dave Christianson
Fixed seek() logic: Steve Loughran
2019-03-11 11:15:25 +00:00
Da Zhou cfaf21a4ba
HADOOP-16169. ABFS: Bug fix for getPathProperties.
Author:    Da Zhou <da.zhou@microsoft.com>
(cherry picked from commit e0260417ad)
2019-03-08 13:53:44 +00:00
Da Zhou dc38fc598d
HADOOP-16136. ABFS: Should only transform username to short name
Contributed by Da Zhou.

(cherry picked from commit 3988e75ca3)
Signed-off-by: Steve Loughran <stevel@apache.org>
2019-03-05 10:47:58 +00:00
Da Zhou 075f6b061c
HADOOP-15954. ABFS: Enable owner and group conversion for MSI and login user using OAuth.
Contributed by Da Zhou and Junhua Gu.

(cherry picked from commit 1f1655028e)
Signed-off-by: Steve Loughran <stevel@apache.org>
2019-03-05 10:44:46 +00:00
Da Zhou ae832ccffe
HADOOP-16041. Include Hadoop version in User-Agent string for ABFS.
Contributed by Shweta Yakkali.

Signed-off-by: Sean Mackrory <mackrorysd@apache.org>
(cherry picked from commit 02eb91856e)
Signed-off-by: Steve Loughran <stevel@apache.org>
2019-03-05 10:39:37 +00:00
Steve Loughran 685a41f449
HADOOP-16105. WASB in secure mode does not set connectingUsingSAS.
Contributed by Steve Loughran.

(cherry picked from commit 9cb2f470b759bbe7609a00e8f8f72779e2daae80)
2019-02-21 13:39:37 +00:00
Masatake Iwasaki dc9c3ce30b HADOOP-16104. Wasb tests to downgrade to skip when test a/c is namespace enabled. Contributed by Masatake Iwasaki.
(cherry picked from commit aa3ad36605)
2019-02-20 22:17:18 +09:00
bibinchundatt bdfdf12178 YARN-9309. Improve graph text in SLS to avoid overlapping. Contributed by Bilwa S T.
(cherry picked from commit 779dae4de7)
2019-02-20 00:37:47 +05:30
bibinchundatt f06ac51c37 YARN-9293. Optimize MockAMLauncher event handling. Contributed by Bibin A Chundatt.
(cherry picked from commit 134ae8fc80)
2019-02-14 22:58:37 +05:30
Ranith Sardar c5eca3f7ee
HADOOP-16032. Distcp It should clear sub directory ACL before applying new ACL on.
Contributed by Ranith Sardar.

(cherry picked from commit 546c5d70ef)
2019-02-07 21:49:18 +00:00
Andrew Olson 36f3e775d4
HADOOP-15281. Distcp to add no-rename copy option.
Contributed by Andrew Olson.

(cherry picked from commit de804e53b9)
2019-02-07 10:09:13 +00:00
Da Zhou 84ce0f1bfa
HADOOP-16074. WASB: Update container not found error code.
Contributed by Da Zhou.

(cherry picked from commit ba9efe06fa)
2019-02-05 14:41:15 +00:00
Steve Loughran bdd17be9ec
HDFS-13713. Add specification of Multipart Upload API to FS specification, with contract tests.
Contributed by Ewan Higgs and Steve Loughran.

(cherry picked from commit c1d24f8483)
2019-02-04 17:10:19 +00:00
Akira Ajisaka dc12754ab6
HADOOP-16065. -Ddynamodb should be -Ddynamo in AWS SDK testing document.
(cherry picked from commit 3c60303ac5)
2019-01-25 10:28:46 +09:00