1133 Commits

Author SHA1 Message Date
Steve Loughran b6ebe74526
HADOOP-16233. S3AFileStatus to declare that isEncrypted() is always true (#685)
This is needed to fix up some confusion in how job.addCache() caching handles S3A paths and all their parent dirs: the files are downloaded by the NM without using the DTs of the user submitting the job. This means that when you submit jobs to an EC2 cluster with lower IAM permissions than the user, cached resources don't get downloaded and the job doesn't start.

Production code changes:
* S3AFileStatus passes "true" as the superclass's encrypted flag during construction.

Tests
* Base AbstractContractOpenTest can control whether zero byte files created in tests are encrypted. Not done via an XML attribute, just a subclass point. Thoughts?
* Verify that the filecache considers paths to not have the permissions which trigger reduced-privilege downloads
* And extend ITestDelegatedMRJob to test a completely different bucket (open street map), to verify that cached resources do get their tokens picked up

Docs:
* Advise FS developers to say all files are encrypted. It's otherwise harmless and it'll stop other people seeing impossible to debug error messages on app launch.

Contributed by Steve Loughran.

Change-Id: Ifaae4c9d735ccc5eafeebd2584b65daf2d4e5da3
(cherry picked from commit 366186d999)
2019-04-03 21:35:19 +01:00
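
The production change described in HADOOP-16233 above (a status class that hands "true" to its superclass's encrypted flag) can be illustrated with a minimal sketch. The classes below, BaseStatus and ObjectStoreStatus, are hypothetical stand-ins invented for this example; they are not the Hadoop FileStatus or S3AFileStatus classes, and their constructors are assumptions.

```java
public class EncryptedStatusSketch {

    /** Hypothetical stand-in for a generic file status base class. */
    static class BaseStatus {
        private final String path;
        private final boolean encrypted;

        BaseStatus(String path, boolean encrypted) {
            this.path = path;
            this.encrypted = encrypted;
        }

        String getPath() {
            return path;
        }

        boolean isEncrypted() {
            return encrypted;
        }
    }

    /** Hypothetical object-store status that always declares itself encrypted. */
    static class ObjectStoreStatus extends BaseStatus {
        ObjectStoreStatus(String path) {
            // The flag is fixed to true at construction time, so callers
            // that inspect isEncrypted() never treat the path as unencrypted.
            super(path, true);
        }
    }

    public static void main(String[] args) {
        BaseStatus status = new ObjectStoreStatus("s3a://example-bucket/data.csv");
        System.out.println(status.getPath() + " encrypted=" + status.isEncrypted());
    }
}
```

Fixing the flag in the constructor, rather than overriding the getter, keeps any superclass logic that reads the stored flag consistent with what isEncrypted() reports.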
Akira Ajisaka 80a8d3310e
HADOOP-16232. Fix errors in the checkstyle configuration xmls. Contributed by Wanqiang Ji.
(cherry picked from commit 8b6deebb1d)
2019-04-03 19:36:17 +09:00
Steve Loughran 60c9042286
HADOOP-16058. S3A tests to include Terasort.
Contributed by Steve Loughran.

This includes
 - HADOOP-15890. Some S3A committer tests don't match ITest* pattern; don't run in maven
 - MAPREDUCE-7090. BigMapOutput example doesn't work with paths off cluster fs
 - MAPREDUCE-7091. Terasort on S3A to switch to new committers
 - MAPREDUCE-7092. MR examples to work better against cloud stores
2019-03-29 15:25:45 +00:00
Siyao Meng 52cfbc39cc
HADOOP-16037. DistCp: Document usage of Sync (-diff option) in detail.
Contributed by Siyao Meng

(cherry picked from commit ce4bafdf44)
2019-03-26 18:43:43 +00:00
Andrew Olson ade3af6ef2
HADOOP-16147. Allow CopyListing sequence file keys and values to be more easily customized.
Author:    Andrew Olson
(cherry picked from commit faba3591d3)
2019-03-22 10:36:34 +00:00
Weiwei Yang 39f60faa60 HADOOP-16191. AliyunOSS: improvements for copyFile/copyDirectory and logging. Contributed by wujinhu.
(cherry picked from commit 568d3ab8b6)
2019-03-19 10:08:11 +08:00
Adam Antal 81a6ba1825
HADOOP-16124. Extend documentation in testing.md about S3 endpoint constants.
Contributed by Adam Antal.

(cherry picked from commit c0427c84dd)
2019-03-18 19:14:43 +00:00
Ben Roling 43e8ac6097
HADOOP-15625. S3A input stream to use etags/version number to detect changed source files.
Author: Ben Roling <ben.roling@gmail.com>

Initial patch from Brahma Reddy Battula.
2019-03-14 19:46:34 +00:00
Steve Loughran b6f6c34223
HADOOP-16109. Parquet reading S3AFileSystem causes EOF
Nobody gets seek right. No matter how many times they think they have.

Reproducible test from: Dave Christianson
Fixed seek() logic: Steve Loughran
2019-03-11 11:15:25 +00:00
Da Zhou cfaf21a4ba
HADOOP-16169. ABFS: Bug fix for getPathProperties.
Author:    Da Zhou <da.zhou@microsoft.com>
(cherry picked from commit e0260417ad)
2019-03-08 13:53:44 +00:00
Da Zhou dc38fc598d
HADOOP-16136. ABFS: Should only transform username to short name
Contributed by Da Zhou.

(cherry picked from commit 3988e75ca3)
Signed-off-by: Steve Loughran <stevel@apache.org>
2019-03-05 10:47:58 +00:00
Da Zhou 075f6b061c
HADOOP-15954. ABFS: Enable owner and group conversion for MSI and login user using OAuth.
Contributed by Da Zhou and Junhua Gu.

(cherry picked from commit 1f1655028e)
Signed-off-by: Steve Loughran <stevel@apache.org>
2019-03-05 10:44:46 +00:00
Da Zhou ae832ccffe
HADOOP-16041. Include Hadoop version in User-Agent string for ABFS.
Contributed by Shweta Yakkali.

Signed-off-by: Sean Mackrory <mackrorysd@apache.org>
(cherry picked from commit 02eb91856e)
Signed-off-by: Steve Loughran <stevel@apache.org>
2019-03-05 10:39:37 +00:00
Steve Loughran 685a41f449
HADOOP-16105. WASB in secure mode does not set connectingUsingSAS.
Contributed by Steve Loughran.

(cherry picked from commit 9cb2f470b759bbe7609a00e8f8f72779e2daae80)
2019-02-21 13:39:37 +00:00
Masatake Iwasaki dc9c3ce30b HADOOP-16104. Wasb tests to downgrade to skip when test a/c is namespace enabled. Contributed by Masatake Iwasaki.
(cherry picked from commit aa3ad36605)
2019-02-20 22:17:18 +09:00
bibinchundatt bdfdf12178 YARN-9309. Improve graph text in SLS to avoid overlapping. Contributed by Bilwa S T.
(cherry picked from commit 779dae4de7)
2019-02-20 00:37:47 +05:30
bibinchundatt f06ac51c37 YARN-9293. Optimize MockAMLauncher event handling. Contributed by Bibin A Chundatt.
(cherry picked from commit 134ae8fc80)
2019-02-14 22:58:37 +05:30
Ranith Sardar c5eca3f7ee
HADOOP-16032. DistCp should clear sub-directory ACLs before applying new ACLs.
Contributed by Ranith Sardar.

(cherry picked from commit 546c5d70ef)
2019-02-07 21:49:18 +00:00
Andrew Olson 36f3e775d4
HADOOP-15281. Distcp to add no-rename copy option.
Contributed by Andrew Olson.

(cherry picked from commit de804e53b9)
2019-02-07 10:09:13 +00:00
Da Zhou 84ce0f1bfa
HADOOP-16074. WASB: Update container not found error code.
Contributed by Da Zhou.

(cherry picked from commit ba9efe06fa)
2019-02-05 14:41:15 +00:00
Steve Loughran bdd17be9ec
HDFS-13713. Add specification of Multipart Upload API to FS specification, with contract tests.
Contributed by Ewan Higgs and Steve Loughran.

(cherry picked from commit c1d24f8483)
2019-02-04 17:10:19 +00:00
Akira Ajisaka dc12754ab6
HADOOP-16065. -Ddynamodb should be -Ddynamo in AWS SDK testing document.
(cherry picked from commit 3c60303ac5)
2019-01-25 10:28:46 +09:00
Da Zhou 29de303e0a
HADOOP-16048. ABFS: Fix Date format parser.
Contributed by Da Zhou.

(cherry picked from commit 00ad9e23e8)
2019-01-22 16:41:33 +00:00
Da Zhou 1d4390e16b
HADOOP-16044. ABFS: Better exception handling of DNS errors followup
Contributed by Da Zhou.

(cherry picked from commit 30863c5ae3)
2019-01-14 19:45:30 +00:00
Da Zhou 8b5fbe7a12
HADOOP-15975. ABFS: remove timeout check for DELETE and RENAME.
Contributed by Da Zhou.
2019-01-11 11:12:39 +00:00
Da Zhou 9cb6000c8a
HADOOP-16036. WASB: Disable jetty logging configuration announcement.
Contributed by Da Zhou.

(cherry picked from commit 852701f793)
2019-01-10 12:08:27 +00:00
Da Zhou 6c2500d7ca
HADOOP-15662. Better exception handling of DNS errors.
Contributed by Da Zhou.

(cherry picked from commit 7211269142)
2019-01-10 12:03:48 +00:00
Da Zhou f7de630e85
HADOOP-16040. ABFS: Bug fix for tolerateOobAppends configuration.
Contributed by Da Zhou.

(cherry picked from commit e8d1900369)
2019-01-10 11:59:29 +00:00
Kai Xie 5dce9d75e6
HADOOP-16018. DistCp won't reassemble chunks when blocks per chunk > 0.
Contributed by Kai Xie.

(cherry picked from commit 188bebbe7e)
2019-01-08 13:34:51 +00:00
Weiwei Yang 977e0ff8b9 HADOOP-16030. AliyunOSS: bring fixes back from HADOOP-15671. Contributed by wujinhu.
(cherry picked from commit f87b3b11c4)
2019-01-07 16:06:03 +08:00
Sunil G 71bee05339
Revert "HADOOP-15759. AliyunOSS: Update oss-sdk version to 3.0.0. Contributed by Jinhu Wu."
This reverts commit e4fca6aae4.

Revert "HADOOP-15671. AliyunOSS: Support Assume Roles in AliyunOSS. Contributed by Jinhu Wu."

This reverts commit 2b635125fb.

(cherry picked from commit 1f425271a7)
2019-01-05 17:36:15 +09:00
Weiwei Yang 38ef85171d HADOOP-15323. AliyunOSS: Improve copy file performance for AliyunOSSFileSystemStore. Contributed by wujinhu.
(cherry picked from commit 040a202b20)
2019-01-03 21:40:43 +08:00
Da Zhou f122ae7279
HADOOP-16004. ABFS: Convert 404 error response in AbfsInputStream and AbfsOutPutStream to FileNotFoundException.
Contributed by Da Zhou.

(cherry picked from commit 346c0c8aff)
2018-12-17 11:18:12 +00:00
Da Zhou d09dbcc8fb
HADOOP-15972. ABFS: reduce list page size to 500.
Contributed by Da Zhou.
2018-12-17 11:08:17 +00:00
Da Zhou 87d9a54968
HADOOP-15969. ABFS: getNamespaceEnabled can fail blocking user access thru ACLs.
Contributed by Da Zhou.

(cherry picked from commit b2523d8100)
2018-12-17 11:05:39 +00:00
Da Zhou 2d2212a508
HADOOP-15968. ABFS: add try catch for UGI failure when initializing ABFS.
Contributed by Da Zhou.

(cherry picked from commit a8bbd818d5)
2018-12-04 13:40:03 +00:00
Da Zhou 9bc1fd4721
HADOOP-15957. WASB: Add asterisk wildcard support for PageBlobDirSet.
Contributed by Da Zhou.

(cherry picked from commit 7ccb640a66)
2018-11-30 10:13:57 +00:00
Steve Loughran fa1d4ba7d4
HADOOP-15932. Oozie unable to create sharelib in s3a filesystem.
Contributed by Steve Loughran.

(cherry picked from commit 4c106fca0c)
2018-11-27 20:40:48 +00:00
Da Zhou 1a3a4960d9
HADOOP-15940. ABFS: For HNS account, avoid unnecessary get call when doing Rename.
Contributed by Da Zhou <da.zhou@microsoft.com>
2018-11-27 18:11:30 +00:00
Da Zhou f5d2806c81
HADOOP-15872. ABFS: Update to target 2018-11-09 REST version for ADLS Gen 2.
Contributed by Junhua Gu and Da Zhou.

(cherry picked from commit a8302e398c)
2018-11-23 14:19:36 +00:00
Weiwei Yang fea9d37ad5 HADOOP-15943. AliyunOSS: add missing owner & group attributes for oss FileStatus. Contributed by wujinhu.
(cherry picked from commit 5ff0cf86a9)
2018-11-23 14:10:30 +08:00
Weiwei Yang 0b2cfc8ab8 HADOOP-15919. AliyunOSS: Enable Yarn to use OSS. Contributed by wujinhu.
(cherry picked from commit be0708c6eb)
2018-11-19 14:21:38 +08:00
Arpit Agarwal 351bfa1bcf HADOOP-12558. distcp documentation is woefully out of date. Contributed by Dinesh Chitlangia.
(cherry picked from commit 914b0cf15f)
2018-11-15 13:58:29 -08:00
Akira Ajisaka 8c9681d7f0
HADOOP-15926. Document upgrading the section in NOTICE.txt when upgrading the version of AWS SDK. Contributed by Dinesh Chitlangia.
(cherry picked from commit 66b1335bb3)
2018-11-15 16:31:05 +09:00
Sammi Chen 37082a664a HADOOP-15917. AliyunOSS: fix incorrect ReadOps and WriteOps in statistics. Contributed by Jinhu Wu.
(cherry picked from commit 3fade865ce)
(cherry picked from commit 64cb97fb44)
(cherry picked from commit 5d532cfc6f)
2018-11-14 13:48:51 +08:00
Da Zhou 4039840510
HADOOP-15876. Use keySet().removeAll() to remove multiple keys from Map in AzureBlobFileSystemStore
Contributed by Da Zhou.

(cherry picked from commit a13be203b7)
2018-11-13 21:48:05 +00:00
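
The technique named in HADOOP-15876 above, removing several keys from a Map through its key-set view, can be sketched as follows. This is an illustration of the standard java.util idiom only; the class name, the removeKeys helper and the sample property names are made up for the example and are not taken from AzureBlobFileSystemStore.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

public class RemoveKeysSketch {

    /** Removes every mapping whose key appears in keysToRemove. */
    static <K, V> void removeKeys(Map<K, V> map, Collection<K> keysToRemove) {
        // keySet() returns a live view backed by the map, so removing
        // elements from the view removes the corresponding mappings.
        map.keySet().removeAll(keysToRemove);
    }

    public static void main(String[] args) {
        Map<String, String> properties = new HashMap<>();
        properties.put("owner", "alice");
        properties.put("group", "staff");
        properties.put("etag", "0x1");

        // Keys that are not present in the map are simply ignored.
        removeKeys(properties, Arrays.asList("owner", "group", "absent"));
        System.out.println(properties);
    }
}
```

Compared with calling remove() once per key, the single removeAll() call states the intent in one line.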
Da Zhou 7440bc5a9c
HADOOP-15812. ABFS: Improve AbfsRestOperationException format to ensure full msg can be displayed on console.
Author:    Da Zhou <da.zhou@microsoft.com>
(cherry picked from commit 9dbb2b67c6)
2018-11-09 11:07:51 +00:00
Junhua Gu 66715005f9
HADOOP-15846. ABFS: fix mask related bugs in setAcl, modifyAclEntries and removeAclEntries.
Contributed by Junhua Gu.
2018-11-08 17:20:52 +00:00
Sammi Chen ca22bf175f HADOOP-15868. AliyunOSS: update document for properties of multiple part download, multiple part upload and directory copy. Contributed by Jinhu Wu.
(cherry picked from commit 7574d18538)
(cherry picked from commit 366541d834)
(cherry picked from commit c5a227062f)
2018-10-26 15:33:05 +08:00
Ted Yu a7dd244a49 HADOOP-15850. CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0. Contributed by Ted Yu.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit e2cecb681e)
2018-10-19 13:22:01 -07:00
Giovanni Matteo Fumarola 0b6a7b5f16 HDFS-14007. Incompatible layout when generating FSImage. Contributed by Inigo Goiri. 2018-10-18 17:26:22 -07:00
Sunil G bde4fd5ed9 Preparing for 3.2.0 release 2018-10-18 17:07:45 +05:30
Steve Loughran c19864d8c8
HADOOP-15848. ITestS3AContractMultipartUploader#testMultipartUploadEmptyPart test error.
Contributed by Ewan Higgs.
2018-10-16 19:58:58 +01:00
bibinchundatt aff21d7271 YARN-8830. SLS tool fix node addition. Contributed by Bibin A Chundatt.
(cherry picked from commit b4a38e7b3e)
2018-10-15 18:11:01 +05:30
Steve Loughran 70d39c74d9
HADOOP-15837. DynamoDB table Update can fail S3A FS init.
Contributed by Steve Loughran.

(cherry picked from commit ee816f1fd7)
2018-10-11 14:58:37 +01:00
Steve Loughran 83b9b25c51
HADOOP-15809. ABFS: better exception handling when making getAccessToken call.
Contributed by Da Zhou

(cherry picked from commit 273cc2d4e9)
2018-10-05 11:29:43 +01:00
Steve Loughran c6942a315b
HADOOP-15792. typo in AzureBlobFileSystem.getIsNamespaceEnabeld.
Contributed by Abhishek Modi.

(cherry picked from commit e8b8604314)
2018-10-03 12:59:16 +01:00
Steve Loughran e5e9d7b595
HADOOP-15795. Make HTTPS the default protocol for ABFS.
Contributed by Da Zhou.

(cherry picked from commit 7051bd78b1)
2018-10-03 12:53:56 +01:00
Steve Loughran a383ac47ca
HADOOP-15801. ABFS: Fixing skipUserGroupMetadata in AzureBlobFileSystemStore.
Contributed by Da Zhou
2018-10-02 11:42:52 +01:00
Steve Loughran 43bc984891
HADOOP-15793. ABFS: Skip unsupported test cases when non namespace enabled in ITestAzureBlobFileSystemAuthorization
Contributed by Yuan Gao.
2018-10-02 11:37:28 +01:00
Steve Loughran a4abf02028
HADOOP-15739. ABFS: remove unused maven dependencies and add used undeclared dependencies.
Contributed by Da Zhou.
2018-09-25 20:58:32 +01:00
Steve Loughran d5da9928c9 HADOOP-15723. ABFS: Ranger Support.
Contributed by Yuan Gao.
2018-09-25 19:13:10 +01:00
Sammi Chen 2b635125fb HADOOP-15671. AliyunOSS: Support Assume Roles in AliyunOSS. Contributed by Jinhu Wu. 2018-09-25 19:48:30 +08:00
Mingliang Liu c07715e378 HADOOP-15781 S3A assumed role tests failing due to changed error text in AWS exceptions. Contributed by Steve Loughran 2018-09-24 12:53:21 -07:00
Sunil G d060cbea48 HDFS-13937. Multipart Uploader APIs to be marked as private/unstable in 3.2.0. Contributed by Steve Loughran. 2018-09-24 21:19:47 +05:30
Sean Mackrory 0def61482b Merge branch 'HADOOP-15407' into trunk 2018-09-22 21:19:12 -06:00
Steve Loughran d0b4624c88
HADOOP-15778. ABFS: Fix client side throttling for read.
Contributed by Sneha Varma.
2018-09-21 11:06:24 +01:00
Steve Loughran a5692c2da5 HADOOP-15704. Mark ABFS extension package and interfaces as LimitedPrivate/Unstable.
Contributed by Steve Loughran.
2018-09-20 17:36:18 +01:00
Sean Mackrory 8e831ba458 HADOOP-15773. Fixing checkstyle and other issues raised by Yetus. 2018-09-19 16:56:33 -06:00
Steve Loughran a55d26b23e
HADOOP-15769. ABFS: distcp tests are always skipped.
Contributed by Steve Loughran
2018-09-19 13:57:39 +01:00
Steve Loughran df2166a643
HADOOP-15719. Fail-fast when using OAuth over http.
Contributed by Da Zhou.
2018-09-18 12:20:52 +01:00
Steve Loughran 51d368982b
HADOOP-15714. Tune abfs/wasb parallel and sequential test execution.
Contributed by Da Zhou.
2018-09-18 12:09:25 +01:00
Steve Loughran 524776625d
HADOOP-15715. ITestAzureBlobFileSystemE2E timing out with non-scale timeout of 10 min.
Contributed by Da Zhou
2018-09-18 11:48:46 +01:00
Steve Loughran 1cf38a38da
HADOOP-15744. AbstractContractAppendTest fails against HDFS on HADOOP-15407 branch.
Contributed by Steve Loughran.
2018-09-18 10:56:56 +01:00
Steve Loughran 26d0c63a1e
HADOOP-15754. s3guard: testDynamoTableTagging should clear existing config.
Contributed by Gabor Bota.
2018-09-17 22:40:08 +01:00
Thomas Marquardt b4c23043d3 HADOOP-15757. ABFS: remove dependency on common-codec Base64.
Contributed by Da Zhou.
2018-09-17 19:54:01 +00:00
Thomas Marquardt 26211019c8 HADOOP-15753. ABFS: support path "abfs://mycluster/file/path"
Contributed by Da Zhou.
2018-09-17 19:54:01 +00:00
Thomas Marquardt e5593cbd83 HADOOP-15694. ABFS: Allow OAuth credentials to not be tied to accounts.
Contributed by Sean Mackrory.
2018-09-17 19:54:01 +00:00
Thomas Marquardt 13c70e9ba3 HADOOP-15740. ABFS: Check variable names during initialization of AbfsClientThrottlingIntercept.
Contributed by Sneha Varma.
2018-09-17 19:54:01 +00:00
Thomas Marquardt 6801b30733 HADOOP-15728. ABFS: Add backward compatibility to handle Unsupported Operation
for storage account with no namespace feature.

Contributed by Da Zhou.
2018-09-17 19:54:01 +00:00
Thomas Marquardt 347a52a866 Fixing findbugs and license issues related to:
HADOOP-15703. ABFS - Implement client-side throttling.
Contributed by Sneha Varma and Thomas Marquardt.
2018-09-17 19:54:01 +00:00
Thomas Marquardt 97f06b3fc7 HADOOP-15703. ABFS - Implement client-side throttling.
Contributed by Sneha Varma and Thomas Marquardt.
2018-09-17 19:54:01 +00:00
Thomas Marquardt 4410eacba7 HADOOP-15664. ABFS: Reduce test run time via parallelization and grouping.
Contributed by Da Zhou.
2018-09-17 19:54:01 +00:00
Thomas Marquardt 81dc4a995c HADOOP-15663. ABFS: Simplify configuration.
Contributed by Da Zhou.
2018-09-17 19:54:01 +00:00
Thomas Marquardt df57c6c3b1 HADOOP-15692. ABFS: extensible support for custom oauth.
Contributed by Junhua Gu and Rajeev Bansal.
2018-09-17 19:54:01 +00:00
Thomas Marquardt dd2b22fa31 HADOOP-15682. ABFS: Add support for StreamCapabilities. Fix javadoc and checkstyle.
Contributed by Thomas Marquardt.
2018-09-17 19:54:01 +00:00
Thomas Marquardt 6b6f8cc2be HADOOP-15688. ABFS: InputStream wrapped in FSDataInputStream twice.
Contributed by Sean Mackrory.
2018-09-17 19:54:01 +00:00
Thomas Marquardt 9c1e4e8139 HADOOP-15661. ABFS: Add support for ACL.
Contributed by Junhua Gu and Da Zhou.
2018-09-17 19:54:01 +00:00
Thomas Marquardt 9149b9703e HADOOP-15660. ABFS: Add support for OAuth
Contributed by Da Zhou, Rajeev Bansal, and Junhua Gu.
2018-09-17 19:54:01 +00:00
Thomas Marquardt d6a4f39bd5 HADOOP-15669. ABFS: Improve HTTPS Performance.
Contributed by Vishwajeet Dusane.
2018-09-17 19:54:01 +00:00
Thomas Marquardt cc5cc60c41 Fixing issue due to commit 2b2399d6 after rebase onto trunk. 2018-09-17 19:54:01 +00:00
Thomas Marquardt b54b0c1b67 HADOOP-15659. Code changes for bug fix and new tests.
Contributed by Da Zhou.
2018-09-17 19:54:01 +00:00
Thomas Marquardt ce03a93f78 HADOOP-15446. ABFS: tune imports & javadocs; stabilise tests.
Contributed by Steve Loughran and Da Zhou.
2018-09-17 19:54:01 +00:00
Steve Loughran a271fd0eca HADOOP-15560. ABFS: removed dependency injection and unnecessary dependencies.
Contributed by Da Zhou.
2018-09-17 19:54:01 +00:00
Steve Loughran f044deedbb HADOOP-15407. HADOOP-15540. Support Windows Azure Storage - Blob file system "ABFS" in Hadoop: Core Commit.
Contributed by Shane Mainali, Thomas Marquardt, Zichen Sun, Georgi Chalakov, Esfandiar Manii, Amit Singh, Dana Kaban, Da Zhou, Junhua Gu, Saher Ahwal, Saurabh Pant, James Baker, Shaoyu Zhang, Lawrence Chen, Kevin Chen and Steve Loughran
2018-09-17 19:54:01 +00:00
Steve Loughran d7c0a08a1c
HADOOP-15426 Make S3guard client resilient to DDB throttle events and network failures (Contributed by Steve Loughran) 2018-09-12 21:04:49 -07:00
Aaron Fabbri d32a8d5d58
HADOOP-14734 add option to tag DDB table(s) created. (Contributed by Gabor Bota and Abe Fine) 2018-09-12 16:36:01 -07:00
Mingliang Liu 1f6c4545cf HADOOP-15750. Remove obsolete S3A test ITestS3ACredentialsInURL. Contributed by Steve Loughran 2018-09-12 10:58:39 -07:00
Sean Mackrory 47b72c87eb HADOOP-15635. s3guard set-capacity command to fail fast if bucket is unguarded.
Contributed by Gabor Bota.
2018-09-12 09:12:38 -06:00
bibinchundatt c44088ac19 YARN-8739. Fix jenkins issues for Node Attributes branch. Contributed by Sunil Govindan. 2018-09-12 16:01:01 +05:30