Commit Graph

241 Commits

Author SHA1 Message Date
Chao Sun d82b0cc439
HADOOP-16080. hadoop-aws does not work with hadoop-client-api (#2510). Contributed by Chao Sun 2020-12-03 16:06:33 -08:00
Steve Loughran 134539f054
HADOOP-17199. S3A Directory Marker HADOOP-13230 backport (#2210)
This backports the listing-side changes of HADOOP-13230.

With this patch in, this branch of Hadoop is compatible with S3A clients
which do not delete directory markers when files are created underneath.

It does not allow this version to disable marker deletion; if the
fs.s3a.marker.retention option is changed to request this, a message
is printed at INFO and the policy remains at "delete".

The s3guard bucket-info command has been extended to support
probing for marker retention, as has the hasPathCapability method on
S3AFileSystem.

Read the documentation!
2020-08-25 22:47:43 +01:00
Steve Loughran 42c71a5790
HADOOP-15691. Add PathCapabilities to FileSystem and FileContext.
Contributed by Steve Loughran.

This complements the StreamCapabilities Interface by allowing applications to probe whether a specific path on a specific instance of a FileSystem client
offers a specific capability.

This is intended to allow applications to determine:

* Whether a method is implemented before calling it and dealing with UnsupportedOperationException.
* Whether a specific feature is believed to be available in the remote store.

As well as a common set of capabilities defined in CommonPathCapabilities,
file systems are free to add their own capabilities, prefixed with
 fs. + schema + .

The plan is to identify and document more capabilities and, for file systems which add new features, to ensure that a declaration of the feature's availability is always provided.

Note

* The remote store is not expected to be checked for the feature;
  it is more a check of the client API and the client's configuration/knowledge
  of the state of the remote system.
* Permissions are not checked.
2020-08-19 17:15:06 +01:00
Ayush Saxena d6a9ed8140 HDFS-15514. Remove useless dfs.webhdfs.enabled. Contributed by Fei Hui. 2020-08-07 22:23:02 +05:30
Masatake Iwasaki 77e69a73da HADOOP-17040. Fix intermittent failure of ITestBlockingThreadPoolExecutorService. (#2020)
(cherry picked from commit 9685314633)
2020-05-22 21:30:57 +09:00
Masatake Iwasaki 89696b66e7 HADOOP-17025. Fix invalid metastore configuration in S3GuardTool tests. (#1994)
(cherry picked from commit 99840aaba6)

 Conflicts:
	hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/AbstractS3GuardToolTestBase.java
2020-05-07 12:11:37 +09:00
Steve Loughran 204d54005a HADOOP-16117. Update AWS SDK to 1.11.563.
Contributed by Steve Loughran.

Change-Id: I7c46ed2a6378e1370f567acf4cdcfeb93e43fa13
2020-04-24 10:46:24 +01:00
Mingliang Liu d19981fe48
HADOOP-16758. Refine testing.md to tell user better how to use auth-keys.xml (#1753)
Contributed by Mingliang Liu
2019-12-11 11:54:12 -08:00
Mingliang Liu 8a60429e0b
HADOOP-16735. Make it clearer in config default that EnvironmentVariableCredentialsProvider supports AWS_SESSION_TOKEN. Contributed by Mingliang Liu
This closes #1733
2019-12-05 17:50:28 -08:00
Rajat Khandelwal 12e0053932
HADOOP-16278. With S3A Filesystem, Long Running services End up Doing lot of GC and eventually die.
Contributed by Rajat Khandelwal

(cherry picked from commit 591ca69823)
2019-05-09 21:14:37 +01:00
Steve Loughran b6ebe74526
HADOOP-16233. S3AFileStatus to declare that isEncrypted() is always true (#685)
This is needed to fix up some confusion about caching of job.addCache() handling of S3A paths; all parent dirs - the files are downloaded by the NM without using the DTs of the user submitting the job. This means that when you submit jobs to an EC2 cluster with lower IAM permissions than the user, cached resources don't get downloaded and the job doesn't start.

Production code changes:
* S3AFileStatus adds "true" to the superclass's encrypted flag during construction.

Tests
* Base AbstractContractOpenTest can control whether zero byte files created in tests are encrypted. Not done via an XML attribute, just a subclass point. Thoughts?
* Verify that the filecache considers paths to not have the permissions which trigger reduce-privilege downloads
* And extend ITestDelegatedMRJob to test a completely different bucket (open street map), to verify that cached resources do get their tokens picked up

Docs:
* Advise FS developers to say all files are encrypted. It's otherwise harmless and it'll stop other people seeing impossible to debug error messages on app launch.

Contributed by Steve Loughran.

Change-Id: Ifaae4c9d735ccc5eafeebd2584b65daf2d4e5da3
(cherry picked from commit 366186d999)
2019-04-03 21:35:19 +01:00
Steve Loughran 60c9042286
HADOOP-16058. S3A tests to include Terasort.
Contributed by Steve Loughran.

This includes
 - HADOOP-15890. Some S3A committer tests don't match ITest* pattern; don't run in maven
 - MAPREDUCE-7090. BigMapOutput example doesn't work with paths off cluster fs
 - MAPREDUCE-7091. Terasort on S3A to switch to new committers
 - MAPREDUCE-7092. MR examples to work better against cloud stores
2019-03-29 15:25:45 +00:00
Adam Antal 81a6ba1825
HADOOP-16124. Extend documentation in testing.md about S3 endpoint constants.
Contributed by Adam Antal.

(cherry picked from commit c0427c84dd)
2019-03-18 19:14:43 +00:00
Ben Roling 43e8ac6097
HADOOP-15625. S3A input stream to use etags/version number to detect changed source files.
Author: Ben Roling <ben.roling@gmail.com>

Initial patch from Brahma Reddy Battula.
2019-03-14 19:46:34 +00:00
Steve Loughran b6f6c34223
HADOOP-16109. Parquet reading S3AFileSystem causes EOF
Nobody gets seek right. No matter how many times they think they have.

Reproducible test from: Dave Christianson
Fixed seek() logic: Steve Loughran
2019-03-11 11:15:25 +00:00
Andrew Olson 36f3e775d4
HADOOP-15281. Distcp to add no-rename copy option.
Contributed by Andrew Olson.

(cherry picked from commit de804e53b9)
2019-02-07 10:09:13 +00:00
Steve Loughran bdd17be9ec
HDFS-13713. Add specification of Multipart Upload API to FS specification, with contract tests.
Contributed by Ewan Higgs and Steve Loughran.

(cherry picked from commit c1d24f8483)
2019-02-04 17:10:19 +00:00
Akira Ajisaka dc12754ab6
HADOOP-16065. -Ddynamodb should be -Ddynamo in AWS SDK testing document.
(cherry picked from commit 3c60303ac5)
2019-01-25 10:28:46 +09:00
Steve Loughran fa1d4ba7d4
HADOOP-15932. Oozie unable to create sharelib in s3a filesystem.
Contributed by Steve Loughran.

(cherry picked from commit 4c106fca0c)
2018-11-27 20:40:48 +00:00
Akira Ajisaka 8c9681d7f0
HADOOP-15926. Document upgrading the section in NOTICE.txt when upgrading the version of AWS SDK. Contributed by Dinesh Chitlangia.
(cherry picked from commit 66b1335bb3)
2018-11-15 16:31:05 +09:00
Steve Loughran c19864d8c8
HADOOP-15848. ITestS3AContractMultipartUploader#testMultipartUploadEmptyPart test error.
Contributed by Ewan Higgs.
2018-10-16 19:58:58 +01:00
Steve Loughran 70d39c74d9
HADOOP-15837. DynamoDB table Update can fail S3A FS init.
Contributed by Steve Loughran.

(cherry picked from commit ee816f1fd7)
2018-10-11 14:58:37 +01:00
Mingliang Liu c07715e378 HADOOP-15781 S3A assumed role tests failing due to changed error text in AWS exceptions. Contributed by Steve Loughran 2018-09-24 12:53:21 -07:00
Sunil G d060cbea48 HDFS-13937. Multipart Uploader APIs to be marked as private/unstable in 3.2.0. Contributed by Steve Loughran. 2018-09-24 21:19:47 +05:30
Steve Loughran 26d0c63a1e
HADOOP-15754. s3guard: testDynamoTableTagging should clear existing config.
Contributed by Gabor Bota.
2018-09-17 22:40:08 +01:00
Steve Loughran d7c0a08a1c
HADOOP-15426 Make S3guard client resilient to DDB throttle events and network failures (Contributed by Steve Loughran) 2018-09-12 21:04:49 -07:00
Aaron Fabbri d32a8d5d58
HADOOP-14734 add option to tag DDB table(s) created. (Contributed by Gabor Bota and Abe Fine) 2018-09-12 16:36:01 -07:00
Mingliang Liu 1f6c4545cf HADOOP-15750. Remove obsolete S3A test ITestS3ACredentialsInURL. Contributed by Steve Loughran 2018-09-12 10:58:39 -07:00
Sean Mackrory 47b72c87eb HADOOP-15635. s3guard set-capacity command to fail fast if bucket is unguarded.
Contributed by Gabor Bota.
2018-09-12 09:12:38 -06:00
Mingliang Liu 87f63b6479 HADOOP-14833. Remove s3a user:secret authentication. Contributed by Steve Loughran 2018-09-11 17:18:42 -07:00
Gabor Bota 36c7c78260
HADOOP-15709 Move S3Guard LocalMetadataStore constants to org.apache.hadoop.fs.s3a.Constants (Contributed by Gabor Bota) 2018-09-07 10:25:20 -07:00
Steve Loughran 5a0babf765
HADOOP-15107. Stabilize/tune S3A committers; review correctness & docs.
Contributed by Steve Loughran.
2018-08-30 14:49:53 +01:00
Steve Loughran 2e6c1109dc
HADOOP-15667. FileSystemMultipartUploader should verify that UploadHandle has non-0 length.
Contributed by Ewan Higgs
2018-08-30 14:33:16 +01:00
Aaron Fabbri d7232857d8
HADOOP-14154 Persist isAuthoritative bit in DynamoDBMetaStore (Contributed by Gabor Bota) 2018-08-17 10:15:39 -07:00
Steve Loughran 0e832e7a74
HADOOP-15642. Update aws-sdk version to 1.11.375.
Contributed by Steve Loughran.
2018-08-16 09:58:46 -07:00
Akira Ajisaka 3e3963b035
HADOOP-15552. Move logging APIs over to slf4j in hadoop-tools - Part2. Contributed by Ian Pickering. 2018-08-16 00:31:59 +09:00
Ewan Higgs a13929ddcb HADOOP-15645. ITestS3GuardToolLocal.testDiffCommand fails if bucket has per-bucket binding to DDB. Contributed by Steve Loughran. 2018-08-13 12:57:45 +02:00
Steve Loughran da9a39eed1
HADOOP-15583. Stabilize S3A Assumed Role support.
Contributed by Steve Loughran.
2018-08-08 22:57:24 -07:00
Ewan Higgs 2ec97abb2e HADOOP-15576. S3A Multipart Uploader to work with S3Guard and encryption Originally contributed by Ewan Higgs with refinements by Steve Loughran. 2018-08-08 13:50:23 +02:00
Sean Mackrory 7862f1523f HADOOP-15400. Improve S3Guard documentation on Authoritative Mode implementation. (Contributed by Gabor Bota) 2018-08-07 20:13:09 -06:00
Steve Loughran 48673bc2a8
HADOOP-15626. FileContextMainOperationsBaseTest.testBuilderCreateAppendExistingFile fails on filesystems without append.
Contributed by Steve Loughran.
2018-08-03 16:06:00 -07:00
Sean Mackrory 59adeb8d7f HADOOP-15636. Follow-up from HADOOP-14918; restoring test under new name. Contributed by Gabor Bota. 2018-07-27 18:23:29 -06:00
Sean Mackrory a08812a1b1 HADOOP-15349. S3Guard DDB retryBackoff to be more informative on limits exceeded. Contributed by Gabor Bota. 2018-07-12 17:24:01 +02:00
Sean Mackrory d503f65b66 HADOOP-15541. [s3a] Shouldn't try to drain stream before aborting
connection in case of timeout.
2018-07-10 17:52:57 +02:00
Aaron Fabbri 93ac01cb59
HADOOP-15215 s3guard set-capacity command to fail on read/write of 0 (Gabor Bota) 2018-07-03 13:50:11 -07:00
Akira Ajisaka 2b2399d623
HADOOP-15495. Upgrade commons-lang version to 3.7 in hadoop-common-project and hadoop-tools. Contributed by Takanobu Asanuma. 2018-06-28 14:37:22 +09:00
Sean Mackrory c687a6617d HADOOP-15423. Merge fileCache and dirCache into one single cache in LocalMetadataStore. Contributed by Gabor Bota. 2018-06-25 14:59:41 -06:00
Sean Mackrory 55fad6a3de HADOOP-15416. Clear error message in S3Guard diff if source not found. Contributed by Gabor Bota. 2018-06-22 11:36:56 -06:00
Sean Mackrory b089a06793 HADOOP-14918. Remove the Local Dynamo DB test option. Contributed by Gabor Bota. 2018-06-20 16:45:08 -06:00
Chris Douglas 980031bb04 HADOOP-13186. Multipart Uploader API. Contributed by Ewan Higgs 2018-06-17 11:54:26 -07:00