1130 Commits

Author SHA1 Message Date
Akira Ajisaka
d4f75e2798
HADOOP-16808. Use forkCount and reuseForks parameters instead of forkMode in the config of maven surefire plugin. Contributed by Xieming Li.
(cherry picked from commit f6d20daf404fab28b596171172afa4558facb504)
2020-01-21 18:03:56 +09:00
Steve Loughran
429d5db3d9
HADOOP-16785. followup to abfs close() fix.
Adds one extra test to the ABFS close logic, to explicitly
verify that the close sequence of FilterOutputStream is
not going to fail.

This is just a due-diligence patch, but it helps ensure
that no regressions creep in in future.

Contributed by Steve Loughran.

Change-Id: Ifd33a8c322d32513411405b15f50a1aebcfa6e48
2020-01-20 16:26:33 +00:00
Steve Loughran
e21cb8f96e HADOOP-16785. Improve wasb and abfs resilience on double close() calls.
This hardens the wasb and abfs output streams' resilience to being invoked
in/after close().

wasb:
  Explicitly raise IOEs on operations invoked after close,
  rather than implicitly raising NPEs.
  This ensures that invocations which catch and swallow IOEs will perform as
  expected.
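
The wasb behaviour described above can be sketched roughly as follows. This is a minimal illustration only, not the actual wasb stream: the class `ClosableGuardStream` and its `checkOpen()` helper are hypothetical names. It assumes an implementation that nulls its inner stream on close, which is the kind of code that would otherwise raise an NPE on later use.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical stream used for illustration; not the real wasb output stream. */
final class ClosableGuardStream extends OutputStream {
  // Set to null on close; without the guard below, later calls would raise an NPE.
  private OutputStream inner;

  ClosableGuardStream(OutputStream inner) {
    this.inner = inner;
  }

  /** Raise an IOException, rather than an NPE, if the stream is already closed. */
  private void checkOpen() throws IOException {
    if (inner == null) {
      throw new IOException("Stream is closed");
    }
  }

  @Override
  public void write(int b) throws IOException {
    checkOpen();
    inner.write(b);
  }

  @Override
  public void flush() throws IOException {
    checkOpen();
    inner.flush();
  }

  @Override
  public void close() throws IOException {
    if (inner == null) {
      return; // a second close() is a no-op
    }
    try {
      inner.close();
    } finally {
      inner = null;
    }
  }

  /** Returns true if a write after a double close() surfaces as an IOException. */
  static boolean writeAfterCloseRaisesIOE() {
    ClosableGuardStream s = new ClosableGuardStream(new ByteArrayOutputStream());
    try {
      s.close();
      s.close();
      s.write(1);
      return false;
    } catch (IOException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(writeAfterCloseRaisesIOE());
  }
}
```

Callers that catch and swallow IOExceptions during cleanup keep working with this pattern, whereas an NPE would escape their catch blocks.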

abfs:
  When rethrowing an IOException in the close() call, explicitly wrap it
  with a new instance of the same subclass.
  This is needed to handle failures in try-with-resources clauses, where
  any exception in close() is added as a suppressed exception to the one
  thrown in the try {} clause
  *and you cannot attach the same exception to itself*
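
A minimal sketch of the self-suppression problem and the wrapping fix, under stated assumptions: `CloseWrapSketch`, `FailingResource` and the reflective `wrap()` helper are hypothetical and are not the ABFS code. The helper tries to build a new instance of the same subclass through a public `(String)` constructor and falls back to a plain `IOException` when that is not possible.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

final class CloseWrapSketch {

  /**
   * Hypothetical helper: wrap an IOException in a new instance of the same
   * subclass where a public (String) constructor exists, keeping the
   * original as the cause; otherwise fall back to a plain IOException.
   */
  static IOException wrap(IOException e) {
    try {
      IOException copy =
          e.getClass().getConstructor(String.class).newInstance(e.getMessage());
      copy.initCause(e);
      return copy;
    } catch (ReflectiveOperationException | IllegalStateException ex) {
      return new IOException(e.getMessage(), e);
    }
  }

  /** A resource whose close() rethrows the failure already raised by write(). */
  static final class FailingResource implements AutoCloseable {
    private final IOException failure;
    private final boolean wrapOnClose;

    FailingResource(IOException failure, boolean wrapOnClose) {
      this.failure = failure;
      this.wrapOnClose = wrapOnClose;
    }

    void write() throws IOException {
      throw failure;
    }

    @Override
    public void close() throws IOException {
      // Rethrowing the same instance makes try-with-resources call
      // failure.addSuppressed(failure), which is rejected.
      throw wrapOnClose ? wrap(failure) : failure;
    }
  }

  /** Runs a try-with-resources block and returns whatever it ends up throwing. */
  static Throwable run(boolean wrapOnClose) {
    IOException failure = new FileNotFoundException("lost connection");
    try (FailingResource r = new FailingResource(failure, wrapOnClose)) {
      r.write();
      return null;
    } catch (Throwable t) {
      return t;
    }
  }

  public static void main(String[] args) {
    System.out.println("unwrapped: " + run(false));
    System.out.println("wrapped:   " + run(true));
  }
}
```

Without wrapping, the caller sees an IllegalArgumentException from `Throwable.addSuppressed` instead of the original failure. With wrapping, the original exception propagates and the wrapped copy is attached as a suppressed exception.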

Contributed by Steve Loughran.

Change-Id: Ic44b494ff5da332b47d6c198ceb67b965d34dd1b
2020-01-08 12:04:11 +00:00
Steve Loughran
5410732cff
HADOOP-16775. DistCp reuses the same temp file within the task for different files.
Contributed by Amir Shenavandeh.

This avoids overwrite consistency issues with S3 and other stores, though
given that S3's copy operation is O(data), you are still best off using -direct
when distcp-ing to it.

Change-Id: I8dc9f048ad0cc57ff01543b849da1ce4eaadf8c3
2020-01-02 15:37:55 +00:00
Akira Ajisaka
7201384e1b
HADOOP-16771. Update checkstyle to 8.26 and maven-checkstyle-plugin to 3.1.0. Contributed by Andras Bokor.
(cherry picked from commit f777cd398f1d48898ddc4a9a5ab4e7e310e3027a)
2019-12-20 13:12:02 +09:00
Mingliang Liu
d19981fe48
HADOOP-16758. Refine testing.md to tell user better how to use auth-keys.xml (#1753)
Contributed by Mingliang Liu
2019-12-11 11:54:12 -08:00
Sneha Vijayarajan
aa9cd0a2d6
HADOOP-16660. ABFS: Make RetryCount in ExponentialRetryPolicy Configurable.
Contributed by Sneha Vijayarajan.
2019-12-08 21:32:13 -08:00
bilaharith
c225efe237
HADOOP-16455. ABFS: Implement FileSystem.access() method.
Contributed by Bilahari T H.
2019-12-08 21:32:02 -08:00
Jeetesh Mangwani
b1e748f45b
HADOOP-16612. Track Azure Blob File System client-perceived latency
Contributed by Jeetesh Mangwani.

This adds the ability to track the end-to-end performance of ADLS Gen 2 REST APIs by measuring latency in the Hadoop ABFS driver.
The latency information is sent back to the ADLS Gen 2 REST API endpoints in the subsequent requests.
2019-12-08 21:31:51 -08:00
bilaharith
ffeb6d8ece
HADOOP-16587. Make ABFS AAD endpoints configurable.
Contributed by Bilahari T H.

This also addresses HADOOP-16498: AzureADAuthenticator cannot authenticate
in China.

Change-Id: I2441dd48b50b59b912b0242f7f5a4418cf94a87c
2019-12-08 21:31:39 -08:00
Sneha Vijayarajan
8b2c7e0c4d
HADOOP-16578 : Avoid FileSystem API calls when FileSystem already exists 2019-12-08 21:31:24 -08:00
Sneha Vijayarajan
546db6428e
HADOOP-16548 : Disable Flush() over config 2019-12-08 21:31:08 -08:00
Mingliang Liu
8a60429e0b
HADOOP-16735. Make it clearer in config default that EnvironmentVariableCredentialsProvider supports AWS_SESSION_TOKEN. Contributed by Mingliang Liu
This closes #1733
2019-12-05 17:50:28 -08:00
Szilard Nemeth
62622ab9c1 YARN-9836. General usability improvements in showSimulationTrace.html. Contributed by Adam Antal 2019-11-19 21:21:17 +01:00
Andras Bokor
89e95370a4 HADOOP-16710. Testing_azure.md documentation is misleading.
Contributed by Andras Bokor.

Change-Id: Icf07a53145936953629c7dace2e9648b7b21588d
2019-11-17 17:06:10 +00:00
Siyao Meng
e0cf1735e1 HADOOP-16676. Backport HADOOP-16152 to branch-3.2. Contributed by Siyao Meng.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-11-12 11:38:42 -08:00
Da Zhou
fe96407451
HADOOP-16640. WASB: Override getCanonicalServiceName() to return URI
(cherry picked from commit 9a8edb0aeddd7787b2654f6e2a8465c325e048a2)
2019-10-16 14:27:11 -07:00
Rohith Sharma K S
7d5bb2ebb7 Preparing for 3.2.2-SNAPSHOT development. 2019-09-07 08:52:08 +05:30
bilaharith
3b3c0c4b87 HADOOP-16479. ABFS FileStatus.getModificationTime returns localized time instead of UTC.
Contributed by Bilahari T H

Change-Id: I532055baaadfd7c324710e4b25f60cdf0378bdc0
2019-08-27 19:08:38 +00:00
Robert Levas
ce23e971b4 HADOOP-16340. ABFS driver continues to retry on IOException responses from REST operations.
Contributed by Robert Levas.

This makes the HttpException constructor protected rather than public, so it is possible
to implement custom subclasses of this exception: exceptions which will not be retried.

Change-Id: Ie8aaa23a707233c2db35948784908b6778ff3a8f
2019-08-27 19:08:29 +00:00
Da Zhou
a6d50a9054 HADOOP-16376. ABFS: Override access() to no-op.
Contributed by Da Zhou.

Change-Id: Ia0024bba32250189a87eb6247808b2473c331ed0
2019-08-27 19:04:16 +00:00
Da Zhou
dd636127e9 HADOOP-16269. ABFS: add listFileStatus with StartFrom.
Author:    Da Zhou
2019-08-27 19:01:21 +00:00
Da Zhou
006ae258b3 HADOOP-16163. NPE in setup/teardown of ITestAbfsDelegationTokens.
Contributed by Da Zhou.

Signed-off-by: Steve Loughran <stevel@apache.org>
2019-08-27 19:01:21 +00:00
Akira Ajisaka
afb3f329fd
YARN-9774. Fix order of arguments for assertEquals in TestSLSUtils. Contributed by Nikhil Navadiya.
(cherry picked from commit 84b1982060422760702eca6f1ef515c6ad3e85a5)
2019-08-23 14:40:15 +09:00
bibinchundatt
69255fa1b9 YARN-9765. SLS runner crashes when run with metrics turned off. Contributed by Abhishek Modi.
(cherry picked from commit 10ec31d20ee1b6a0b1da915acb6b6ec33f2cd415)
2019-08-21 13:57:53 +05:30
KAI XIE
b3c14d4132 HADOOP-16158. DistCp to support checksum validation when copy blocks in parallel (#919)
* DistCp to support checksum validation when copy blocks in parallel

* address review comments

* add checksums comparison test for combine mode

(cherry picked from commit c765584eb231f8482f5b90b7e8f61f9f7a931d09)
2019-08-18 18:48:21 -07:00
Da Zhou
330e450397
HADOOP-16315. ABFS: transform full UPN for named user in AclStatus
Contributed by Da Zhou

Change-Id: Ibc78322415fcbeff89c06c8586c53f5695550290
2019-08-12 09:41:52 +08:00
Ayush Saxena
35ff1ce42c HADOOP-16440. Distcp can not preserve timestamp with -delete option. Contributed by ludun. 2019-07-20 13:29:45 +05:30
Arun Singh
5f2d07af1b
HADOOP-16404. ABFS default blocksize change(256MB from 512MB)
Contributed by: Arun Singh
2019-07-19 20:34:28 -07:00
Masatake Iwasaki
b6718c754a HADOOP-16401. ABFS: port Azure doc to 3.2 branch.
Signed-off-by: Masatake Iwasaki <iwasakims@apache.org>
2019-07-10 17:16:43 +09:00
Takanobu Asanuma
6dffad028e HDFS-12564. Add the documents of swebhdfs configurations on the client side. Contributed by Takanobu Asanuma.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit 98d20656433cdec76c2108d24ff3b935657c1e80)
2019-06-20 20:17:45 -07:00
DadanielZ
9c8e40fbdb
HADOOP-16251. ABFS: add FSMainOperationsBaseTest. Re-commit to fix git metadata.
Author: Da Zhou
(cherry picked from commit ff27e8eabded6b1de9860da95155b304c07b4c6e)
2019-06-07 18:09:38 +01:00
Da Zhou
bf0bb2470f
HADOOP-16242. ABFS: add bufferpool to AbfsOutputStream.
Contributed by Da Zhou.

(cherry picked from commit 1cef194a28086991cd39fb62092d2b2105ece57b)
2019-06-07 18:09:38 +01:00
Vishwajeet Dusane
907a016142
HADOOP-16182. Update abfs storage back-end with "close" flag when application is done writing to a file.
Contributed by Vishwajeet Dusane.

(cherry picked from commit 1edf1914acb74e45f6717c703f519cb382aae173)
2019-06-07 18:09:37 +01:00
Shweta Yakkali
6b115966bc
HADOOP-16157. [Clean-up] Remove NULL check before instanceof in AzureNativeFileSystemStore
(Contributed by Shweta Yakkali via Daniel Templeton)
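
For context, the clean-up relies on a Java language guarantee: `instanceof` evaluates to false for a null reference, so a preceding null check is redundant. A small sketch, with hypothetical method names unrelated to AzureNativeFileSystemStore:

```java
final class InstanceofNullSketch {

  /** The verbose form: an explicit null check before instanceof. */
  static boolean verbose(Object o) {
    return o != null && o instanceof String;
  }

  /** The concise form: instanceof alone already rejects null. */
  static boolean concise(Object o) {
    return o instanceof String;
  }

  public static void main(String[] args) {
    Object[] samples = {null, "text", 42};
    for (Object sample : samples) {
      System.out.println(verbose(sample) == concise(sample));
    }
  }
}
```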

Change-Id: I6269ae66378e46eed440a76f847ae1af1fa95450
(cherry picked from commit bb8ad096e785f7127a5c0de15167255d9b119578)
2019-06-07 18:09:37 +01:00
Shweta Yakkali
57c6060c3a
HADOOP-15860. ABFS: Throw exception when directory / file name ends with a period (.).
Contributed by Shweta Yakkali.

(cherry picked from commit 13f0ee21f2c17ebacaa35e14ee01f39624f38a8d)

Change-Id: Ibd010d2e6adc15f53a9c5357482e57313bf84d2e
2019-06-07 18:09:37 +01:00
Da Zhou
3593b66693
HADOOP-15823. ABFS: Stop requiring client ID and tenant ID for MSI
(Contributed by Da Zhou via Daniel Templeton)

Change-Id: I546ab3a1df1efec635c08c388148e718dc4a9843
(cherry picked from commit e374584479b687e41d5379bb6d827dcae620e123)
2019-06-07 18:09:37 +01:00
Denes Gerencser
ede5cbd707
HADOOP-16174. Disable wildfly logs to the console.
Follow-on to HADOOP-15851.

Author:    Denes Gerencser <dgerencser@cloudera.com>
(cherry picked from commit ddede7ae6fbbadbe08861bc85a664b73d66f77c7)
2019-06-07 18:09:37 +01:00
Steve Loughran
96489069b0
HADOOP-15851. Disable wildfly logs to the console.
Contributed by Vishwajeet Dusane.

(cherry picked from commit ef9dc6c44c686e836bb25e31ff355cff80572d23)
2019-06-07 18:09:37 +01:00
Steve Loughran
baa8670256
HADOOP-15825. ABFS: Enable some tests for namespace not enabled account using OAuth.
Contributed by Da Zhou.

(cherry picked from commit bd50fa956b1ca25bb2136977b98a6aa6895eff8b)
2019-06-07 18:09:37 +01:00
Takanobu Asanuma
a9a3450560 HADOOP-16331. Fix ASF License check in pom.xml. Contributed by Akira Ajisaka.
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2019-05-29 17:34:16 +09:00
Akira Ajisaka
855dc997d6
HADOOP-16323. https everywhere in Maven settings. 2019-05-27 15:27:33 +09:00
Andrew Olson
55603529d0
HADOOP-16294: Enable access to input options by DistCp subclasses.
Adding a protected-scope getter for the DistCpOptions, so that a subclass does
not need to save its own copy of the inputOptions supplied to its constructor,
if it wishes to override the createInputFileListing method with logic similar
to the original implementation, i.e. calling CopyListing#buildListing with a path and input options.
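
The pattern described above can be sketched with stand-in classes. Everything here is hypothetical: `Options`, `BaseCopier`, `getInputOptions()` and the static `buildListing()` only mimic the roles of DistCpOptions, DistCp, the new getter and CopyListing#buildListing, and are not the real signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DistCpGetterSketch {

  /** Stand-in for DistCpOptions: just a list of source names. */
  static final class Options {
    final List<String> sources;

    Options(List<String> sources) {
      this.sources = sources;
    }
  }

  /** Stand-in for the base class that owns the input options. */
  static class BaseCopier {
    private final Options inputOptions;

    BaseCopier(Options inputOptions) {
      this.inputOptions = inputOptions;
    }

    /** Protected getter (hypothetical name) so subclasses need no copy of their own. */
    protected Options getInputOptions() {
      return inputOptions;
    }

    protected List<String> createInputFileListing() {
      return buildListing("/listing", inputOptions);
    }

    /** Stand-in for building a listing from a path and the input options. */
    static List<String> buildListing(String listingPath, Options options) {
      List<String> entries = new ArrayList<>();
      for (String source : options.sources) {
        entries.add(listingPath + ":" + source);
      }
      return entries;
    }
  }

  /** Subclass that reuses the parent's options through the getter. */
  static final class FilteringCopier extends BaseCopier {
    FilteringCopier(Options options) {
      super(options);
    }

    @Override
    protected List<String> createInputFileListing() {
      List<String> entries = buildListing("/custom-listing", getInputOptions());
      entries.removeIf(entry -> entry.endsWith(".tmp"));
      return entries;
    }
  }

  static List<String> demo() {
    Options options = new Options(Arrays.asList("a.txt", "b.tmp"));
    return new FilteringCopier(options).createInputFileListing();
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```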

Author:    Andrew Olson
(cherry picked from commit c15b3bca86a0f973ccdddd020f3ff2d5767ff1bd)
2019-05-16 16:13:12 +02:00
Weiwei Yang
26eb9f52fb HADOOP-16306. AliyunOSS: Remove temporary files when upload small files to OSS. Contributed by wujinhu.
(cherry picked from commit 2d8282bb8248e6984878626c4cdc7148aa2e7202)
2019-05-14 14:06:42 -07:00
Rajat Khandelwal
12e0053932
HADOOP-16278. With S3A Filesystem, Long Running services End up Doing lot of GC and eventually die.
Contributed by Rajat Khandelwal

(cherry picked from commit 591ca698230f25217c10c7549aff8097baa11f1e)
2019-05-09 21:14:37 +01:00
Akira Ajisaka
df5d8f05d9
HADOOP-16227. Upgrade checkstyle to 8.19
(cherry picked from commit 4b4fef2f0e0ed1e185ea1058db7a65d68d4970b9)
2019-04-15 10:47:02 +09:00
Masatake Iwasaki
03079be707 HADOOP-14544. DistCp documentation for command line options is misaligned. Contributed by Masatake Iwasaki.
(cherry picked from commit bbdbc7a9a158f36955c2253acb0edb14219ccb04)
2019-04-12 11:59:14 +09:00
Steve Loughran
b6ebe74526
HADOOP-16233. S3AFileStatus to declare that isEncrypted() is always true (#685)
This is needed to fix up some confusion about how job.addCache() handles S3A paths and all their parent dirs: the files are downloaded by the NM without using the DTs of the user submitting the job. This means that when you submit jobs to an EC2 cluster with lower IAM permissions than the user, cached resources don't get downloaded and the job doesn't start.

Production code changes:
* S3AFileStatus Adds "true" to the superclass's encrypted flag during construction.

Tests
* Base AbstractContractOpenTest can control whether zero byte files created in tests are encrypted. Not done via an XML attribute, just a subclass point. Thoughts?
* Verify that the filecache considers paths to not have the permissions which trigger reduced-privilege downloads
* And extend ITestDelegatedMRJob to test a completely different bucket (open street map), to verify that cached resources do get their tokens picked up

Docs:
* Advise FS developers to say all files are encrypted. It's otherwise harmless and it'll stop other people seeing impossible to debug error messages on app launch.

Contributed by Steve Loughran.

Change-Id: Ifaae4c9d735ccc5eafeebd2584b65daf2d4e5da3
(cherry picked from commit 366186d9990ef9059b6ac9a19ad24310d6f36d04)
2019-04-03 21:35:19 +01:00
Akira Ajisaka
80a8d3310e
HADOOP-16232. Fix errors in the checkstyle configuration xmls. Contributed by Wanqiang Ji.
(cherry picked from commit 8b6deebb1dda49e5e35180ed5c5fb5b5221c1516)
2019-04-03 19:36:17 +09:00
Steve Loughran
60c9042286
HADOOP-16058. S3A tests to include Terasort.
Contributed by Steve Loughran.

This includes
 - HADOOP-15890. Some S3A committer tests don't match ITest* pattern; don't run in maven
 - MAPREDUCE-7090. BigMapOutput example doesn't work with paths off cluster fs
 - MAPREDUCE-7091. Terasort on S3A to switch to new committers
 - MAPREDUCE-7092. MR examples to work better against cloud stores
2019-03-29 15:25:45 +00:00