1133 Commits

Author SHA1 Message Date
Masatake Iwasaki 83c4f8b9a0 HADOOP-16739. Fix native build failure of hadoop-pipes on CentOS 8. 2020-04-24 15:38:11 +09:00
Weiwei Yang 5fca921fe3 HADOOP-16840. AliyunOSS: getFileStatus throws FileNotFoundException in versioning bucket. Contributed by wujinhu.
(cherry picked from commit 6dfe00c71e)
2020-03-08 21:14:41 -07:00
Mukund Thakur 3937abddbd HDFS-13660. DistCp job fails when new data is appended in the file while the DistCp copy job is running
This uses the length of the file known at the start of the copy to determine the amount of data to copy.

* If a file is appended to during the copy, the original bytes are copied.
* If a file is truncated during a copy, or the attempt to read the data fails with a truncated stream,
  distcp will now fail. Until now these failures were not detected.

Contributed by Mukund Thakur.

Change-Id: I576a49d951fa48d37a45a7e4c82c47488aa8e884
(cherry picked from commit 51c64b357d)
2020-02-27 16:37:03 -08:00
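
The behaviour described in HDFS-13660 above (copy only the length known at the start, and fail on truncation) can be sketched roughly as follows. This is an illustrative sketch using plain `java.io` streams, not the DistCp implementation; the class `BoundedCopy` and the method `copyBounded` are hypothetical names introduced only for this example.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BoundedCopy {

    /**
     * Hypothetical helper: copies exactly expectedLength bytes from in to out.
     * Bytes appended to the source after the length was recorded are ignored;
     * a stream that ends early is reported as a failure.
     */
    static long copyBounded(InputStream in, OutputStream out, long expectedLength)
            throws IOException {
        byte[] buffer = new byte[8192];
        long remaining = expectedLength;
        while (remaining > 0) {
            // Never ask for more than the bytes still owed.
            int toRead = (int) Math.min(buffer.length, remaining);
            int n = in.read(buffer, 0, toRead);
            if (n < 0) {
                // Source is shorter than the length recorded at the start.
                throw new EOFException("Source truncated: " + remaining
                        + " of " + expectedLength + " bytes missing");
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return expectedLength;
    }
}
```

With a source that has grown since its length was recorded, only the original bytes are written; with a source that has shrunk, the call throws instead of silently producing a short copy.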
Akira Ajisaka d4f75e2798
HADOOP-16808. Use forkCount and reuseForks parameters instead of forkMode in the config of maven surefire plugin. Contributed by Xieming Li.
(cherry picked from commit f6d20daf40)
2020-01-21 18:03:56 +09:00
Steve Loughran 429d5db3d9
HADOOP-16785. followup to abfs close() fix.
Adds one extra test to the ABFS close logic, to explicitly
verify that the close sequence of FilterOutputStream is
not going to fail.

This is just a due-diligence patch, but it helps ensure
that no regressions creep in in future.

Contributed by Steve Loughran.

Change-Id: Ifd33a8c322d32513411405b15f50a1aebcfa6e48
2020-01-20 16:26:33 +00:00
Steve Loughran e21cb8f96e HADOOP-16785. Improve wasb and abfs resilience on double close() calls.
This hardens the wasb and abfs output streams' resilience to being invoked
in/after close().

wasb:
  Explicitly raise IOEs on operations invoked after close,
  rather than implicitly raising NPEs.
  This ensures that invocations which catch and swallow IOEs will perform as
  expected.

abfs:
  When rethrowing an IOException in the close() call, explicitly wrap it
  with a new instance of the same subclass.
  This is needed to handle failures in try-with-resources clauses, where
  any exception in close() is added as a suppressed exception to the one
  thrown in the try {} clause
  *and you cannot attach the same exception to itself*

Contributed by Steve Loughran.

Change-Id: Ic44b494ff5da332b47d6c198ceb67b965d34dd1b
2020-01-08 12:04:11 +00:00
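
The try-with-resources problem described in the abfs part of HADOOP-16785 above can be reproduced with a small standalone sketch. Everything here is hypothetical and written for illustration (`SelfSuppressionDemo`, `FailingStream`, `run`); it is not the ABFS output stream code, and it wraps in a plain `IOException` rather than "a new instance of the same subclass" to keep the example short.

```java
import java.io.IOException;

public class SelfSuppressionDemo {

    /** Hypothetical stream that fails on write() and rethrows that failure in close(). */
    static class FailingStream implements AutoCloseable {
        private final IOException lastError;
        private final boolean wrap;

        FailingStream(IOException lastError, boolean wrap) {
            this.lastError = lastError;
            this.wrap = wrap;
        }

        void write() throws IOException {
            throw lastError;
        }

        @Override
        public void close() throws IOException {
            if (wrap) {
                // New instance: safe to attach as a suppressed exception.
                throw new IOException(lastError.getMessage(), lastError);
            }
            // Same instance as the one thrown from the try block.
            throw lastError;
        }
    }

    /** Returns whatever escapes the try-with-resources statement. */
    static Throwable run(boolean wrap) {
        IOException failure = new IOException("write failed");
        try (FailingStream s = new FailingStream(failure, wrap)) {
            s.write();
            return null;
        } catch (Throwable t) {
            return t;
        }
    }

    public static void main(String[] args) {
        System.out.println("unwrapped: " + run(false));
        System.out.println("wrapped:   " + run(true));
    }
}
```

Without wrapping, the runtime tries to add the exception to itself as a suppressed exception, and `Throwable.addSuppressed` rejects that with an `IllegalArgumentException`, which hides the original `IOException`. With wrapping, the caller sees the original `IOException` with the wrapper recorded as suppressed.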
Steve Loughran 5410732cff
HADOOP-16775. DistCp reuses the same temp file within the task for different files.
Contributed by Amir Shenavandeh.

This avoids overwrite consistency issues with S3 and other stores, though
given S3's copy operation is O(data), you are still best off using -direct
when distcp-ing to it.

Change-Id: I8dc9f048ad0cc57ff01543b849da1ce4eaadf8c3
2020-01-02 15:37:55 +00:00
Akira Ajisaka 7201384e1b
HADOOP-16771. Update checkstyle to 8.26 and maven-checkstyle-plugin to 3.1.0. Contributed by Andras Bokor.
(cherry picked from commit f777cd398f)
2019-12-20 13:12:02 +09:00
Mingliang Liu d19981fe48
HADOOP-16758. Refine testing.md to tell user better how to use auth-keys.xml (#1753)
Contributed by Mingliang Liu
2019-12-11 11:54:12 -08:00
Sneha Vijayarajan aa9cd0a2d6
HADOOP-16660. ABFS: Make RetryCount in ExponentialRetryPolicy Configurable.
Contributed by Sneha Vijayarajan.
2019-12-08 21:32:13 -08:00
bilaharith c225efe237
HADOOP-16455. ABFS: Implement FileSystem.access() method.
Contributed by Bilahari T H.
2019-12-08 21:32:02 -08:00
Jeetesh Mangwani b1e748f45b
HADOOP-16612. Track Azure Blob File System client-perceived latency
Contributed by Jeetesh Mangwani.

This adds the ability to track the end-to-end performance of ADLS Gen 2 REST APIs by measuring latency in the Hadoop ABFS driver.
The latency information is sent back to the ADLS Gen 2 REST API endpoints in the subsequent requests.
2019-12-08 21:31:51 -08:00
bilaharith ffeb6d8ece
HADOOP-16587. Make ABFS AAD endpoints configurable.
Contributed by Bilahari T H.

This also addresses HADOOP-16498: AzureADAuthenticator cannot authenticate
in China.

Change-Id: I2441dd48b50b59b912b0242f7f5a4418cf94a87c
2019-12-08 21:31:39 -08:00
Sneha Vijayarajan 8b2c7e0c4d
HADOOP-16578 : Avoid FileSystem API calls when FileSystem already exists 2019-12-08 21:31:24 -08:00
Sneha Vijayarajan 546db6428e
HADOOP-16548 : Disable Flush() over config 2019-12-08 21:31:08 -08:00
Mingliang Liu 8a60429e0b
HADOOP-16735. Make it clearer in config default that EnvironmentVariableCredentialsProvider supports AWS_SESSION_TOKEN. Contributed by Mingliang Liu
This closes #1733
2019-12-05 17:50:28 -08:00
Szilard Nemeth 62622ab9c1 YARN-9836. General usability improvements in showSimulationTrace.html. Contributed by Adam Antal 2019-11-19 21:21:17 +01:00
Andras Bokor 89e95370a4 HADOOP-16710. Testing_azure.md documentation is misleading.
Contributed by Andras Bokor.

Change-Id: Icf07a53145936953629c7dace2e9648b7b21588d
2019-11-17 17:06:10 +00:00
Siyao Meng e0cf1735e1 HADOOP-16676. Backport HADOOP-16152 to branch-3.2. Contributed by Siyao Meng.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-11-12 11:38:42 -08:00
Da Zhou fe96407451
HADOOP-16640. WASB: Override getCanonicalServiceName() to return URI
(cherry picked from commit 9a8edb0aed)
2019-10-16 14:27:11 -07:00
Rohith Sharma K S 7d5bb2ebb7 Preparing for 3.2.2-SNAPSHOT development. 2019-09-07 08:52:08 +05:30
bilaharith 3b3c0c4b87 HADOOP-16479. ABFS FileStatus.getModificationTime returns localized time instead of UTC.
Contributed by Bilahari T H

Change-Id: I532055baaadfd7c324710e4b25f60cdf0378bdc0
2019-08-27 19:08:38 +00:00
Robert Levas ce23e971b4 HADOOP-16340. ABFS driver continues to retry on IOException responses from REST operations.
Contributed by Robert Levas.

This makes the HttpException constructor protected rather than public, so it is possible
to implement custom subclasses of this exception: exceptions which will not be retried.

Change-Id: Ie8aaa23a707233c2db35948784908b6778ff3a8f
2019-08-27 19:08:29 +00:00
Da Zhou a6d50a9054 HADOOP-16376. ABFS: Override access() to no-op.
Contributed by Da Zhou.

Change-Id: Ia0024bba32250189a87eb6247808b2473c331ed0
2019-08-27 19:04:16 +00:00
Da Zhou dd636127e9 HADOOP-16269. ABFS: add listFileStatus with StartFrom.
Author:    Da Zhou
2019-08-27 19:01:21 +00:00
Da Zhou 006ae258b3 HADOOP-16163. NPE in setup/teardown of ITestAbfsDelegationTokens.
Contributed by Da Zhou.

Signed-off-by: Steve Loughran <stevel@apache.org>
2019-08-27 19:01:21 +00:00
Akira Ajisaka afb3f329fd
YARN-9774. Fix order of arguments for assertEquals in TestSLSUtils. Contributed by Nikhil Navadiya.
(cherry picked from commit 84b1982060)
2019-08-23 14:40:15 +09:00
bibinchundatt 69255fa1b9 YARN-9765. SLS runner crashes when run with metrics turned off. Contributed by Abhishek Modi.
(cherry picked from commit 10ec31d20e)
2019-08-21 13:57:53 +05:30
KAI XIE b3c14d4132 HADOOP-16158. DistCp to support checksum validation when copy blocks in parallel (#919)
* DistCp to support checksum validation when copy blocks in parallel

* address review comments

* add checksums comparison test for combine mode

(cherry picked from commit c765584eb2)
2019-08-18 18:48:21 -07:00
Da Zhou 330e450397
HADOOP-16315. ABFS: transform full UPN for named user in AclStatus
Contributed by Da Zhou

Change-Id: Ibc78322415fcbeff89c06c8586c53f5695550290
2019-08-12 09:41:52 +08:00
Ayush Saxena 35ff1ce42c HADOOP-16440. Distcp can not preserve timestamp with -delete option. Contributed by ludun. 2019-07-20 13:29:45 +05:30
Arun Singh 5f2d07af1b
HADOOP-16404. ABFS default blocksize change (256MB from 512MB)
Contributed by: Arun Singh
2019-07-19 20:34:28 -07:00
Masatake Iwasaki b6718c754a HADOOP-16401. ABFS: port Azure doc to 3.2 branch.
Signed-off-by: Masatake Iwasaki <iwasakims@apache.org>
2019-07-10 17:16:43 +09:00
Takanobu Asanuma 6dffad028e HDFS-12564. Add the documents of swebhdfs configurations on the client side. Contributed by Takanobu Asanuma.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit 98d2065643)
2019-06-20 20:17:45 -07:00
DadanielZ 9c8e40fbdb
HADOOP-16251. ABFS: add FSMainOperationsBaseTest. Re-commit to fix git metadata.
Author: Da Zhou
(cherry picked from commit ff27e8eabd)
2019-06-07 18:09:38 +01:00
Da Zhou bf0bb2470f
HADOOP-16242. ABFS: add bufferpool to AbfsOutputStream.
Contributed by Da Zhou.

(cherry picked from commit 1cef194a28)
2019-06-07 18:09:38 +01:00
Vishwajeet Dusane 907a016142
HADOOP-16182. Update abfs storage back-end with "close" flag when application is done writing to a file.
Contributed by Vishwajeet Dusane.

(cherry picked from commit 1edf1914ac)
2019-06-07 18:09:37 +01:00
Shweta Yakkali 6b115966bc
HADOOP-16157. [Clean-up] Remove NULL check before instanceof in AzureNativeFileSystemStore
(Contributed by Shweta Yakkali via Daniel Templeton)

Change-Id: I6269ae66378e46eed440a76f847ae1af1fa95450
(cherry picked from commit bb8ad096e7)
2019-06-07 18:09:37 +01:00
Shweta Yakkali 57c6060c3a
HADOOP-15860. ABFS: Throw exception when directory / file name ends with a period (.).
Contributed by Shweta Yakkali.

(cherry picked from commit 13f0ee21f2)

Change-Id: Ibd010d2e6adc15f53a9c5357482e57313bf84d2e
2019-06-07 18:09:37 +01:00
Da Zhou 3593b66693
HADOOP-15823. ABFS: Stop requiring client ID and tenant ID for MSI
(Contributed by Da Zhou via Daniel Templeton)

Change-Id: I546ab3a1df1efec635c08c388148e718dc4a9843
(cherry picked from commit e374584479)
2019-06-07 18:09:37 +01:00
Denes Gerencser ede5cbd707
HADOOP-16174. Disable wildfly logs to the console.
Follow-on to HADOOP-15851.

Author:    Denes Gerencser <dgerencser@cloudera.com>
(cherry picked from commit ddede7ae6f)
2019-06-07 18:09:37 +01:00
Steve Loughran 96489069b0
HADOOP-15851. Disable wildfly logs to the console.
Contributed by Vishwajeet Dusane.

(cherry picked from commit ef9dc6c44c)
2019-06-07 18:09:37 +01:00
Steve Loughran baa8670256
HADOOP-15825. ABFS: Enable some tests for namespace not enabled account using OAuth.
Contributed by Da Zhou.

(cherry picked from commit bd50fa956b)
2019-06-07 18:09:37 +01:00
Takanobu Asanuma a9a3450560 HADOOP-16331. Fix ASF License check in pom.xml. Contributed by Akira Ajisaka.
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2019-05-29 17:34:16 +09:00
Akira Ajisaka 855dc997d6
HADOOP-16323. https everywhere in Maven settings. 2019-05-27 15:27:33 +09:00
Andrew Olson 55603529d0
HADOOP-16294: Enable access to input options by DistCp subclasses.
Adding a protected-scope getter for the DistCpOptions, so that a subclass does
not need to save its own copy of the inputOptions supplied to its constructor,
if it wishes to override the createInputFileListing method with logic similar
to the original implementation, i.e. calling CopyListing#buildListing with a path and input options.

Author:    Andrew Olson
(cherry picked from commit c15b3bca86)
2019-05-16 16:13:12 +02:00
Weiwei Yang 26eb9f52fb HADOOP-16306. AliyunOSS: Remove temporary files when upload small files to OSS. Contributed by wujinhu.
(cherry picked from commit 2d8282bb82)
2019-05-14 14:06:42 -07:00
Rajat Khandelwal 12e0053932
HADOOP-16278. With S3A Filesystem, Long Running services End up Doing lot of GC and eventually die.
Contributed by Rajat Khandelwal

(cherry picked from commit 591ca69823)
2019-05-09 21:14:37 +01:00
Akira Ajisaka df5d8f05d9
HADOOP-16227. Upgrade checkstyle to 8.19
(cherry picked from commit 4b4fef2f0e)
2019-04-15 10:47:02 +09:00
Masatake Iwasaki 03079be707 HADOOP-14544. DistCp documentation for command line options is misaligned. Contributed by Masatake Iwasaki.
(cherry picked from commit bbdbc7a9a1)
2019-04-12 11:59:14 +09:00