24014 Commits

Author SHA1 Message Date
Sneha Vijayarajan d166420302
HADOOP-17215: Support for conditional overwrite.
Contributed by Sneha Vijayarajan

DETAILS:

    This change adds config key "fs.azure.enable.conditional.create.overwrite" with
    a default of true.  When enabled, if create(path, overwrite: true) is invoked
    and the file exists, the ABFS driver will first obtain its etag and then attempt
    to overwrite the file on the condition that the etag matches. The purpose of this
    is to mitigate the non-idempotency of this method.  Specifically, in the event of
    a network error or similar, the client will retry and this can result in the file
    being created more than once, which may result in data loss.  In essence this is
    like a poor man's file handle, and will be addressed more thoroughly in the future
    when support for lease is added to ABFS.

TEST RESULTS:

    namespace.enabled=true
    auth.type=SharedKey
    -------------------
    $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
    Tests run: 87, Failures: 0, Errors: 0, Skipped: 0
    Tests run: 457, Failures: 0, Errors: 0, Skipped: 42
    Tests run: 207, Failures: 0, Errors: 0, Skipped: 24

    namespace.enabled=true
    auth.type=OAuth
    -------------------
    $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
    Tests run: 87, Failures: 0, Errors: 0, Skipped: 0
    Tests run: 457, Failures: 0, Errors: 0, Skipped: 74
    Tests run: 207, Failures: 0, Errors: 0, Skipped: 140
2020-10-14 22:29:13 +00:00
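The conditional overwrite described in HADOOP-17215 above can be illustrated with a minimal sketch. This is not the ABFS driver code: the `Store` class is a hypothetical in-memory stand-in for the remote service, and the method names `getEtag`, `createIfAbsent`, `overwriteIfMatch`, and `createWithOverwrite` are assumptions made for illustration only.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class ConditionalOverwriteSketch {

    /** Hypothetical in-memory store; maps each path to its current etag. */
    public static final class Store {
        private final Map<String, String> etags = new ConcurrentHashMap<>();
        private final AtomicLong counter = new AtomicLong();

        public Optional<String> getEtag(String path) {
            return Optional.ofNullable(etags.get(path));
        }

        /** Creates the file only if it is absent; empty result means it already exists. */
        public Optional<String> createIfAbsent(String path) {
            String etag = "etag-" + counter.incrementAndGet();
            return etags.putIfAbsent(path, etag) == null ? Optional.of(etag) : Optional.empty();
        }

        /** Overwrites only if the stored etag equals expectedEtag; empty result means mismatch. */
        public Optional<String> overwriteIfMatch(String path, String expectedEtag) {
            String etag = "etag-" + counter.incrementAndGet();
            return etags.replace(path, expectedEtag, etag) ? Optional.of(etag) : Optional.empty();
        }
    }

    /** create(path, overwrite: true): read the etag first, then overwrite conditionally. */
    public static Optional<String> createWithOverwrite(Store store, String path) {
        Optional<String> current = store.getEtag(path);
        if (!current.isPresent()) {
            return store.createIfAbsent(path);
        }
        // Fails (empty result) if another writer changed the file after the etag was read.
        return store.overwriteIfMatch(path, current.get());
    }

    public static void main(String[] args) {
        Store store = new Store();
        System.out.println("first create:  " + createWithOverwrite(store, "/data/part-0"));
        System.out.println("overwrite:     " + createWithOverwrite(store, "/data/part-0"));
        System.out.println("stale attempt: " + store.overwriteIfMatch("/data/part-0", "etag-1"));
    }
}
```

In this sketch a retried request that still carries the old etag is rejected instead of silently replacing a file written in the meantime, which is the mitigation the commit message describes.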
bilaharith f208da286c
HADOOP-17166. ABFS: configure output stream thread pool (#2179)
Adds the options to control the size of the per-output-stream threadpool
when writing data through the abfs connector

* fs.azure.write.max.concurrent.requests
* fs.azure.write.max.requests.to.queue

Contributed by Bilahari T H
2020-10-14 22:29:13 +00:00
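As a rough illustration of what the two HADOOP-17166 options above control, the sketch below builds a bounded thread pool from a concurrency limit and a queue limit. It does not reproduce the ABFS connector's implementation: the class name, the `newWritePool` helper, the keep-alive value, the caller-runs rejection policy, and the numbers 4 and 8 are all assumptions chosen for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedWritePoolSketch {

    /**
     * Hypothetical helper: maxConcurrentRequests plays the role of
     * fs.azure.write.max.concurrent.requests and maxRequestsToQueue the role of
     * fs.azure.write.max.requests.to.queue.
     */
    public static ThreadPoolExecutor newWritePool(int maxConcurrentRequests, int maxRequestsToQueue) {
        return new ThreadPoolExecutor(
                maxConcurrentRequests,                      // core threads
                maxConcurrentRequests,                      // maximum threads
                10L, TimeUnit.SECONDS,                      // idle keep-alive (assumed value)
                new ArrayBlockingQueue<>(maxRequestsToQueue),
                // Assumed policy: when the queue is full, the submitting thread runs the task.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newWritePool(4, 8);
        AtomicInteger uploaded = new AtomicInteger();
        for (int i = 0; i < 20; i++) {
            pool.execute(uploaded::incrementAndGet);        // stand-in for one block upload
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("blocks uploaded: " + uploaded.get());
    }
}
```

Bounding both the thread count and the queue length caps the memory held by pending write buffers for each output stream.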
bilaharith cc7350302f
HADOOP-16915. ABFS: Ignoring the test ITestAzureBlobFileSystemRandomRead.testRandomReadPerformance
- Contributed by Bilahari T H
2020-10-14 22:29:13 +00:00
Sneha Vijayarajan 4072323de4
Upgrade store REST API version to 2019-12-12
- Contributed by Sneha Vijayarajan
2020-10-14 22:29:13 +00:00
bilaharith e481d0108a
HADOOP-17149. ABFS: Fixing the testcase ITestGetNameSpaceEnabled
- Contributed by Bilahari T H
2020-10-14 22:29:13 +00:00
bilaharith f73c90f0b0
HADOOP-17163. ABFS: Adding debug log for rename failures
- Contributed by Bilahari T H
2020-10-14 22:29:13 +00:00
bilaharith fbf151ef6f
HADOOP-17137. ABFS: Makes the test cases in ITestAbfsNetworkStatistics agnostic
- Contributed by Bilahari T H
2020-10-14 22:29:13 +00:00
Kihwal Lee 41a3c9bc95
HDFS-15628. HttpFS server throws NPE if a file is a symlink. Contributed by Ahmed Hussein.
(cherry picked from commit e45407128d)
2020-10-14 17:28:02 -05:00
Mukund Thakur 351b1498d3
HDFS-15626. TestWebHDFS.testLargeDirectory failing (#2380)
Fixes the regression caused by HADOOP-17281, where the WebHDFS client
listStatusIterator (correctly) throws NoSuchElementException when next()
runs out of values.

Contributed by Mukund Thakur.

Change-Id: I6cec41c20467920cf21f169653553535414b2680
2020-10-13 13:32:35 +01:00
Ayush Saxena 2a043b987c
HDFS-14811. RBF: TestRouterRpc#testErasureCoding is flaky. Contributed by Chen Zhang.
(cherry picked from commit 7a6265ac42)
2020-10-13 18:29:54 +09:00
Pranav Bheda 054dba68c2
HADOOP-17223 update org.apache.httpcomponents:httpclient to 4.5.13 and httpcore to 4.4.13 (#2242)
* update org.apache.httpcomponents:httpclient from 4.5.6 to 4.5.13
* update org.apache.httpcomponents:httpcore from 4.4.10 to 4.4.13

(cherry picked from commit be3edd0532)
2020-10-13 17:56:38 +09:00
Akira Ajisaka 2e73871cab
HDFS-15620. RBF: Fix test failures after HADOOP-17281 (#2375)
(cherry picked from commit 69ef9b1ee8)
2020-10-13 17:43:29 +09:00
Konstantin V Shvachko b6423d2780
HDFS-15567. [SBN Read] HDFS should expose msync() API to allow downstream applications to call it explicitly. Contributed by Konstantin V Shvachko.
(cherry picked from commit b3786d6c3c)
2020-10-12 17:38:42 -07:00
Akira Ajisaka 800b1ed1c2
Addendum to HADOOP-16990. Update Mockserver. Contributed by Akira Ajisaka.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2020-10-12 11:59:17 -07:00
Swaroopa Kadam 05a73ded93
MAPREDUCE-7301: Expose Mini MR Cluster attribute for testing
Signed-off-by: Mingliang Liu <liuml07@apache.org>
2020-10-12 11:09:08 -07:00
Dongjoon Hyun 5032f8abba
HADOOP-17258. Magic S3Guard Committer to overwrite existing pendingSet file on task commit (#2371)
Contributed by Dongjoon Hyun and Steve Loughran

Change-Id: Ibaf8082e60eff5298ff4e6513edc386c5bae0274
2020-10-12 13:42:08 +01:00
Steve Loughran 7cf5bdeec0
Revert "HDFS-15620. RBF: Fix test failures after HADOOP-17281 (#2375)"
This reverts commit 263b7d5dfc.
2020-10-12 10:45:18 +01:00
Akira Ajisaka 263b7d5dfc
HDFS-15620. RBF: Fix test failures after HADOOP-17281 (#2375)
2020-10-12 10:43:26 +01:00
Doroszlai, Attila 13e0c5f6e0
HADOOP-16990. Update Mockserver. Contributed by Attila Doroszlai.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2020-10-08 23:44:56 -07:00
Jim Brennan 76e223a320
YARN-10455. TestNMProxy.testNMProxyRPCRetry is not consistent. Contributed by Ahmed Hussein
(cherry picked from commit deb35a32ba)
2020-10-08 18:59:25 +00:00
Steve Loughran 963793dd48
HADOOP-17293. S3A to always probe S3 in S3A getFileStatus on non-auth paths
This reverts changes in HADOOP-13230 to use S3Guard TTL in choosing when
to issue a HEAD request; fixing tests to compensate.

New org.apache.hadoop.fs.s3a.performance.OperationCost cost,
S3GUARD_NONAUTH_FILE_STATUS_PROBE for use in cost tests.

Contributed by Steve Loughran.

Change-Id: I418d55d2d2562a48b2a14ec7dee369db49b4e29e
2020-10-08 15:38:32 +01:00
Jinglun 44ff4c1058
HADOOP-17021. Add concat fs command (#1993)
Contributed by Jinglun

Change-Id: Ia10ad2205ed0f3594c391ee78f7df4c3c31c796d
2020-10-08 10:36:40 +01:00
Mukund Thakur 475dba1ddf
HADOOP-17281 Implement FileSystem.listStatusIterator() in S3AFileSystem (#2354)
Contains HADOOP-17300: FileSystem.DirListingIterator.next() call should
throw NoSuchElementException

Contributed by Mukund Thakur

Change-Id: I4e7e5c6e295525db9e2de6f416f32bbb81e146d3
2020-10-07 14:00:23 +01:00
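The iterator contract referred to in HADOOP-17300 above (and relied on by the HDFS-15626 fix) is that `next()` throws `NoSuchElementException` once the listing is exhausted. A minimal sketch follows, using `java.util.Iterator` as a stand-in for Hadoop's own iterator interface; the class names are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ListingIteratorSketch {

    /** Hypothetical listing iterator over an in-memory list of entries. */
    public static final class ListingIterator<T> implements Iterator<T> {
        private final List<T> entries;
        private int position;

        public ListingIterator(List<T> entries) {
            this.entries = entries;
        }

        @Override
        public boolean hasNext() {
            return position < entries.size();
        }

        @Override
        public T next() {
            if (!hasNext()) {
                // Signal exhaustion explicitly rather than returning null
                // or leaking an index-out-of-bounds error.
                throw new NoSuchElementException("No more entries in listing");
            }
            return entries.get(position++);
        }
    }

    public static void main(String[] args) {
        Iterator<String> it = new ListingIterator<>(Arrays.asList("/dir/a", "/dir/b"));
        while (it.hasNext()) {
            System.out.println(it.next());
        }
        try {
            it.next();
        } catch (NoSuchElementException e) {
            System.out.println("exhausted: " + e.getMessage());
        }
    }
}
```

Callers that loop on `hasNext()` never see the exception; code that calls `next()` unconditionally must be prepared to catch it.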
Akira Ajisaka 89314a7bae
HDFS-15613. RBF: Router FSCK fails after HDFS-14442. (#2360)
* Support getHAServiceState in DFSRouter

(cherry picked from commit 074f0d46af)
2020-10-07 13:40:31 +09:00
Jim Brennan c789e944b7
YARN-10451. RM (v1) UI NodesPage can NPE when yarn.io/gpu resource type is defined. Contributed by Eric Payne
(cherry picked from commit b361f29dda)
2020-10-06 18:19:53 +00:00
Liang-Chi Hsieh 8f60a90688
HADOOP-17125. Use snappy-java in SnappyCodec (#2297)
This switches the SnappyCodec to use the snappy-java codec, rather than the native one.

To use the codec, snappy-java.jar (from org.xerial.snappy) needs to be on the classpath.

This comes in as an Avro dependency, so it is already on the hadoop-common classpath,
as well as in hadoop-common/lib.
The version used is now managed in the hadoop-project POM; initially 1.1.7.7

Contributed by DB Tsai and Liang-Chi Hsieh

Change-Id: Id52a404a0005480e68917cd17f0a27b7744aea4e
2020-10-06 17:15:17 +01:00
Adam Antal 3ae78e40bf
YARN-10393. MR job live lock caused by completed state container leak in heartbeat between node manager and RM. Contributed by zhenzhao wang and Jim Brennan
(cherry picked from commit a1f7e760df)
2020-10-05 10:10:46 +02:00
bilaharith d80dfad900
HADOOP-17183. ABFS: Enabling checkaccess on ABFS (#2331)
Contributed by Bilahari TH

Change-Id: If4224697deed733d6db44145994cdd85547c27d1
2020-10-01 21:29:48 +01:00
Karen Coppage 43c9959b3a
HADOOP-17267. Add debug-level logs in Filesystem.close() (#2321)
When a filesystem is closed, the FileSystem log will, at debug level,
log the method calling close/closeAll.

At trace level: the full calling stack.

Contributed by Karen Coppage.

Change-Id: I1444f065c171fd31d42b497c92ba4517969f67f0
2020-09-29 16:09:14 +01:00
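The logging behaviour described in HADOOP-17267 above can be sketched as follows. This is an illustration only: it uses `java.util.logging` (with `FINE` and `FINEST` standing in for debug and trace) rather than the logging framework Hadoop uses, and the class and helper names are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class CloseLoggingSketch implements AutoCloseable {
    private static final Logger LOG = Logger.getLogger(CloseLoggingSketch.class.getName());

    /** Returns "Class.method" of the frame that called the method which created the marker. */
    public static String describeCaller(Throwable marker) {
        StackTraceElement[] stack = marker.getStackTrace();
        // stack[0] is the frame that created the marker; stack[1] is its caller.
        if (stack.length < 2) {
            return "<unknown>";
        }
        return stack[1].getClassName() + "." + stack[1].getMethodName();
    }

    @Override
    public void close() {
        // Only pay for stack capture when the debug-equivalent level is enabled.
        if (LOG.isLoggable(Level.FINE)) {
            Throwable marker = new Throwable("close() call site");
            LOG.fine("close() called by " + describeCaller(marker));
            // Trace-equivalent level: attach the full calling stack.
            LOG.log(Level.FINEST, "Full calling stack", marker);
        }
    }

    /** Small demonstration helpers: demoCaller() is reported as the caller of callee(). */
    public static String demoCaller() {
        return callee();
    }

    private static String callee() {
        return describeCaller(new Throwable());
    }

    public static void main(String[] args) {
        System.out.println(demoCaller());
        try (CloseLoggingSketch fs = new CloseLoggingSketch()) {
            // use the resource; close() runs on exit from this block
        }
    }
}
```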
Wanqiang Ji eb6134cd22
MAPREDUCE-7289. Fix wrong comment in LongLong.java (#2338)
(cherry picked from commit 143bdd4188)
2020-09-29 23:07:58 +09:00
Eric Yang 9176e8fe5d
YARN-9809. Added node manager health status to resource manager registration call.
Contributed by Eric Badger via eyang

(cherry picked from commit e8dc862d38)
2020-09-28 16:41:53 +00:00
Hui Fei ed19f63998
HADOOP-17277. Correct spelling errors for separator (#2322)
Contributed by Hui Fei.

(cherry picked from commit 474fa80bfb)
2020-09-23 15:39:51 +09:00
Kihwal Lee 7eb91ac1b2
HDFS-15581. Access Controlled HttpFS Server. Contributed by Richard Ross.
(cherry picked from commit dfc2682213)
2020-09-22 10:55:26 -05:00
S O'Donnell 5f321df0a0
HDFS-15415. Reduce locking in Datanode DirectoryScanner. Contributed by Stephen O'Donnell
2020-09-22 12:00:02 +01:00
crossfire c3cb86ba42
HADOOP-17088. Failed to load XInclude files with relative path. (#2097)
Contributed by Yushi Hayasaka.

Change-Id: I8aad5143c34fb831bef0077f7b659643f8ae073a
2020-09-21 19:13:20 +01:00
Mukund Thakur 7e642ec5a3
HADOOP-17023. Tune S3AFileSystem.listStatus() (#2257)
S3AFileSystem.listStatus() is optimized for invocations
where the path supplied is a non-empty directory.
The number of S3 requests is significantly reduced, saving
time, money, and reducing the risk of S3 throttling.

Contributed by Mukund Thakur.

Change-Id: I7cc5f87aa16a4819e245e0fbd2aad226bd500f3f
2020-09-21 17:30:15 +01:00
zz e5e91397de
MAPREDUCE-7294. Only application master should upload resource to Yarn Shared Cache (#2223)
Contributed by Zhenzhao Wang <zhenzhaowang@gmail.com>

Signed-off-by: Mingliang Liu <liuml07@apache.org>
2020-09-19 23:26:37 -07:00
David Tucker 75bc54a66d
HADOOP-15136. Correct typos in filesystem.md (#2314)
Contributed by David Tucker

Change-Id: I130e15d4f625a5b1b30967e6cfc1684079dd1f98
2020-09-18 18:31:15 +01:00
Uma Maheswara Rao G 2ce5846bfa
HDFS-15585: ViewDFS#getDelegationToken should not throw UnsupportedOperationException. (#2312). Contributed by Uma Maheswara Rao G.
2020-09-18 15:23:35 +05:30
Uma Maheswara Rao G 1fc1b34633
HDFS-15558: ViewDistributedFileSystem#recoverLease should call super.recoverLease when there are no mounts configured (#2275) Contributed by Uma Maheswara Rao G.
2020-09-18 15:23:12 +05:30
Uma Maheswara Rao G 2d9c5395ef
HDFS-15578: Fix the rename issues with fallback fs enabled (#2305). Contributed by Uma Maheswara Rao G.
Co-authored-by: Uma Maheswara Rao G <umagangumalla@cloudera.com>
(cherry picked from commit e4cb0d3514)
2020-09-16 23:01:03 -07:00
hemanthboyina 94e5c5257f
HDFS-15574. Remove unnecessary sort of block list in DirectoryScanner. Contributed by Stephen O'Donnell.
2020-09-17 09:40:36 +05:30
Akira Ajisaka 74c0764343
HADOOP-17246. Addendum patch for branch-3.3.
2020-09-16 16:45:20 +09:00
Wanqiang Ji cda7d6ca85
HADOOP-17246. Fix build the hadoop-build Docker image failed (#2277)
(cherry picked from commit ce86183691)
2020-09-16 16:24:44 +09:00
Jim Brennan 0ec21b9667
YARN-10430. Log improvements in NodeStatusUpdaterImpl. Contributed by Bilwa S T.
(cherry picked from commit 90894ea641)
2020-09-14 21:22:02 +00:00
Akira Ajisaka 0731c8b5d0
HDFS-15555. RBF: Refresh cacheNS when SocketException occurs. (#2267)
(cherry picked from commit c78d18023d)
2020-09-14 11:36:26 +09:00
Uma Maheswara Rao G 1195dac55e
HDFS-15529: getChildFilesystems should include fallback fs as well (#2234). Contributed by Uma Maheswara Rao G.
(cherry picked from commit b3660d0147)
2020-09-12 20:48:59 -07:00
Uma Maheswara Rao G bfa145dd7c
HDFS-15532: listFiles on root/InternalDir will fail if fallback root has file. (#2298). Contributed by Uma Maheswara Rao G.
(cherry picked from commit d2779de3f5)
2020-09-12 20:43:34 -07:00
zz 2d5ca83078
HADOOP-15891. Provide Regex Based Mount Point In Inode Tree (#2185). Contributed by Zhenzhao Wang.
Co-authored-by: Zhenzhao Wang <zhenzhaowang@gmail.com>
(cherry picked from commit 12a316cdf9)
2020-09-12 20:42:06 -07:00
Mingliang Liu 4eccdd950f
HDFS-15573. Only log warning if considerLoad and considerStorageType are both true. Contributed by Stephen O'Donnell
2020-09-12 01:50:28 -07:00