23844 Commits

Author SHA1 Message Date
Szilard Nemeth
fa41e38450 YARN-10279. Avoid unnecessary QueueMappingEntity creations. Contributed by Marton Hudaky
(cherry picked from commit 6a8fd73b273629d0c7c071cf4d090f67d9b96fe4)
2020-06-25 17:28:48 +02:00
Thomas Marquardt
ee192c4826
HADOOP-17089: WASB: Update azure-storage-java SDK
Contributed by Thomas Marquardt

DETAILS: WASB depends on the Azure Storage Java SDK. There is a concurrency
bug in the Azure Storage Java SDK that can cause the results of a list blobs
operation to appear empty. This causes the Filesystem listStatus and similar
APIs to return empty results. This has been seen in Spark workloads when jobs
use more than one executor core.

See Azure/azure-storage-java#546 for details on the bug in the Azure Storage SDK.

TESTS: A new test was added to validate the fix. All tests are passing:

wasb:
mvn -T 1C -Dparallel-tests=wasb -Dscale -DtestsThreadCount=8 clean verify
Tests run: 248, Failures: 0, Errors: 0, Skipped: 11
Tests run: 651, Failures: 0, Errors: 0, Skipped: 65

abfs:
mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 64, Failures: 0, Errors: 0, Skipped: 0
Tests run: 437, Failures: 0, Errors: 0, Skipped: 33
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24
2020-06-25 05:43:32 +00:00
Szilard Nemeth
480919e42d YARN-10316. FS-CS converter: convert maxAppsDefault, maxRunningApps settings. Contributed by Peter Bacsko 2020-06-23 16:25:33 +02:00
Szilard Nemeth
8f1b70e367 YARN-9930. Support max running app logic for CapacityScheduler. Contributed by Peter Bacsko 2020-06-22 12:00:06 +02:00
Masatake Iwasaki
56d72adbdd MAPREDUCE-7281. Fix NoClassDefFoundError on 'mapred minicluster'. (#2077)
(cherry picked from commit 8fd0fdf8890b4c0cf3ea977be8fae8fa17e6599b)
2020-06-20 21:39:57 +09:00
Thomas Marquardt
63d236c019
HADOOP-17076: ABFS: Delegation SAS Generator Updates
Contributed by Thomas Marquardt.

DETAILS:
1) The authentication version in the service has been updated from Dec19 to Feb20, so the client needs to be updated to match.
2) Add support and test cases for getXAttr and setXAttr.
3) Update DelegationSASGenerator and related classes to use Duration instead of int for time periods.
4) Clean up the DelegationSASGenerator switch/case statement that maps operations to permissions.
5) Clean up SASGenerator classes to use String.equals instead of ==.
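
Item 5 matters because == on Java strings compares object references, not contents, so two strings holding the same characters can still compare unequal. A minimal sketch of the difference follows; the class, constant, and method names are hypothetical and are not taken from the SASGenerator classes:

```java
public class StringCompareSketch {
    // Hypothetical operation name, for illustration only.
    static final String READ = "read";

    // Reference comparison: true only if both variables point at the same object.
    static boolean sameByReference(String a, String b) {
        return a == b;
    }

    // Content comparison: true if both strings hold the same characters.
    static boolean sameByValue(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        // new String(...) forces a distinct object with identical contents,
        // as can happen with values parsed from a request or configuration.
        String op = new String("read");
        System.out.println(sameByReference(op, READ)); // prints "false"
        System.out.println(sameByValue(op, READ));     // prints "true"
    }
}
```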

TESTS:
Added tests for getXAttr and setXAttr.

All tests are passing against my account in eastus2euap:

 $mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
 Tests run: 76, Failures: 0, Errors: 0, Skipped: 0
 Tests run: 441, Failures: 0, Errors: 0, Skipped: 33
 Tests run: 206, Failures: 0, Errors: 0, Skipped: 24
2020-06-19 19:19:31 +00:00
bilaharith
d639c11986
HADOOP-17004. Fixing a formatting issue
Contributed by Bilahari T H.
2020-06-19 19:11:06 +00:00
bilaharith
11307f3be9
HADOOP-17004. ABFS: Improve the ABFS driver documentation
Contributed by Bilahari T H.
2020-06-19 19:10:22 +00:00
Thomas Marquardt
af98f32f7d
HADOOP-16916: ABFS: Delegation SAS generator for integration with Ranger
Contributed by Thomas Marquardt.

DETAILS:

Previously we had a SASGenerator class which generated Service SAS, but we need to add DelegationSASGenerator.
I separated SASGenerator into a base class and two subclasses ServiceSASGenerator and DelegationSASGenerator.  The
code in ServiceSASGenerator is copied from SASGenerator but the DelegationSASGenerator code is new.  The
DelegationSASGenerator code demonstrates how to use Delegation SAS with minimal permissions, as would be used
by an authorization service such as Apache Ranger.  Adding this to the tests helps us lock in this behavior.

Added a MockDelegationSASTokenProvider for testing User Delegation SAS.

Fixed the ITestAzureBlobFileSystemCheckAccess tests to assume oauth client ID so that they are ignored when that
is not configured.

To improve performance, AbfsInputStream/AbfsOutputStream re-use SAS tokens until the expiry is within 120 seconds.
After this a new SAS will be requested.  The default period of 120 seconds can be changed using the configuration
setting "fs.azure.sas.token.renew.period.for.streams".
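
The renewal rule described above can be sketched as a single check: keep using the cached token until its expiry falls within the renew period, then request a new one. This is an illustrative sketch only; the class and method names are assumptions and do not reflect the actual AbfsInputStream/AbfsOutputStream code:

```java
import java.time.Duration;
import java.time.Instant;

public class SasTokenRenewalSketch {
    // Default stated in the commit message: renew once expiry is within 120 seconds.
    static final Duration DEFAULT_RENEW_PERIOD = Duration.ofSeconds(120);

    // Returns true when a new SAS token should be requested.
    // A missing expiry is treated as "no usable token" (an assumption of this sketch).
    static boolean needsRenewal(Instant now, Instant expiry, Duration renewPeriod) {
        if (expiry == null) {
            return true;
        }
        // Renew when now + renewPeriod has reached or passed the expiry time.
        return !now.plus(renewPeriod).isBefore(expiry);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2020-06-19T19:00:00Z");
        // Ten minutes of validity left: keep re-using the token.
        System.out.println(needsRenewal(now, now.plusSeconds(600), DEFAULT_RENEW_PERIOD)); // prints "false"
        // One minute left, inside the 120 second window: request a new token.
        System.out.println(needsRenewal(now, now.plusSeconds(60), DEFAULT_RENEW_PERIOD));  // prints "true"
    }
}
```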

The SASTokenProvider operation names were updated to correspond better with the ADLS Gen2 REST API, since these
operations must be provided tokens with appropriate SAS parameters to succeed.

Support for the version 2.0 AAD authentication endpoint was added to AzureADAuthenticator.

The getFileStatus method was mistakenly calling the ADLS Gen2 Get Properties API which requires read permission
while the getFileStatus call only requires execute permission.  ADLS Gen2 Get Status API is supposed to be used
for this purpose, so the underlying AbfsClient.getPathStatus API was updated with an includeProperties
parameter which is set to false for getFileStatus and true for getXAttr.

Added SASTokenProvider support for delete recursive.

Fixed bugs in AzureBlobFileSystem where public methods were not validating the Path by calling makeQualified.  This is
necessary to avoid passing null paths and to convert relative paths into absolute paths.

Canonicalized the path used for root path internally so that root path can be used with SAS tokens, which requires
that the path in the URL and the path in the SAS token match.  Internally the code was sometimes using
"//" instead of "/" for the root path.  Also related to this, the AzureBlobFileSystemStore.getRelativePath
API was updated so that we no longer remove and then add back a leading forward slash on paths.
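
The reason "//" and "/" must not be mixed is that a SAS token is issued for one exact path, so any variation in the request URL causes a mismatch. A rough sketch of one way to canonicalize such paths is shown below; it simply collapses repeated slashes, uses hypothetical names, and is not the approach taken in AzureBlobFileSystemStore:

```java
public class RootPathSketch {
    // Collapses any run of two or more forward slashes into a single slash,
    // so "//" and "/" both yield the same root path string.
    static String canonicalize(String path) {
        return path.replaceAll("/{2,}", "/");
    }

    public static void main(String[] args) {
        System.out.println(canonicalize("//"));      // prints "/"
        System.out.println(canonicalize("/a//b"));   // prints "/a/b"
        System.out.println(canonicalize("/a/b"));    // prints "/a/b"
    }
}
```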

To run ITestAzureBlobFileSystemDelegationSAS tests follow the instructions in testing_azure.md under the heading
"To run Delegation SAS test cases".  You also need to set "fs.azure.enable.check.access" to true.

TEST RESULTS:

namespace.enabled=true
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 41
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=false
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 244
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=SharedKey
sas.token.provider.type=MockDelegationSASTokenProvider
enable.check.access=true
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 33
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=OAuth
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 1, Skipped: 74
Tests run: 206, Failures: 0, Errors: 0, Skipped: 140
2020-06-19 19:00:46 +00:00
Mehakmeet Singh
a2f44344c3
HADOOP-17018. Intermittent failing of ITestAbfsStreamStatistics in ABFS (#1990)
Contributed by: Mehakmeet Singh

In some cases, the ABFS prefetch thread runs in the background, returns some bytes from the buffer, and records an extra readOp. This makes the readOps value nondeterministic and causes intermittent test failures; readOps values of 2 or 3 are seen in different setups.
2020-06-19 19:00:04 +00:00
bilaharith
76ee7e5494
HADOOP-17002. ABFS: Adding config to determine if the account is HNS enabled or not
Contributed by Bilahari T H.
2020-06-19 18:57:47 +00:00
Stephen O'Donnell
7613191fcd HDFS-15372. Files in snapshots no longer see attribute provider permissions. Contributed by Stephen O'Donnell.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit 630ac9e7e5fdbc3ce86476cf0167255ab9b0470a)
2020-06-18 06:45:28 -07:00
Wei-Chiu Chuang
10880dc396 Revert "HDFS-15372. Files in snapshots no longer see attribute provider permissions. Contributed by Stephen O'Donnell."
This reverts commit 0b9e5ea592b66e1b370feaae9677a7b99fdbd03c.
2020-06-18 06:45:28 -07:00
S O'Donnell
3dc9db3aed HDFS-15406. Improve the speed of Datanode Block Scan. Contributed by hemanthboyina
(cherry picked from commit 123777823edc98553fcef61f1913ab6e4cd5aa9a)
2020-06-18 12:29:12 +01:00
Mehakmeet Singh
d1ba6c963d HADOOP-17020. Improve RawFileSystem Performance (#2063)
Contributed by : Mehakmeet Singh

Co-authored-by: Rajesh Balamohan
Co-authored-by: Mehakmeet Singh
2020-06-17 16:16:30 +01:00
Szilard Nemeth
ec913398a9 YARN-10281. Redundant QueuePath usage in UserGroupMappingPlacementRule and AppNameMappingPlacementRule. Contributed by Gergely Pollak 2020-06-17 14:36:08 +02:00
Vinayakumar B
c1ef247dc6
YARN-10314. YarnClient throws NoClassDefFoundError for WebSocketException with only shaded client jars (#2075) 2020-06-17 09:29:49 +05:30
Uma Maheswara Rao G
120ee793fc HDFS-15387. FSUsage#DF should consider ViewFSOverloadScheme in processPath. Contributed by Uma Maheswara Rao G.
(cherry picked from commit 785b1def959fab6b8b7ffff66410bcd240feee13)
2020-06-16 20:02:44 -07:00
Ayush Saxena
bee2846bee HDFS-15389. DFSAdmin should close filesystem and dfsadmin -setBalancerBandwidth should work with ViewFSOverloadScheme. Contributed by Ayush Saxena
(cherry picked from commit cc671b16f7b0b7c1ed7b41b96171653dc43cf670)
2020-06-16 16:54:40 -07:00
Uma Maheswara Rao G
418580446b HDFS-15330. Document the ViewFSOverloadScheme details in ViewFS guide. Contributed by Uma Maheswara Rao G.
(cherry picked from commit 76fa0222f0d2e2d92b4a1eedba8b3e38002e8c23)
2020-06-16 16:54:01 -07:00
Uma Maheswara Rao G
0b5e202614 HDFS-15321. Make DFSAdmin tool to work with ViewFileSystemOverloadScheme. Contributed by Uma Maheswara Rao G.
(cherry picked from commit ed83c865dd0b4e92f3f89f79543acc23792bb69c)
2020-06-16 16:53:38 -07:00
Uma Maheswara Rao G
8e71e85af7 HDFS-15322. Make NflyFS to work when ViewFsOverloadScheme's scheme and target uris schemes are same. Contributed by Uma Maheswara Rao G.
(cherry picked from commit 4734c77b4b64b7c6432da4cc32881aba85f94ea1)
2020-06-16 16:53:10 -07:00
Abhishek Das
5b248de42d HADOOP-17024. ListStatus on ViewFS root (ls "/") should list the linkFallBack root (configured target root). Contributed by Abhishek Das.
(cherry picked from commit ce4ec7445345eb94c6741d416814a4eac319f0a6)
2020-06-16 16:52:29 -07:00
Uma Maheswara Rao G
544996c857 HDFS-15306. Make mount-table to read from central place ( Let's say from HDFS). Contributed by Uma Maheswara Rao G.
(cherry picked from commit ac4a2e11d98827c7926a34cda27aa7bcfd3f36c1)
2020-06-16 16:50:57 -07:00
Stephen O'Donnell
0b9e5ea592 HDFS-15372. Files in snapshots no longer see attribute provider permissions. Contributed by Stephen O'Donnell.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit 730a39d1388548f22f76132a6734d61c24c3eb72)
2020-06-16 16:02:07 -07:00
Szilard Nemeth
8be302a3b8 YARN-10274. Merge QueueMapping and QueueMappingEntity. Contributed by Gergely Pollak 2020-06-16 18:25:47 +02:00
Szilard Nemeth
52efe48d79 YARN-10292. FS-CS converter: add an option to enable asynchronous scheduling in CapacityScheduler. Contributed by Benjamin Teke 2020-06-16 18:01:39 +02:00
Eric Yang
d73cdb1c86
SPNEGO TLS verification
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 81d8a887b0406380e469c76ed2e41022a6372dd7)
2020-06-15 11:12:08 +09:00
Takanobu Asanuma
3b8418250f HDFS-15403. NPE in FileIoProvider#transferToSocketFully. Contributed by hemanthboyina.
(cherry picked from commit f41a144077fc0e2d32072e0d088c1abd1897cee5)
2020-06-15 09:17:40 +09:00
Vinayakumar B
534b15caf9
HADOOP-17046. Support downstreams' existing Hadoop-rpc implementations using non-shaded protobuf classes (#2026) 2020-06-12 23:20:10 +05:30
Eric Badger
fcd7ce53b5 YARN-10312. Add support for yarn logs -logFile to retain backward compatibility.
Contributed by Jim Brennan.

(cherry picked from commit fed6fecd3a9e24efc20f9221505da35a7e1949c7)
2020-06-11 21:11:20 +00:00
Szilard Nemeth
e35f619841 YARN-10296. Make ContainerPBImpl#getId/setId synchronized. Contributed by Benjamin Teke 2020-06-10 18:00:21 +02:00
Ayush Saxena
043628dcf1 HDFS-15398. EC: hdfs client hangs due to exception during addBlock. Contributed by Hongbing Wang. 2020-06-10 12:09:34 +05:30
Eric E Payne
a7526ba9f7 YARN-10300: appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat. Contributed by Eric Badger (ebadger).
(cherry picked from commit 56247db3022705635580c4d2f8b0abde109f954f)
2020-06-09 18:51:46 +00:00
Steve Loughran
5e290e702f
HADOOP-17050. S3A to support additional token issuers
Contributed by Steve Loughran.

S3A delegation token providers will be asked for any additional
token issuers. An array can be returned, and
each issuer will be asked for tokens when DelegationTokenIssuer collects
all the tokens for a filesystem.

Change-Id: I1bd3035bbff98cbd8e1d1ac7fc615d937e6bb7bb
2020-06-09 14:43:02 +01:00
Ayush Saxena
2d4faa39e8 HDFS-15211. EC: File write hangs during close in case of Exception during updatePipeline. Contributed by Ayush Saxena.
*Added missed test file.
2020-06-09 18:47:24 +05:30
Eric Badger
890617c7ac Revert "MAPREDUCE-7277. IndexCache totalMemoryUsed differs from cache contents. Contributed by Jon Eagles (jeagles)."
This reverts commit 741fcf2c63c639edfc66834088378afd473c9ce6.
2020-06-08 20:25:02 +00:00
Mingliang Liu
fa723aa7f8
HADOOP-17047. TODO comment exist in trunk while related issue HADOOP-6223 is already fixed. Contributed by Rungroj Maipradit 2020-06-08 11:31:42 -07:00
Mingliang Liu
543075b845
HADOOP-17059. ArrayIndexOfboundsException in ViewFileSystem#listStatus. Contributed by hemanthboyina 2020-06-08 10:38:17 -07:00
Szilard Nemeth
ac307fe20d YARN-10284. Add lazy initialization of LogAggregationFileControllerFactory in LogServlet. Contributed by Adam Antal 2020-06-05 12:40:57 +02:00
Toshihiro Suzuki
ec8f3714e0 HDFS-15386. ReplicaNotFoundException keeps happening in DN after removing multiple DN's data directories (#2052)
Contributed by Toshihiro Suzuki.

(cherry picked from commit 545a0a147c5256c44911ba57b4898e01d786d836)
2020-06-05 11:17:13 +01:00
Szilard Nemeth
a266e32d82 YARN-10286. PendingContainers bugs in the scheduler outputs. Contributed by Andras Gyori 2020-06-05 09:50:43 +02:00
Akira Ajisaka
9cfc0e50fa
HADOOP-17056. Addendum patch: Fix typo
(cherry picked from commit 5157118bd7f3448949da885e323c163828c35aee)
2020-06-04 16:35:27 +09:00
Akira Ajisaka
0b25913384
HADOOP-17062. Fix shelldocs path in Jenkinsfile (#2049)
(cherry picked from commit 704409d53bf7ebf717a3c2e988ede80f623bbad3)
2020-06-04 06:05:51 +09:00
Mehakmeet Singh
1714589609
HADOOP-17016. Adding Common Counters in ABFS (#1991).
Contributed by: Mehakmeet Singh.

Change-Id: Ib84e7a42f28e064df4c6204fcce33e573360bf42
2020-06-03 20:02:44 +01:00
Steve Loughran
8a642caca8
HADOOP-16568. S3A FullCredentialsTokenBinding fails if local credentials are unset. (#1441)
Contributed by Steve Loughran.

Move the loading to deployUnbonded (where they are required) and add a safety check when a new DT is requested

Change-Id: I03c69aa2e16accfccddca756b2771ff832e7dd58
2020-06-03 17:08:52 +01:00
Mike
cf84bec6e3 HADOOP-14566. Add seek support for SFTP FileSystem. (#1999)
Contributed by Mikhail Pryakhin
2020-06-03 11:38:49 +01:00
Akira Ajisaka
c88bf7acc1
HADOOP-17056. shelldoc fails in hadoop-common. (#2045)
In the docker build image, skip GPG verification when downloading
Yetus tarball via yetus-wrapper.

(cherry picked from commit 9c290c08db4361de29f392b0569312c2623b8321)
2020-06-03 18:03:03 +09:00
Szilard Nemeth
f65f64e8ae YARN-10254. CapacityScheduler incorrect User Group Mapping after leaf queue change. Contributed by Gergely Pollak 2020-06-02 18:32:06 +02:00
Dhiraj
910d88eeed
HADOOP-17052. NetUtils.connect() throws unchecked exception (UnresolvedAddressException) causing clients to abort (#2036)
Contributed by Dhiraj Hegde.

Signed-off-by: Mingliang Liu <liuml07@apache.org>
2020-06-01 10:50:22 -07:00