Thomas Marquardt af98f32f7d
HADOOP-16916: ABFS: Delegation SAS generator for integration with Ranger
Contributed by Thomas Marquardt.

DETAILS:

Previously we had a SASGenerator class which generated Service SAS, but we need to add DelegationSASGenerator.
I separated SASGenerator into a base class and two subclasses, ServiceSASGenerator and DelegationSASGenerator.  The
code in ServiceSASGenerator is copied from SASGenerator, but the DelegationSASGenerator code is new.  The
DelegationSASGenerator code demonstrates how to use Delegation SAS with minimal permissions, as would be used
by an authorization service such as Apache Ranger.  Adding this to the tests helps us lock in this behavior.

Added a MockDelegationSASTokenProvider for testing User Delegation SAS.

Fixed the ITestAzureBlobFileSystemCheckAccess tests to assume an OAuth client ID is configured, so that they are
skipped when it is not.

To improve performance, AbfsInputStream/AbfsOutputStream re-use SAS tokens until the expiry is within 120 seconds.
After this a new SAS will be requested.  The default period of 120 seconds can be changed using the configuration
setting "fs.azure.sas.token.renew.period.for.streams".
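
The reuse rule described above can be sketched roughly as follows.  This is a minimal illustration and not the
actual AbfsInputStream/AbfsOutputStream code: the CachedSasToken class, its nested Token holder and the fetcher
callback are hypothetical names introduced only for this example.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Hypothetical sketch of "reuse the SAS token until its expiry is within the renew period".
final class CachedSasToken {
  /** Hypothetical holder pairing a token string with its expiry time. */
  static final class Token {
    final String value;
    final Instant expiry;

    Token(String value, Instant expiry) {
      this.value = value;
      this.expiry = expiry;
    }
  }

  private final Duration renewPeriod;    // e.g. 120 seconds, the default named above
  private final Supplier<Token> fetcher; // asks the token provider for a fresh SAS
  private Token current;

  CachedSasToken(Duration renewPeriod, Supplier<Token> fetcher) {
    this.renewPeriod = renewPeriod;
    this.fetcher = fetcher;
  }

  /** Returns the cached token, fetching a new one once expiry is within the renew period. */
  synchronized String get(Instant now) {
    if (current == null || !now.plus(renewPeriod).isBefore(current.expiry)) {
      current = fetcher.get();
    }
    return current.value;
  }
}
```

Passing the current time in as a parameter is a choice made for this sketch so the behavior is easy to exercise in
a test; a real stream would more likely read the clock itself.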

The SASTokenProvider operation names were updated to correspond better with the ADLS Gen2 REST API, since these
operations must be provided tokens with appropriate SAS parameters to succeed.

Support for the version 2.0 AAD authentication endpoint was added to AzureADAuthenticator.

The getFileStatus method was mistakenly calling the ADLS Gen2 Get Properties API which requires read permission
while the getFileStatus call only requires execute permission.  ADLS Gen2 Get Status API is supposed to be used
for this purpose, so the underlying AbfsClient.getPathStatus API was updated with an includeProperties
parameter which is set to false for getFileStatus and true for getXAttr.

Added SASTokenProvider support for delete recursive.

Fixed bugs in AzureBlobFileSystem where public methods were not validating the Path by calling makeQualified.  This is
necessary to avoid passing null paths and to convert relative paths into absolute paths.
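
The kind of qualification meant here can be illustrated with plain java.net.URI.  This is only a sketch under
stated assumptions: it does not use Hadoop's Path.makeQualified, the QualifySketch class and its qualify method are
hypothetical, and the container and account names in the comment are placeholders.

```java
import java.net.URI;
import java.util.Objects;

// Hypothetical sketch: reject null paths early and resolve relative paths
// against a working directory, so later code only ever sees absolute paths.
final class QualifySketch {
  /**
   * @param workingDir absolute directory URI ending in "/", for example
   *                   abfs://container@account.dfs.core.windows.net/user/alice/
   * @param path       absolute or relative path string
   */
  static URI qualify(URI workingDir, String path) {
    Objects.requireNonNull(path, "path must not be null");
    // URI.resolve leaves absolute paths alone and appends relative ones to the base.
    return workingDir.resolve(path).normalize();
  }
}
```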

Canonicalized the path used internally for the root path so that the root path can be used with SAS tokens, which
require that the path in the URL and the path in the SAS token match.  Internally the code sometimes used "//"
instead of "/" for the root path.  Relatedly, the AzureBlobFileSystemStore.getRelativePath API was updated so that
it no longer removes and then adds back a leading forward slash on paths.
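
To show why the "//" form matters, here is a small sketch of collapsing repeated slashes so that one path string
can be used both in the request URL and when building the SAS.  It is not the actual AzureBlobFileSystemStore
code; RootPathSketch and canonicalize are hypothetical names.

```java
// Hypothetical sketch: collapse runs of '/' so the root is always "/" and
// the path sent in the URL matches the path the SAS token was issued for.
final class RootPathSketch {
  static String canonicalize(String path) {
    // "//" becomes "/", and "/a//b" becomes "/a/b"
    return path.replaceAll("/{2,}", "/");
  }
}
```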

To run the ITestAzureBlobFileSystemDelegationSAS tests, follow the instructions in testing_azure.md under the heading
"To run Delegation SAS test cases".  You also need to set "fs.azure.enable.check.access" to true.

TEST RESULTS:

namespace.enabled=true
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 41
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=false
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 244
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=SharedKey
sas.token.provider.type=MockDelegationSASTokenProvider
enable.check.access=true
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 33
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24

namespace.enabled=true
auth.type=OAuth
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 1, Skipped: 74
Tests run: 206, Failures: 0, Errors: 0, Skipped: 140
2020-06-19 19:00:46 +00:00
.github HADOOP-15184. Add GitHub pull request template. (#1419) 2019-09-11 11:10:11 +09:00
dev-support HADOOP-17056. Addendum patch: Fix typo 2020-06-04 16:35:27 +09:00
hadoop-assemblies Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-build-tools Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-client-modules YARN-10314. YarnClient throws NoClassDefFoundError for WebSocketException with only shaded client jars (#2075) 2020-06-17 09:29:49 +05:30
hadoop-cloud-storage-project Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-common-project HADOOP-17020. Improve RawFileSystem Performance (#2063) 2020-06-17 16:16:30 +01:00
hadoop-dist Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-hdfs-project HDFS-15372. Files in snapshots no longer see attribute provider permissions. Contributed by Stephen O'Donnell. 2020-06-18 06:45:28 -07:00
hadoop-mapreduce-project HADOOP-17046. Support downstreams' existing Hadoop-rpc implementations using non-shaded protobuf classes (#2026) 2020-06-12 23:20:10 +05:30
hadoop-maven-plugins Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-minicluster Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-project HDFS-15330. Document the ViewFSOverloadScheme details in ViewFS guide. Contributed by Uma Maheswara Rao G. 2020-06-16 16:54:01 -07:00
hadoop-project-dist Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
hadoop-tools HADOOP-16916: ABFS: Delegation SAS generator for integration with Ranger 2020-06-19 19:00:46 +00:00
hadoop-yarn-project YARN-10281. Redundant QueuePath usage in UserGroupMappingPlacementRule and AppNameMappingPlacementRule. Contributed by Gergely Pollak 2020-06-17 14:36:08 +02:00
licenses HADOOP-15958. Revisiting LICENSE and NOTICE files. 2019-08-27 13:47:12 +09:00
licenses-binary HADOOP-15993. Upgrade Kafka to 2.4.0 in hadoop-kafka module. (#1796) 2020-01-09 16:24:58 +09:00
.gitattributes HADOOP-13598. Add eol=lf for unix format files in .gitattributes. Contributed by Yiqun Lin. 2016-09-14 11:14:31 +09:00
.gitignore HADOOP-17055. Remove residual code of Ozone (#2039) 2020-05-29 16:50:10 +09:00
BUILDING.txt HADOOP-16856. cmake is missing in the CentOS 8 section of BUILDING.txt. (#1841) 2020-02-12 21:17:33 +09:00
Jenkinsfile HADOOP-17062. Fix shelldocs path in Jenkinsfile (#2049) 2020-06-04 06:05:51 +09:00
LICENSE-binary HADOOP-17049. javax.activation-api and jakarta.activation-api define overlapping classes (#2027) 2020-05-22 11:20:16 +09:00
LICENSE.txt YARN-9561. Add C changes for the new RuncContainerRuntime. Contributed by Eric Badger 2019-12-09 01:25:10 +00:00
NOTICE-binary HADOOP-15958. Revisiting LICENSE and NOTICE files. 2019-08-27 13:47:12 +09:00
NOTICE.txt HADOOP-15958. Revisiting LICENSE and NOTICE files. 2019-08-27 13:47:12 +09:00
README.txt HADOOP-15958. Revisiting LICENSE and NOTICE files. 2019-08-27 13:47:12 +09:00
pom.xml Preparing for 3.3.1 development 2020-04-30 13:33:42 +09:00
start-build-env.sh HADOOP-16849. start-build-env.sh behaves incorrectly when username is numeric only. Contributed by Jihyun Cho. 2020-02-12 14:06:23 +09:00

README.txt

For the latest information about Hadoop, please visit our website at:

   http://hadoop.apache.org/

and our wiki, at:

   https://cwiki.apache.org/confluence/display/HADOOP/