HADOOP-9629. Support Windows Azure Storage - Blob as a file system in Hadoop. Contributed by Dexter Bradshaw, Mostafa Elhemali, Xi Fang, Johannes Klein, David Lao, Mike Liddell, Chuan Liu, Lengning Liu, Ivan Mitic, Michael Rys, Alexander Stojanovic, Brian Swan, and Min Wei.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1601781 13f79535-47bb-0310-9956-ffa450edef68
2014-06-10 18:26:45 -04:00
<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements. See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<FindBugsFilter>
HADOOP-16916: ABFS: Delegation SAS generator for integration with Ranger
Contributed by Thomas Marquardt.
DETAILS:
Previously we had a SASGenerator class which generated Service SAS, but we need to add DelegationSASGenerator.
I separated SASGenerator into a base class and two subclasses, ServiceSASGenerator and DelegationSASGenerator. The
code in ServiceSASGenerator is copied from SASGenerator but the DelegationSASGenerator code is new. The
DelegationSASGenerator code demonstrates how to use Delegation SAS with minimal permissions, as would be used
by an authorization service such as Apache Ranger. Adding this to the tests helps us lock in this behavior.
Added a MockDelegationSASTokenProvider for testing User Delegation SAS.
Fixed the ITestAzureBlobFileSystemCheckAccess tests to assume oauth client ID so that they are ignored when that
is not configured.
To improve performance, AbfsInputStream/AbfsOutputStream re-use SAS tokens until the expiry is within 120 seconds.
After this a new SAS will be requested. The default period of 120 seconds can be changed using the configuration
setting "fs.azure.sas.token.renew.period.for.streams".
The SASTokenProvider operation names were updated to correspond better with the ADLS Gen2 REST API, since these
operations must be provided tokens with appropriate SAS parameters to succeed.
Support for the version 2.0 AAD authentication endpoint was added to AzureADAuthenticator.
The getFileStatus method was mistakenly calling the ADLS Gen2 Get Properties API which requires read permission
while the getFileStatus call only requires execute permission. ADLS Gen2 Get Status API is supposed to be used
for this purpose, so the underlying AbfsClient.getPathStatus API was updated with an includeProperties
parameter which is set to false for getFileStatus and true for getXAttr.
Added SASTokenProvider support for delete recursive.
Fixed bugs in AzureBlobFileSystem where public methods were not validating the Path by calling makeQualified. This is
necessary to avoid passing null paths and to convert relative paths into absolute paths.
Canonicalized the path used for root path internally so that root path can be used with SAS tokens, which requires
that the path in the URL and the path in the SAS token match. Internally the code was using
"//" instead of "/" for the root path, sometimes. Also related to this, the AzureBlobFileSystemStore.getRelativePath
API was updated so that we no longer remove and then add back a leading forward slash (/) on paths.
To run ITestAzureBlobFileSystemDelegationSAS tests follow the instructions in testing_azure.md under the heading
"To run Delegation SAS test cases". You also need to set "fs.azure.enable.check.access" to true.
TEST RESULTS:
namespace.enabled=true
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 41
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24
namespace.enabled=false
auth.type=SharedKey
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 244
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24
namespace.enabled=true
auth.type=SharedKey
sas.token.provider.type=MockDelegationSASTokenProvider
enable.check.access=true
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 33
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24
namespace.enabled=true
auth.type=OAuth
-------------------
$mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 1, Skipped: 74
Tests run: 206, Failures: 0, Errors: 0, Skipped: 140
2020-05-12 13:32:52 -04:00
  <!-- This reference equality check is an intentional light weight
       check to avoid re-validating the token when re-used. -->
  <Match>
    <Class name="org.apache.hadoop.fs.azurebfs.utils.CachedSASToken" />
    <Method name="update" />
    <Bug pattern="ES_COMPARING_PARAMETER_STRING_WITH_EQ" />
  </Match>

  <!-- This is intentional. The unsynchronized field access is safe
       and only synchronized access is used when using the sasToken
       for authorization. -->
  <Match>
    <Class name="org.apache.hadoop.fs.azurebfs.utils.CachedSASToken" />
    <Field name="sasToken" />
    <Bug pattern="IS2_INCONSISTENT_SYNC" />
  </Match>

2014-10-08 17:20:23 -04:00
  <!-- It is okay to skip up to end of file. No need to check return value. -->
  <Match>
    <Class name="org.apache.hadoop.fs.azure.AzureNativeFileSystemStore" />
    <Method name="retrieve" />
    <Bug pattern="SR_NOT_CHECKED" />
    <Priority value="2" />
  </Match>

  <!-- Returning fully loaded array to iterate through is a convenience
       and helps performance. -->
  <Match>
    <Class name="org.apache.hadoop.fs.azure.NativeAzureFileSystem$FolderRenamePending" />
    <Method name="getFiles" />
    <Bug pattern="EI_EXPOSE_REP" />
    <Priority value="2" />
  </Match>

  <!-- Need to start keep-alive thread for SelfRenewingLease in constructor. -->
  <Match>
    <Class name="org.apache.hadoop.fs.azure.SelfRenewingLease" />
    <Bug pattern="SC_START_IN_CTOR" />
    <Priority value="2" />
  </Match>

  <!-- Using a key set iterator is fine because this is not a performance-critical
       method. -->
  <Match>
    <Class name="org.apache.hadoop.fs.azure.PageBlobOutputStream" />
    <Method name="logAllStackTraces" />
    <Bug pattern="WMI_WRONG_MAP_ITERATOR" />
    <Priority value="2" />
  </Match>

2018-07-19 15:31:19 -04:00
  <!-- FileMetadata is used internally for storing metadata but also
       subclasses FileStatus to reduce allocations when listing a large number
       of files. When it is returned to an external caller as a FileStatus, the
       extra metadata is no longer useful and we want the equals and hashCode
       methods of FileStatus to be used. -->
  <Match>
    <Class name="org.apache.hadoop.fs.azure.FileMetadata" />
    <Bug pattern="EQ_DOESNT_OVERRIDE_EQUALS" />
  </Match>

2021-02-04 08:36:19 -05:00
  <!-- continuation is returned from an external http call. Keeping this
       outside synchronized block since the same is costly. -->
  <Match>
    <Class name="org.apache.hadoop.fs.azurebfs.services.AbfsListStatusRemoteIterator" />
    <Field name="continuation" />
    <Bug pattern="IS2_INCONSISTENT_SYNC" />
  </Match>

2021-10-22 02:15:42 -04:00
  <!-- This field is an instance of BlockBlobInputStream, and read(long, byte[], int, int)
       calls its superclass method when 'fs.azure.block.blob.buffered.pread.disable'
       is configured false. The superclass FSInputStream's implementation has
       proper synchronization.
       When 'fs.azure.block.blob.buffered.pread.disable' is true, we want a lock-free
       implementation of blob read. Here we don't use any of the InputStream's
       shared resources (buffer) and also don't change any cursor position etc.
       So it's safe to go with an unsynchronized read. -->
  <Match>
    <Class name="org.apache.hadoop.fs.azure.NativeAzureFileSystem$NativeAzureFsInputStream" />
    <Field name="in" />
    <Bug pattern="IS2_INCONSISTENT_SYNC" />
  </Match>

</FindBugsFilter>