This switches the default behavior of S3A output streams
to warning that Syncable.hsync() or hflush() have been
called; it's not considered an error unless the defaults
are overridden.
This avoids breaking applications which call the APIs,
at the risk of people trying to use S3 as a safe store
of streamed data (HBase WALs, audit logs etc).
Contributed by Steve Loughran.
Add support for S3 Access Points. This provides extra security as it
ensures applications are not working with buckets belong to third parties.
To bind a bucket to an access point, set the access point (ap) ARN,
which must be done for each specific bucket, using the pattern
fs.s3a.bucket.$BUCKET.accesspoint.arn = ARN
* The global/bucket option `fs.s3a.accesspoint.required` to
mandate that buckets must declare their access point.
* This is not compatible with S3Guard.
Consult the documentation for further details.
Contributed by Bogdan Stolojan
Addresses the problem of processes running out of memory when
there are many ABFS output streams queuing data to upload,
especially when the network upload bandwidth is less than the rate
data is generated.
ABFS Output streams now buffer their blocks of data to
"disk", "bytebuffer" or "array", as set in
"fs.azure.data.blocks.buffer"
When buffering via disk, the location for temporary storage
is set in "fs.azure.buffer.dir"
For safe scaling: use "disk" (default); for performance, when
confident that upload bandwidth will never be a bottleneck,
experiment with the memory options.
The number of blocks a single stream can have queued for uploading
is set in "fs.azure.block.upload.active.blocks".
The default value is 20.
Contributed by Mehakmeet Singh.
* HDFS-16129. Fixing the signature secret file misusage in HttpFS.
The signature secret file was not used in HttpFs.
- if the configuration did not contain the deprecated
httpfs.authentication.signature.secret.file option then it
used the random secret provider
- if both option (httpfs. and hadoop.http.) was set then
the HttpFSAuthenticationFilter could not read the file
because the file path was not substituted properly
!NOTE! behavioral change: the deprecated httpfs. configuration
values are overwritten with the hadoop.http. values.
The commit also contains a follow up change to the YARN-10814,
empty secret files will result in a random secret provider.
Co-authored-by: Tamas Domok <tdomok@cloudera.com>
This adds a new class org.apache.hadoop.util.Preconditions which is
* @Private/@Unstable
* Intended to allow us to move off Google Guava
* Is designed to be trivially backportable
(i.e contains no references to guava classes internally)
Please use this instead of the guava equivalents, where possible.
Contributed by: Ahmed Hussein
Change-Id: Ic392451bcfe7d446184b7c995734bcca8c07286e
This migrates the fs.s3a-server-side encryption configuration options
to a name which covers client-side encryption too.
fs.s3a.server-side-encryption-algorithm becomes fs.s3a.encryption.algorithm
fs.s3a.server-side-encryption.key becomes fs.s3a.encryption.key
The existing keys remain valid, simply deprecated and remapped
to the new values. If you want server-side encryption options
to be picked up regardless of hadoop versions, use
the old keys.
(the old key also works for CSE, though as no version of Hadoop
with CSE support has shipped without this remapping, it's less
relevant)
Contributed by: Mehakmeet Singh
This migrates the fs.s3a-server-side encryption configuration options
to a name which covers client-side encryption too.
fs.s3a.server-side-encryption-algorithm becomes fs.s3a.encryption.algorithm
fs.s3a.server-side-encryption.key becomes fs.s3a.encryption.key
The existing keys remain valid, simply deprecated and remapped
to the new values. If you want server-side encryption options
to be picked up regardless of hadoop versions, use
the old keys.
(the old key also works for CSE, though as no version of Hadoop
with CSE support has shipped without this remapping, it's less
relevant)
Contributed by: Mehakmeet Singh