* The current version of hadolint (1.11.1)
is only suitable for linting Linux
commands in Dockerfile. It fails to
recognize Windows command syntax.
* HADOOP-18133 adds Dockerfile for Windows.
Thus, it's essential to upgrade hadolint
to 2.10.0 which has the ability to
recognize Windows command syntax.
part of HADOOP-18103.
While merging the ranges in CheckSumFs, they are rounded up based on the
value of checksum bytes size which leads to some ranges crossing the EOF
thus they need to be fixed else it will cause EOFException during actual reads.
Contributed By: Mukund Thakur
Follow-up to HADOOP-12020 Support configuration of different S3 storage classes;
S3 storage class is now set when buffering to heap/bytebuffers, and when
creating directory markers
Contributed by Monthon Klongklaew
HADOOP-16202 "Enhance openFile()" added asynchronous draining of the
remaining bytes of an S3 HTTP input stream for those operations
(unbuffer, seek) where it could avoid blocking the active
thread.
This patch fixes the asynchronous stream draining to work and so
return the stream back to the http pool. Without this, whenever
unbuffer() or seek() was called on a stream and an asynchronous
drain triggered, the connection was not returned; eventually
the pool would be empty and subsequent S3 requests would
fail with the message "Timeout waiting for connection from pool"
The root cause was that even though the fields passed in to drain() were
converted to references through the methods, in the lambda expression
passed in to submit, they were direct references
operation = client.submit(
() -> drain(uri, streamStatistics,
false, reason, remaining,
object, wrappedStream)); /* here */
Those fields were only read during the async execution, at which
point they would have been set to null (or even a subsequent read).
A new SDKStreamDrainer class peforms the draining; this is a Callable
and can be submitted directly to the executor pool.
The class is used in both the classic and prefetching s3a input streams.
Also, calling unbuffer() switches the S3AInputStream from adaptive
to random IO mode; that is, it is considered a cue that future
IO will not be sequential, whole-file reads.
Contributed by Steve Loughran.
* This PR adds an option
use.platformToolsetVersion that
makes the build systems to use
this platform toolset version.
* This also makes sure that
win-vs-upgrade.cmd does not get
executed when the
use.platformToolsetVersion
option is specified.
This patch prepares the hadoop-aws module for a future
migration to using the v2 AWS SDK (HADOOP-18073)
That upgrade will be incompatible; this patch prepares
for it:
-marks some credential providers and other
classes and methods as @deprecated.
-updates site documentation
-reduces the visibility of the s3 client;
other than for testing, it is kept private to
the S3AFileSystem class.
-logs some warnings when deprecated APIs are used.
The warning messages are printed only once
per JVM's life. To disable them, set the
log level of org.apache.hadoop.fs.s3a.SDKV2Upgrade
to ERROR
Contributed by Ahmar Suhail