Steve Loughran c69e16b297
HADOOP-18410. S3AInputStream.unbuffer() does not release http connections (#4766)
HADOOP-16202 "Enhance openFile()" added asynchronous draining of the 
remaining bytes of an S3 HTTP input stream for those operations
(unbuffer, seek) where it could avoid blocking the active
thread.

This patch fixes asynchronous stream draining so that the stream
is returned to the http pool. Without this, whenever
unbuffer() or seek() was called on a stream and an asynchronous
drain was triggered, the connection was not returned; eventually
the pool would be empty and subsequent S3 requests would
fail with the message "Timeout waiting for connection from pool".

The root cause was that although the fields passed in to drain() become
local references inside that method, within the lambda expression
passed in to submit() they were still direct references to the fields:

operation = client.submit(
 () -> drain(uri, streamStatistics,
       false, reason, remaining,
       object, wrappedStream));  /* here */

Those fields were only read during the async execution, at which
point they would have been set to null (or even to the stream of a
subsequent read).
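
The capture behaviour described above can be reproduced in isolation. The
following is a minimal sketch, not the S3A code: the class LazyCaptureDemo,
its methods and the String standing in for the wrapped stream are all
hypothetical. It shows that a lambda which names a field reads that field
when it runs, while a lambda which names a local copy keeps the value seen
when it was created.

```java
import java.util.concurrent.Callable;

public class LazyCaptureDemo {
    // Hypothetical stand-in for the wrappedStream field of an input stream.
    private String wrappedStream = "stream-1";

    /** The lambda captures "this"; the field is read only when call() runs. */
    public Callable<String> lazyDrain() {
        return () -> "draining " + wrappedStream;
    }

    /** The field is copied to a local first, so the lambda captures the value now. */
    public Callable<String> eagerDrain() {
        final String stream = wrappedStream;
        return () -> "draining " + stream;
    }

    /** Simulates the caller clearing the field after scheduling the drain. */
    public void release() {
        wrappedStream = null;
    }

    public static void main(String[] args) throws Exception {
        LazyCaptureDemo demo = new LazyCaptureDemo();
        Callable<String> lazy = demo.lazyDrain();
        Callable<String> eager = demo.eagerDrain();
        demo.release();
        System.out.println(lazy.call());   // prints "draining null"
        System.out.println(eager.call());  // prints "draining stream-1"
    }
}
```

Here the callables are invoked directly rather than through an executor so
that the ordering is deterministic; with a real thread pool the same race
exists but its outcome depends on scheduling.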

A new SDKStreamDrainer class performs the draining; this is a Callable
and can be submitted directly to the executor pool.
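
As an illustration of why a Callable avoids the problem, here is a minimal
sketch under stated assumptions. It is not the SDKStreamDrainer source: the
names DrainerSketch, StreamDrainer and drainAsync, the constructor arguments
and the Boolean result are all hypothetical. The point it shows is that the
stream and byte count are stored in final fields at construction time, so
later changes to the caller's own fields cannot affect the submitted task.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DrainerSketch {

    /** Hypothetical drainer: everything it needs is fixed in the constructor. */
    static final class StreamDrainer implements Callable<Boolean> {
        private final InputStream stream;
        private final long remaining;

        StreamDrainer(InputStream stream, long remaining) {
            this.stream = stream;
            this.remaining = remaining;
        }

        /** Reads up to the expected number of bytes, then closes the stream. */
        @Override
        public Boolean call() throws IOException {
            long drained = 0;
            try (InputStream in = stream) {
                while (drained < remaining && in.read() >= 0) {
                    drained++;
                }
            }
            // true only if every expected byte was consumed
            return drained == remaining;
        }
    }

    /** Submits a drainer to a single-thread pool and waits for its result. */
    public static boolean drainAsync(byte[] data, long remaining) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            InputStream in = new ByteArrayInputStream(data);
            Future<Boolean> result = pool.submit(new StreamDrainer(in, remaining));
            return result.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(drainAsync(new byte[16], 16)); // prints "true"
        System.out.println(drainAsync(new byte[4], 16));  // prints "false"
    }
}
```

Closing the stream inside call() is what would let a pooled connection be
released in a real client; in this sketch the stream is an in-memory buffer,
so only the capture and draining logic is exercised.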

The class is used in both the classic and prefetching s3a input streams.

Also, calling unbuffer() switches the S3AInputStream from adaptive
to random IO mode; that is, it is considered a cue that future
IO will not be sequential, whole-file reads.

Contributed by Steve Loughran.
2022-08-31 11:16:52 +01:00
.github HADOOP-17799. Improve the GitHub pull request template (#3277) 2021-08-14 21:16:15 +09:00
dev-support HADOOP-11867. Add a high-performance vectored read API. (#3904) 2022-06-22 17:29:32 +01:00
hadoop-assemblies HDFS-15346. FedBalance tool implementation. Contributed by Jinglun. 2020-06-18 13:33:25 +08:00
hadoop-build-tools HADOOP-17968 Migrate checkstyle module illegalimport to maven enforcer banned-illegal-imports (#3584) 2021-10-28 15:57:15 +09:00
hadoop-client-modules HADOOP-18332: remove rs-api dependency as it conflicts with jsr311-api (#4547) 2022-07-17 21:37:54 +05:30
hadoop-cloud-storage-project HADOOP-18159. Bump cos_api-bundle to 5.6.69 to update public-suffix-list.txt (#4444) 2022-06-15 20:03:26 +01:00
hadoop-common-project HADOOP-18428. Parameterize platform toolset version (#4815) 2022-08-30 22:41:03 +05:30
hadoop-dist Preparing for 3.4.0 development 2020-03-29 23:24:25 +05:30
hadoop-hdfs-project HDFS-16735. Reduce the number of HeartbeatManager loops. (#4780). Contributed by Shuyan Zhang. 2022-08-29 11:30:21 +08:00
hadoop-mapreduce-project MAPREDUCE-7403. manifest-committer dynamic partitioning support. (#4728) 2022-08-24 11:18:19 +01:00
hadoop-maven-plugins HADOOP-18294. Ensure build folder exists before writing checksum file.ProtocRunner#writeChecksums (#4446) 2022-07-12 20:15:26 +09:00
hadoop-minicluster HADOOP-18131. Upgrade maven enforcer plugin and relevant dependencies (#4000) 2022-03-08 17:27:04 +09:00
hadoop-project Revert "HADOOP-18417. Upgrade to M7 of surefire plugin (#4795)" 2022-08-25 03:44:49 +05:30
hadoop-project-dist HADOOP-18305. Release Hadoop 3.3.4: upstream changelog and jdiff files 2022-08-05 14:06:22 +01:00
hadoop-tools HADOOP-18410. S3AInputStream.unbuffer() does not release http connections (#4766) 2022-08-31 11:16:52 +01:00
hadoop-yarn-project YARN-11287. Fix NoClassDefFoundError: org/junit/platform/launcher/core/LauncherFactory after YARN-10793 (#4828) 2022-08-30 20:41:04 +09:00
licenses HADOOP-17144. Update Hadoop's lz4 to v1.9.2. Contributed by Hemanth Boyina. 2020-10-18 18:37:46 +05:30
licenses-binary HADOOP-15993. Upgrade Kafka to 2.4.0 in hadoop-kafka module. (#1796) 2020-01-09 16:24:58 +09:00
.asf.yaml HADOOP-17234. Addendum. Add .asf.yaml to allow github and jira integration. (#4686). Contributed by Ayush Saxena. 2022-08-05 08:34:56 +05:30
.gitattributes HADOOP-13598. Add eol=lf for unix format files in .gitattributes. Contributed by Yiqun Lin. 2016-09-14 11:14:31 +09:00
.gitignore YARN-10407. Add phantomjsdriver.log to gitignore. (#2244) 2020-09-01 10:44:55 +09:00
BUILDING.txt Update BUILDING.txt (#3811) 2021-12-22 13:08:14 +08:00
LICENSE-binary HADOOP-18361. Update commons-net from 3.6 to 3.8.0. (#4683). Contributed by fanshilun. 2022-08-24 20:05:17 +05:30
LICENSE.txt HADOOP-18044. Hadoop - Upgrade to jQuery 3.6.0 (#3791) 2022-01-12 11:40:32 +08:00
NOTICE-binary HADOOP-18068. upgrade AWS SDK to 1.12.132 (#3864) 2022-01-18 10:31:28 +00:00
NOTICE.txt HADOOP-15958. Revisiting LICENSE and NOTICE files. 2019-08-27 13:47:12 +09:00
README.txt HADOOP-15958. Revisiting LICENSE and NOTICE files. 2019-08-27 13:47:12 +09:00
pom.xml HADOOP-18297. Upgrade dependency-check-maven to 7.1.1 (#4449) 2022-07-03 04:04:55 +08:00
start-build-env.sh HADOOP-18052. Support Apple Silicon in start-build-env.sh (#3817) 2021-12-23 18:13:18 +09:00

README.txt

For the latest information about Hadoop, please visit our website at:

   http://hadoop.apache.org/

and our wiki, at:

   https://cwiki.apache.org/confluence/display/HADOOP/