26116 Commits

Author SHA1 Message Date
slfan1989 37e213c3fc
YARN-11177. Support getNewReservation, submit / update/ Reservation API's for Federation. (#4764) 2022-09-01 16:35:20 -07:00
monthonk 20560401ec
HADOOP-18339. S3A storage class option only picked up when buffering writes to disk. (#4669)
Follow-up to HADOOP-12020 Support configuration of different S3 storage classes; 
S3 storage class is now set when buffering to heap/bytebuffers, and when
creating directory markers

Contributed by Monthon Klongklaew
2022-09-01 18:14:32 +01:00
Steve Vaughan 2dd8b1342e
HDFS-16755. TestQJMWithFaults.testUnresolvableHostName() can fail due to unexpected host resolution (#4833)
Use ".invalid" domain from IETF RFC 2606 to ensure that the host doesn't resolve.
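
A minimal sketch of the idea, not the test code from this patch: a host name under the reserved ".invalid" top-level domain should not be resolved by a compliant resolver. The class and method names here are hypothetical, and the outcome of the lookup depends on the local DNS setup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class InvalidDomainDemo {
    /** Builds a host name under the reserved ".invalid" top-level domain. */
    static String unresolvableHost(String label) {
        return label + ".invalid";
    }

    /** Returns true if the local resolver produces an address for the host. */
    static boolean resolves(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String host = unresolvableHost("journalnode");
        // A compliant resolver is expected to report that this does not resolve.
        System.out.println(host + " resolves: " + resolves(host));
    }
}
```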

Contributed by Steve Vaughan Jr
2022-09-01 14:00:15 +01:00
slfan1989 33edbed54c
YARN-11272. Federation StateStore: Support storage/retrieval of Reservations With Zk. (#4781) 2022-08-31 10:15:15 -07:00
Mukund Thakur 19830c98bc
HADOOP-18391. Improvements in VectoredReadUtils#readVectored() for direct buffers (#4787)
part of HADOOP-18103.

Contributed By: Mukund Thakur
2022-08-31 21:41:41 +05:30
9uapaw 84081a8cae MAPREDUCE-7409. Make shuffle key length configurable. Contributed by Ashutosh Gupta. 2022-08-31 17:32:51 +02:00
Steve Loughran c69e16b297
HADOOP-18410. S3AInputStream.unbuffer() does not release http connections (#4766)
HADOOP-16202 "Enhance openFile()" added asynchronous draining of the 
remaining bytes of an S3 HTTP input stream for those operations
(unbuffer, seek) where it could avoid blocking the active
thread.

This patch fixes the asynchronous stream draining so that it works
and returns the stream to the http pool. Without this, whenever
unbuffer() or seek() was called on a stream and an asynchronous
drain triggered, the connection was not returned; eventually
the pool would be empty and subsequent S3 requests would
fail with the message "Timeout waiting for connection from pool".

The root cause was that, even though the fields passed in to drain() were
converted to references through the methods, in the lambda expression
passed in to submit() they were direct references to the fields:

operation = client.submit(
 () -> drain(uri, streamStatistics,
       false, reason, remaining,
       object, wrappedStream));  /* here */

Those fields were only read during the async execution, at which
point they would have been set to null (or even to the values of a
subsequent read).
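
The capture behaviour described above can be reproduced in isolation. This is a sketch with hypothetical names (LambdaCaptureDemo, wrappedStream as a String), not the S3A code, and it uses Supplier instead of an executor to keep it short: a lambda that mentions a field captures `this` and reads the field only when it runs, while a lambda that mentions a local variable captures the value at creation time.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a stream class; not the S3A implementation. */
public class LambdaCaptureDemo {
    // Mutable field that the owner clears after handing off the drain task.
    private String wrappedStream = "stream-1";

    /** Problem case: the field is read when the lambda runs, not when it is created. */
    Supplier<String> drainReadingField() {
        return () -> wrappedStream;
    }

    /** Safe case: copy the field to a local first; the lambda captures that value. */
    Supplier<String> drainReadingSnapshot() {
        final String snapshot = wrappedStream;
        return () -> snapshot;
    }

    /** Simulates the owner clearing its field once the task has been queued. */
    void release() {
        wrappedStream = null;
    }

    public static void main(String[] args) {
        LambdaCaptureDemo demo = new LambdaCaptureDemo();
        Supplier<String> lateRead = demo.drainReadingField();
        Supplier<String> snapshot = demo.drainReadingSnapshot();
        demo.release(); // happens before either task executes
        System.out.println("field read late: " + lateRead.get()); // prints "field read late: null"
        System.out.println("snapshot: " + snapshot.get());        // prints "snapshot: stream-1"
    }
}
```

Packaging the captured values in a dedicated Callable object, as the patch does, has the same effect as the snapshot variant: the values are fixed when the task is constructed.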

A new SDKStreamDrainer class performs the draining; this is a Callable
and can be submitted directly to the executor pool.

The class is used in both the classic and prefetching s3a input streams.

Also, calling unbuffer() switches the S3AInputStream from adaptive
to random IO mode; that is, it is considered a cue that future
IO will not be sequential, whole-file reads.

Contributed by Steve Loughran.
2022-08-31 11:16:52 +01:00
Gautham B A c334ba89ad
HADOOP-18428. Parameterize platform toolset version (#4815)
* This PR adds an option
  use.platformToolsetVersion that
  makes the build systems use
  this platform toolset version.
* This also makes sure that
  win-vs-upgrade.cmd does not get
  executed when the
  use.platformToolsetVersion
  option is specified.
2022-08-30 22:41:03 +05:30
slfan1989 8a47ed6f84
YARN-11287. Fix NoClassDefFoundError: org/junit/platform/launcher/core/LauncherFactory after YARN-10793 (#4828)
Co-authored-by: slfan1989 <louj1988@@>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-30 20:41:04 +09:00
Masatake Iwasaki 22835be63d
HADOOP-18375. Fix failure of shelltest for hadoop_add_ldlibpath. (#4652) 2022-08-30 19:33:29 +09:00
Ashutosh Gupta 90dba8b614
YARN-11245. Upgrade JUnit from 4 to 5 in hadoop-yarn-csi (#4778)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-30 17:26:06 +09:00
Samrat 2c05015716
YARN-11196. NUMA support in DefaultContainerExecutor (#4742) 2022-08-30 10:39:41 +05:30
zhangshuyan0 71778a6cc5
HDFS-16735. Reduce the number of HeartbeatManager loops. (#4780). Contributed by Shuyan Zhang.
Signed-off-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-08-29 11:30:21 +08:00
slfan1989 c60a900583
YARN-11275. [Federation] Add batchFinishApplicationMaster in UAMPoolManager. (#4792) 2022-08-27 10:17:00 -07:00
slfan1989 0075ef15c2
YARN-8482. [Router] Add cache for fast answers to getApps. (#4769) 2022-08-27 10:14:55 -07:00
slfan1989 4031b0774e
YARN-11253. Add Configuration to delegationToken RemoverScanInterval. (#4751) 2022-08-27 10:02:59 -07:00
ZanderXu 5567154f71
HDFS-16734. RBF: fix some bugs when handling getContentSummary RPC (#4763) 2022-08-26 16:04:33 -07:00
slfan1989 f8b9dd911c
YARN-11219. [Federation] Add getAppActivities, getAppStatistics REST APIs for Router. (#4757) 2022-08-26 16:01:17 -07:00
Gautham B A 5736b34b2a
HDFS-16736. Link to Boost library in libhdfspp (#4782) 2022-08-26 09:11:44 -07:00
zhengchenyu 231a4468cd
HDFS-16732. [SBN READ] Avoid get location from observer when the block report is delayed (#4756)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
2022-08-25 10:37:25 -07:00
ahmarsuhail 7fb9c306e2
HADOOP-18382. AWS SDK v2 upgrade prerequisites (#4698)
This patch prepares the hadoop-aws module for a future
migration to using the v2 AWS SDK (HADOOP-18073)

That upgrade will be incompatible; this patch prepares
for it:
-marks some credential providers and other 
 classes and methods as @deprecated.
-updates site documentation
-reduces the visibility of the s3 client;
 other than for testing, it is kept private to
 the S3AFileSystem class.
-logs some warnings when deprecated APIs are used.

The warning messages are printed only once
per JVM's life. To disable them, set the
log level of org.apache.hadoop.fs.s3a.SDKV2Upgrade
to ERROR
 
Contributed by Ahmar Suhail
2022-08-25 17:36:48 +01:00
ZanderXu 1691cccc89
HDFS-16738. Invalid CallerContext caused NullPointerException (#4791) 2022-08-25 17:12:27 +08:00
Ayush Saxena 880686d1e3
Revert "HADOOP-18417. Upgrade to M7 of surefire plugin (#4795)"
This reverts commit 1ff121041c.
2022-08-25 03:44:49 +05:30
ZanderXu 8d4f51c432
HDFS-16728. RBF throw IndexOutOfBoundsException with disableNameServices (#4734). Contributed by ZanderXu.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-24 20:27:15 +05:30
slfan1989 75aff247ae
YARN-11240. Fix incorrect placeholder in yarn-module. (#4678). Contributed by fanshilun
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-24 20:06:36 +05:30
slfan1989 052d7f286e
HADOOP-18361. Update commons-net from 3.6 to 3.8.0. (#4683). Contributed by fanshilun.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-24 20:05:17 +05:30
Steve Loughran de37fd37d6
MAPREDUCE-7403. manifest-committer dynamic partitioning support. (#4728)
Declares its compatibility with Spark's dynamic
output partitioning by having the stream capability
"mapreduce.job.committer.dynamic.partitioning"

Requires a Spark release with SPARK-40034, which
does the probing before deciding whether to
accept or reject instantiation with
dynamic partition overwrite set.

This feature can be declared as supported by
any other PathOutputCommitter implementations
whose algorithm and destination filesystem
are compatible.

None of the S3A committers are compatible.

The classic FileOutputCommitter is, but it
does not declare itself as such out of our fear
of changing that code. The Spark-side code
will automatically infer compatibility if
the created committer is of that class or
a subclass.
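
The probing logic described above can be sketched as follows. Everything except the capability string is a hypothetical stand-in: the Capabilities interface and the three committer classes are local placeholders, not the Hadoop or Spark types.

```java
public class DynamicPartitioningProbe {
    /** Capability name quoted in this commit message. */
    static final String CAPABILITY = "mapreduce.job.committer.dynamic.partitioning";

    /** Hypothetical local stand-in for a capability-probing interface. */
    interface Capabilities {
        boolean hasCapability(String capability);
    }

    /** Stand-in for the classic committer: compatible, but declares nothing. */
    static class ClassicCommitter { }

    /** Stand-in for a committer that declares the capability. */
    static class DeclaringCommitter implements Capabilities {
        @Override
        public boolean hasCapability(String capability) {
            return CAPABILITY.equals(capability);
        }
    }

    /** Stand-in for a committer that is neither classic nor declaring. */
    static class OtherCommitter { }

    /** Accept if the committer is the classic class (or a subclass) or declares the capability. */
    static boolean supportsDynamicPartitioning(Object committer) {
        if (committer instanceof ClassicCommitter) {
            return true; // inferred from the class, no declaration needed
        }
        return committer instanceof Capabilities
            && ((Capabilities) committer).hasCapability(CAPABILITY);
    }

    public static void main(String[] args) {
        System.out.println(supportsDynamicPartitioning(new ClassicCommitter()));   // true
        System.out.println(supportsDynamicPartitioning(new DeclaringCommitter())); // true
        System.out.println(supportsDynamicPartitioning(new OtherCommitter()));     // false
    }
}
```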

Contributed by Steve Loughran.
2022-08-24 11:18:19 +01:00
Steve Vaughan 1ff121041c
HADOOP-18417. Upgrade to M7 of surefire plugin (#4795)
This addresses an issue where the plugin's default classpath for executing tests fails to include org.junit.platform.launcher.core.LauncherFactory.

Contributed by: Steve Vaughan Jr
2022-08-24 11:04:04 +01:00
Simba Dzinamarira 4890ba5052
HADOOP-18406: Adds alignment context to call path for creating RPC proxy with multiple connections per user.
Fixes #4748

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-08-23 17:00:57 -07:00
ZanderXu c37f01d95b
HDFS-16724. RBF should support get the information about ancestor mount points (#4719) 2022-08-23 13:25:42 -07:00
Simba Dzinamarira a3b1bafa34
HDFS-16669: Enhance client protocol to propagate last seen state IDs for multiple nameservices.
Fixes #4584

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-08-23 11:12:50 -07:00
Steve Vaughan 6fbc38db95
HDFS-16686. GetJournalEditServlet fails to authorize valid Kerberos request (#4724) 2022-08-23 08:03:57 -07:00
ZanderXu 183f09b1da
HDFS-16717. Replace NPE with IOException in DataNode.class (#4699). Contributed by ZanderXu.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-08-23 18:17:32 +08:00
Viraj Jasani c249db80c2
HADOOP-18380. fs.s3a.prefetch.block.size to be read through longBytesOption (#4762)
Contributed by Viraj Jasani.
2022-08-23 10:49:04 +01:00
slfan1989 eda4bb5dcd
YARN-11250. Capture the Performance Metrics of ZookeeperFederationStateStore. (#4738) 2022-08-22 14:09:20 -07:00
Steve Vaughan 17daad34d4
HADOOP-18279. Cancel fileMonitoringTimer even if trustManager isn't defined (#4767) 2022-08-22 12:22:23 -07:00
Mukund Thakur 231e095802
HADOOP-18407. Improve readVectored() api spec (#4760)
part of HADOOP-18103.

Contributed By: Mukund Thakur
2022-08-22 23:19:29 +05:30
Steve Vaughan a9e5fb3313
HDFS-16684. Exclude the current JournalNode (#4723)
Exclude bound local addresses, including the use of a wildcard address in the bound host configurations.
* Allow sync attempts with unresolved addresses
* Update the comments.
* Remove unused import

Signed-off-by: stack <stack@apache.org>
2022-08-22 09:52:45 -07:00
Ashutosh Gupta c294a414b9
YARN-9425. Make initialDelay configurable for FederationStateStoreService#scheduledExecutorService (#4731). Contributed by groot and Shen Yinjie.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-22 03:40:00 +05:30
jianghuazhu 7f176d080c
HDFS-16729. RBF: fix some unreasonably annotated docs. (#4745)
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-21 07:29:31 +09:00
Clara Fang c870171182
YARN-11254. hadoop-minikdc dependency duplicated in hadoop-yarn-server-nodemanager (#4755)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-21 07:09:42 +09:00
Ashutosh Gupta b253b3be9f
YARN-11269. Upgrade JUnit from 4 to 5 in hadoop-yarn-server-timeline-pluginstorage (#4771)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-21 06:52:23 +09:00
slfan1989 f75c58a1ca
YARN-11252. Yarn Federation Router Supports Update / Delete Reservation in MemoryStore. (#4741) 2022-08-18 21:13:43 -07:00
Viraj Jasani 7f030250b4
HADOOP-18403. Fix FileSystem leak in ITestS3AAWSCredentialsProvider (#4737)
Contributed By: Viraj Jasani
2022-08-19 04:14:43 +05:30
Steve Vaughan b7d4dc61bf
HADOOP-18365. Update the remote address when a change is detected (#4692)
Avoid reconnecting to the old address after detecting that the address has been updated.

* Fix Checkstyle line length violation
* Keep ConnectionId as Immutable for map key

The ConnectionId is used as a key in the connections map, and updating the remoteId caused problems with the cleanup of connections when the removeMethod was used.

Instead of updating the address within the remoteId, use the removeMethod to clean up references to the current identifier and then replace it with a new identifier using the updated address.

* Use final to protect immutable ConnectionId

Mark non-test fields as private and final, and add a missing accessor.

* Use a stable hashCode to allow safe IP addr changes
* Add test that updated address is used

Once the address has been updated, it should be used in future calls.  Check to ensure that a second request succeeds and that it uses the existing updated address instead of having to re-resolve.
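
The map-key problem described above can be shown with a small sketch. MutableId is a hypothetical class, not the real ConnectionId: when a key's hashCode depends on a field that is changed while the key is in a HashMap, the entry can no longer be found or removed under that key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class MutableKeyDemo {
    /** Hypothetical key whose equals/hashCode depend on a mutable field. */
    static final class MutableId {
        String address;

        MutableId(String address) {
            this.address = address;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableId
                && Objects.equals(address, ((MutableId) o).address);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(address);
        }
    }

    /** Mutating the key in place: the later remove() misses the entry. */
    static boolean removeAfterMutation() {
        Map<MutableId, String> connections = new HashMap<>();
        MutableId id = new MutableId("10.0.0.1");
        connections.put(id, "conn");
        id.address = "10.0.0.2"; // hash changes, entry stays filed under the old hash
        return connections.remove(id) != null;
    }

    /** Remove under the old identity first, then register a fresh key. */
    static boolean removeThenReplace() {
        Map<MutableId, String> connections = new HashMap<>();
        MutableId oldId = new MutableId("10.0.0.1");
        connections.put(oldId, "conn");
        String conn = connections.remove(oldId);
        connections.put(new MutableId("10.0.0.2"), conn);
        return connections.size() == 1
            && connections.containsKey(new MutableId("10.0.0.2"));
    }

    public static void main(String[] args) {
        System.out.println("remove after mutation: " + removeAfterMutation()); // false
        System.out.println("remove then replace: " + removeThenReplace());     // true
    }
}
```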

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Signed-off-by: sokui
Signed-off-by: XanderZu
Signed-off-by: stack <stack@apache.org>
2022-08-18 09:21:23 -07:00
Ashutosh Gupta d09dd4a0b9
HADOOP-18385. ITestS3ACannedACLs failure; fixed by adding in a span (#4736)
Contributed by Ashutosh Gupta
2022-08-18 13:57:43 +01:00
Steve Loughran 682931a6ac
HADOOP-18028. High performance S3A input stream (#4752)
This is the preview release of the HADOOP-18028 S3A performance input stream.
It is still stabilizing, but ready to test.

Contains

HADOOP-18028. High performance S3A input stream (#4109)
	Contributed by Bhalchandra Pandit.

HADOOP-18180. Replace use of twitter util-core with java futures (#4115)
	Contributed by PJ Fanning.

HADOOP-18177. Document prefetching architecture. (#4205)
	Contributed by Ahmar Suhail

HADOOP-18175. fix test failures with prefetching s3a input stream (#4212)
	Contributed by Monthon Klongklaew

HADOOP-18231.  S3A prefetching: fix failing tests & drain stream async.  (#4386)

	* adds in new test for prefetching input stream
	* creates streamStats before opening stream
	* updates numBlocks calculation method
	* fixes ITestS3AOpenCost.testOpenFileLongerLength
	* drains stream async
	* fixes failing unit test

	Contributed by Ahmar Suhail

HADOOP-18254. Disable S3A prefetching by default. (#4469)
	Contributed by Ahmar Suhail

HADOOP-18190. Collect IOStatistics during S3A prefetching (#4458)

	This adds iOStatisticsConnection to the S3PrefetchingInputStream class, with
	new statistic names in StreamStatistics.

	This stream is not (yet) IOStatisticsContext aware.

	Contributed by Ahmar Suhail

HADOOP-18379 rebase feature/HADOOP-18028-s3a-prefetch to trunk
HADOOP-18187. Convert s3a prefetching to use JavaDoc for fields and enums.
HADOOP-18318. Update class names to be clear they belong to S3A prefetching
	Contributed by Steve Loughran
2022-08-18 13:53:06 +01:00
slfan1989 cd72f7e042
YARN-11224. [Federation] Add getAppQueue, updateAppQueue REST APIs for Router. (#4747) 2022-08-17 13:13:07 -07:00
Steve Vaughan e40b3a3089
HDFS-4043. Namenode Kerberos Login does not use proper hostname for host qualified hdfs principal name. (#4693) 2022-08-17 12:03:33 -07:00
Ashutosh Gupta 5cc8c574d1
HDFS-16676. DatanodeAdminManager$Monitor reports a node as invalid continuously (#4626)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-08-18 02:25:09 +08:00