Commit Graph

3593 Commits

Author SHA1 Message Date
Eric Badger 52791d2005 YARN-10540. Node page is broken in YARN UI1 and UI2 including RMWebService api for nodes. Contributed by Jim Brennan.
(cherry picked from commit 4c5d88e230)
2020-12-21 23:32:18 +00:00
Eric Payne 73cf0bdba1 YARN-10278: CapacityScheduler test framework ProportionalCapacityPreemptionPolicyMockFramework. Contributed by Szilard Nemeth (snemeth)
(cherry picked from commit 1184284baf)
2020-12-02 17:27:32 +00:00
Peter Bacsko a86c91bc06 YARN-10396. Max applications calculation per queue disregards queue level settings in absolute mode. Contributed by Benjamin Teke. 2020-11-16 11:54:19 +01:00
Eric E Payne c7b1bdba63 YARN-10479. RMProxy should retry on SocketTimeout Exceptions. Contributed by Jim Brennan (Jim_Brennan)
(cherry picked from commit 55339c2bdd)
2020-11-05 22:33:58 +00:00
Eric E Payne 052b9799c0 YARN-10475: Scale RM-NM heartbeat interval based on node utilization. Contributed by Jim Brennan (Jim_Brennan).
(cherry picked from commit 31154fdde5)
2020-11-02 17:59:58 +00:00
Jonathan Hung 42fab7897a YARN-10467. ContainerIdPBImpl objects can be leaked in RMNodeImpl.completedContainers. Contributed by Haibo Chen
(cherry picked from commit bab5bf9743)
(cherry picked from commit f95c0824b0)
(cherry picked from commit d0104e72c5)
2020-10-28 10:42:02 -07:00
Eric Badger 50eba43c7f YARN-10450. Add cpu and memory utilization per node and cluster-wide metrics. Contributed by Jim Brennan.
2020-10-16 19:01:50 +00:00
Jim Brennan 7bda43fbd5 YARN-9667. Container-executor.c duplicates messages to stdout. Contributed by Peter Bacsko
(cherry picked from commit e1c6804ace)
2020-10-08 21:13:39 +00:00
Jim Brennan 3a360b4bf1 YARN-10455. TestNMProxy.testNMProxyRPCRetry is not consistent. Contributed by Ahmed Hussein
(cherry picked from commit deb35a32ba)
2020-10-08 19:33:16 +00:00
Jim Brennan c36435c033 YARN-10451. RM (v1) UI NodesPage can NPE when yarn.io/gpu resource type is defined. Contributed by Eric Payne
(cherry picked from commit ecf91638a8)
2020-10-06 18:38:23 +00:00
Adam Antal b5e6ae97f4 YARN-10393. MR job live lock caused by completed state container leak in heartbeat between node manager and RM. Contributed by zhenzhao wang and Jim Brennan
(cherry picked from commit a1f7e760df)
2020-10-05 10:53:00 +02:00
Jim Brennan c83a3177dc YARN-10430. Log improvements in NodeStatusUpdaterImpl. Contributed by Bilwa S T.
(cherry picked from commit 1efb54bd52)
2020-09-15 16:37:19 +00:00
Eric E Payne 1a8d9ddd14 YARN-10390: LeafQueue: retain user limits cache across assignContainers() calls. Contributed by Samir Khan (samkhan).
(cherry picked from commit 9afec2ed17)
2020-09-11 17:07:09 +00:00
bibinchundatt daff39a601 YARN-10369. Make NMTokenSecretManagerInRM sending NMToken for nodeId DEBUG. Contributed by Jim Brennan.
(cherry picked from commit 5d8600e80a)
2020-09-08 21:47:00 +00:00
Eric Badger c3dbbd66b9 [YARN-10353] Log vcores used and cumulative cpu in containers monitor. Contributed by Jim Brennan
(cherry picked from commit 736bed6d6d)
(cherry picked from commit 01ada576f3)
2020-09-08 17:31:04 +00:00
Sunil G 2f7e584c21 Revert "YARN-10396. Max applications calculation per queue disregards queue level settings in absolute mode. Contributed by Benjamin Teke."
This reverts commit 7369fc31b4.
2020-08-20 19:13:03 +05:30
Sunil G 7369fc31b4 YARN-10396. Max applications calculation per queue disregards queue level settings in absolute mode. Contributed by Benjamin Teke.
(cherry picked from commit 82ec28f442)
2020-08-19 12:02:40 +05:30
Jonathan Hung c18a38bd07 YARN-10251. Show extended resources on legacy RM UI. Contributed by Eric Payne
(cherry picked from commit 17d18a2a3a)
2020-08-07 17:44:26 -07:00
Eric Badger 647be0c0f6 YARN-4575. ApplicationResourceUsageReport should return ALL reserved resource. Contributed by Bibin Chundatt and Eric Payne.
2020-08-05 23:14:27 +00:00
Eric E Payne 8d8366166e YARN-1529: Add Localization overhead metrics to NM. Contributed by Jim_Brennan.
(cherry picked from commit e0c9653166)
(cherry picked from commit 863689ff9a)
2020-07-30 17:27:35 +00:00
Jonathan Hung b0edec2918 YARN-10343. Legacy RM UI should include labeled metrics for allocated, total, and reserved resources. Contributed by Eric Payne
(cherry picked from commit ffb920de2a)
2020-07-28 13:44:40 -07:00
Eric Badger 2255bcff87 YARN-4771. Some containers can be skipped during log aggregation after NM restart. Contributed by Jason Lowe and Jim Brennan.
(cherry picked from commit ac5f21dbef)
2020-07-24 23:02:19 +00:00
Ayush Saxena 4592af898b HADOOP-17100. Replace Guava Supplier with Java8+ Supplier in Hadoop. Contributed by Ahmed Hussein. 2020-07-22 19:05:13 +05:30
Ahmed Hussein b7b9bd32db HADOOP-17099. Replace Guava Predicate with Java8+ Predicate
Signed-off-by: Jonathan Eagles <jeagles@gmail.com>
(cherry picked from commit 1f71c4ae71)
2020-07-15 12:19:42 -05:00
Eric Badger 4cf5c282d0 YARN-10348. Allow RM to always cancel tokens after app completes. Contributed by Jim Brennan.
(cherry picked from commit 09f1547697)
2020-07-14 18:27:23 +00:00
Eric E Payne ae4e6d1677 YARN-10297. TestContinuousScheduling#testFairSchedulerContinuousSchedulingInitTime fails intermittently. Contributed by Jim Brennan (Jim_Brennan)
(cherry picked from commit 0427100b75)
2020-07-13 21:37:37 +00:00
Masatake Iwasaki 4fa8055aa4 YARN-10347. Fix double locking in CapacityScheduler#reinitialize in branch-3.1. 2020-07-09 08:19:15 +09:00
Eric E Payne b03e6fc44a YARN-9903: Support reservations continue looking for Node Labels. Contributed by Jim Brennan (Jim_Brennan).
(cherry picked from commit e6794f2fc4)
2020-06-29 19:37:25 +00:00
Szilard Nemeth 59b20a1ebf YARN-10295. CapacityScheduler NPE can cause apps to get stuck without resources. Contributed by Benjamin Teke 2020-06-10 18:15:08 +02:00
Eric E Payne 2e4892061a YARN-10300: appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat. Contributed by Eric Badger (ebadger).
(cherry picked from commit 56247db302)
(cherry picked from commit 034d458511)
2020-06-09 21:34:20 +00:00
Jonathan Hung aa19cb20ea YARN-6492. Generate queue metrics for each partition. Contributed by Manikandan R
(cherry picked from commit c30c23cb66)
(cherry picked from commit 7a323a45aa)
(cherry picked from commit a80595a6deb3124a3d6d99057e9d5298cd7237d8)
2020-05-29 10:56:31 -07:00
Jonathan Hung b3e9aff5f7 YARN-10260. Allow transitioning queue from DRAINING to RUNNING state. Contributed by Bilwa S T
(cherry picked from commit fff1d2c122)
(cherry picked from commit 564d3211f2)
(cherry picked from commit a7ea55e015)
2020-05-12 10:53:23 -07:00
Ahmed Hussein 4e29738f4c YARN-8959. TestContainerResizing fails randomly (Ahmed Hussein via jeagles)
Signed-off-by: Jonathan Eagles <jeagles@gmail.com>
(cherry picked from commit 92e3ebb401)
2020-05-06 12:36:30 -05:00
Ahmed Hussein e25ac17e2b YARN-10256. Refactor TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic (Ahmed Hussein via jeagles)
Signed-off-by: Jonathan Eagles <jeagles@gmail.com>
(cherry picked from commit f5081a9a5d)
2020-05-04 11:11:14 -05:00
Gabor Bota ec6d2a8402 Preparing for 3.1.5 development
Change-Id: Iabc64aba7392e3b6f9e4e18109fcaa2cfc01d1f9
2020-04-29 11:18:18 +02:00
Jonathan Hung 6dc1f7e154 YARN-9954. Configurable max application tags and max tag length. Contributed by Bilwa S T
(cherry picked from commit 49ae9b2137)
(cherry picked from commit d1af4e0fae)
2020-04-17 10:36:26 -07:00
Jonathan Hung 6271a2852e YARN-10212. Create separate configuration for max global AM attempts. Contributed by Bilwa S T
(cherry picked from commit 57659422abbf6d9bf52e6e27fca775254bb77a56)
(cherry picked from commit e3a52804b03d646f15048c078f8c5292d5cbecfa)
(cherry picked from commit 54599b177c)
2020-04-09 10:44:39 -07:00
Jonathan Hung 9c6dd8c83a YARN-10200. Add number of containers to RMAppManager summary
(cherry picked from commit 2de0572cdc1c6fdbfaab108b169b2d5b0c077e86)
(cherry picked from commit 5d3fb0ebe9)
2020-03-25 10:33:18 -07:00
Eric E Payne 80e394e81b YARN-942. TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers fails sporadically Contributed by Ahmed Hussein (ahussein)
(cherry picked from commit ede05b19d1)
2020-03-10 15:25:10 +00:00
Wei-Chiu Chuang 58b025c8f4 HADOOP-16882. Update jackson-databind to 2.9.10.2 in branch-3.1, branch-2.10. Contributed by Lisheng Sun. 2020-02-28 16:24:38 -08:00
Inigo Goiri 4924622e6e YARN-10161. TestRouterWebServicesREST is corrupting STDOUT. Contributed by Jim Brennan.
(cherry picked from commit a43510e21d)
2020-02-27 13:20:17 -08:00
Sunil G a6124cd2b8 YARN-10139. ValidateAndGetSchedulerConfiguration API fails when cluster max allocation > default 8GB. Contributed by Prabhu Joseph.
(cherry picked from commit 6526f95bd2)
2020-02-19 11:18:53 +05:30
Prabhu Joseph 00fde836f7 YARN-10022. RM Rest API to validate the CapacityScheduler Configuration change. Contributed by Kinga Marton.
(cherry-picked from commit 1ab9c692fa)
2020-02-11 22:23:07 +05:30
Jonathan Hung 5dfd1dcfe3 YARN-10116. Expose diagnostics in RMAppManager summary
(cherry picked from commit 314e2f9d2e)
(cherry picked from commit 147897da4b420b4749f3c7b410f4c329632c3352)
2020-02-05 11:17:24 -08:00
Eric Badger 4af7d14ce2 YARN-10084. Allow inheritance of max app lifetime / default app lifetime. Contributed by Eric Payne. 2020-01-29 22:45:02 +00:00
Abhishek Modi 868a0f5ef0 YARN-9790. Failed to set default-application-lifetime if maximum-application-lifetime is less than or equal to zero. Contributed by kyungwan nam.
(cherry picked from commit d2d963f3d4)
2020-01-23 15:50:28 +00:00
Szilard Nemeth 9638985428 YARN-7913. Improve error handling when application recovery fails with exception. Contributed by Wilfred Spiegelenburg 2020-01-22 16:30:59 +01:00
Sunil G af89b5b086 YARN-8373. RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH. Contributed by Wilfred Spiegelenburg. 2020-01-15 17:07:34 +05:30
Eric E Payne ec40c1f400 YARN-9018. Add functionality to AuxiliaryLocalPathHandler to return all locations to read for a given path. Contributed by Kuhu Shukla (kshukla)
(cherry picked from commit 93233a7d6e)
2020-01-09 17:41:10 +00:00
Eric Badger cba0bbe98c YARN-8672. TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out. Contributed by Chandni Singh and Jim Brennan. 2020-01-08 20:02:43 +00:00