1966 Commits

Author SHA1 Message Date
Eric Payne
9ee5265fb3 YARN-10178: Global Scheduler async thread crash caused by 'Comparison method violates its general contract'. Contributed by Andras Gyori (gandras) and Qi Zhu (zhuqi). 2021-12-21 19:48:06 +00:00
Sunil G
29f81c6121 YARN-9984. FSPreemptionThread can cause NullPointerException while app is unregistered with containers running on a node. Contributed by Wilfred Spiegelenburg.
(cherry picked from commit 215f2052fc3b7e366e8bd1bd332663966fa9206c)

 Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSPreemptionThread.java
2021-11-29 14:35:47 +09:00
Shubham Gupta
484cac36fd YARN-10438. Handle null containerId in ClientRMService#getContainerReport() (#2313)
Co-authored-by: Shubham Gupta <gshubham@microsoft.com>
(cherry picked from commit e3cd627069c7d35b4638af3f2299a248eeca3923)
2021-11-29 14:21:56 +09:00
Ahmed Hussein
de120b16ad YARN-1115: Provide optional means for a scheduler to check real user ACLs. Contributed by Eric Payne (epayne) 2021-10-22 17:02:38 +00:00
Weiwei Yang
5f2047d491 YARN-8222. Fix potential NPE when gets RMApp from RM context. Contributed by Tao Yang.
(cherry picked from commit 251f528814c4a4647cac0af6effb9a73135db180)
2021-10-12 17:43:43 +00:00
Weiwei Yang
bdd396b26d YARN-8546. Resource leak caused by a reserved container being released more than once under async scheduling. Contributed by Tao Yang.
(cherry picked from commit 5be9f4a5d05c9cb99348719fe35626b1de3055db)
2021-10-08 16:08:45 +00:00
Weiwei Yang
dc03afc7df YARN-8127. Resource leak when async scheduling is enabled. Contributed by Tao Yang.
(cherry picked from commit 7eb783e2634d8c11fb646f1f2fdf597336325312)
2021-10-04 20:16:40 +00:00
Eric Badger
008bd8afc3 YARN-10935. AM Total Queue Limit goes below per-user AM Limit if parent is full. Contributed by Eric Payne. 2021-09-23 17:12:45 +00:00
Szilard Nemeth
b196130c29 YARN-10428. Zombie applications in the YARN queue using FAIR + sizebasedweight. Contributed by Guang Yang, Andras Gyori
(cherry picked from commit 79a46599f76e470527ad94b0894dacb28db01465)

 Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/TestFairOrderingPolicy.java

(cherry picked from commit 7aea2e1b5c24cd6e2dffbe6942f0dadb6a45c48f)
2021-09-01 13:16:30 +09:00
zhuqi-lucas
34acf9d4c8 YARN-10860. Make max container per heartbeat configs refreshable. Contributed by Eric Badger. 2021-07-21 15:35:45 +08:00
Jim Brennan
577ed175f9 YARN-10456. RM PartitionQueueMetrics records are named QueueMetrics in Simon metrics registry. Contributed by Eric Payne. 2021-07-15 15:21:02 +00:00
Jim Brennan
f7bcc58e0f YARN-10834. Intra-queue preemption: apps that don't use defined custom resource won't be preempted. Contributed by Eric Payne. 2021-06-29 14:22:39 +00:00
lujiefsi
13a2e751e0 YARN-10555. Missing access check before getAppAttempts (#2608)
Co-authored-by: lujie <lujie@foxmail.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit d92a25b790e5ad7d8e21fc3949cdd0f74d496b1b)
2021-05-17 19:47:03 +09:00
Eric Badger
7b3a6e96d9 YARN-10479. Can't remove all node labels after add node label without nodemanager port, broken by YARN-10647. Contributed by D M Murali Krishna Reddy

(cherry picked from commit 6857a05d6ac566a60336c0a28951f09ecda39f24)
2021-04-23 23:10:50 +00:00
Jim Brennan
33c4d4570d YARN-10697. Resources are displayed in bytes in UI for schedulers other than capacity. Contributed by Bilwa S T.
(cherry picked from commit 34e507cb8c11d3b6ee561fd4aabde6dadadcee00)
2021-03-23 19:05:57 +00:00
Eric Payne
d53ca0b887 YARN-10588. Percentage of queue and cluster is zero in WebUI. Contributed by Bilwa S T
(cherry picked from commit aa4c17b9d7af122163789a731ced05f740562e45)
2021-03-15 20:12:58 +00:00
Jonathan Hung
1d76a8e73f YARN-10651. CapacityScheduler crashed with NPE in AbstractYarnScheduler.updateNodeResource(). Contributed by Haibo Chen
(cherry picked from commit f348ab3f2f468751af329a1ffce4917cb000fcbf)
(cherry picked from commit be6e99963ded94adf6f447ff53f2ba66b99120ca)
(cherry picked from commit 6863a5bb8ace591de3374102920bba916dbebfda)
(cherry picked from commit eb6c08e423dd06bf37ff44665ffb98c97e26ad08)
2021-02-25 15:47:36 -08:00
Jim Brennan
4ed7b80b19 [YARN-10613] Config to allow Intra- and Inter-queue preemption to enable/disable conservativeDRF. Contributed by Eric Payne 2021-02-25 20:30:42 +00:00
Jim Brennan
d0562d6cd0 YARN-10500. TestDelegationTokenRenewer fails intermittently. (#2619) Contributed by Masatake Iwasaki 2021-02-11 22:45:08 +00:00
Eric Badger
7b4034cd88 YARN-6977. Node information is not provided for non am containers in RM logs. (Suma Shivaprasad via wangda)
Change-Id: I0c44d09a560446dee2ba68c2b9ae69fce0ec1d3e
(cherry picked from commit 8a42e922fad613f3cf1cc6cb0f3fa72546a9cc56)
(cherry picked from commit 958e8c0e257216c82f68fee726e5280a919da94a)
2021-02-08 20:04:56 +00:00
Jonathan Hung
6f436a6776 YARN-10467. ContainerIdPBImpl objects can be leaked in RMNodeImpl.completedContainers. Contributed by Haibo Chen 2020-10-28 10:45:34 -07:00
Eric Badger
c4b42fa1ae YARN-10450. Add cpu and memory utilization per node and cluster-wide metrics.
Contributed by Jim Brennan.
2020-10-16 19:29:04 +00:00
Jim Brennan
0bf270d2ed YARN-10451. RM (v1) UI NodesPage can NPE when yarn.io/gpu resource type is defined. Contributed by Eric Payne
(cherry picked from commit ecf91638a8d1f546d90e92d076f32cb5681ce944)
2020-10-06 18:46:08 +00:00
Masatake Iwasaki
f4e0c14fe9 Preparing for 2.10.2 development 2020-09-13 14:33:36 +09:00
Eric E Payne
e5bd8d2840 YARN-10177: Backport YARN-7307 to branch-2.10 Allow client/AM update supported resource types via YARN APIs 2020-09-04 18:23:08 +00:00
Eric E Payne
21788f9fd4 YARN-8459. Improve Capacity Scheduler logs to debug invalid states. Contributed by Wangda Tan and Jim Brennan. 2020-08-10 20:52:44 +00:00
Jonathan Hung
865828ae63 YARN-10251. Show extended resources on legacy RM UI. Contributed by Eric Payne 2020-08-07 17:45:04 -07:00
Eric Badger
9bf554deae YARN-4575. ApplicationResourceUsageReport should return ALL reserved resource.
Contributed by Bibin Chundatt and Eric Payne.

(cherry picked from commit 647be0c0f69887afd7ad63aa1dfa6373a4b65ae6)
2020-08-05 23:20:52 +00:00
Jonathan Hung
50e68e67b6 YARN-10343. Legacy RM UI should include labeled metrics for allocated, total, and reserved resources. Contributed by Eric Payne 2020-07-28 13:45:14 -07:00
Eric Badger
a4b419cdf5 YARN-10348. Allow RM to always cancel tokens after app completes. Contributed by Jim Brennan.

(cherry picked from commit 09f1547697d0aa51380a0351df6d77f54af074a0)
2020-07-14 18:33:56 +00:00
Eric E Payne
7190507aa2 YARN-10297. TestContinuousScheduling#testFairSchedulerContinuousSchedulingInitTime fails intermittently. Contributed by Jim Brennan (Jim_Brennan)
(cherry picked from commit 0427100b7543d412f4fafe631b7ace289662d28c)
2020-07-13 21:51:32 +00:00
Eric E Payne
76fa956d3b YARN-9903: Support reservations continue looking for Node Labels. Contributed by Jim Brennan (Jim_Brennan).
(cherry picked from commit e6794f2fc4d763459ce13ffa8db4c064bcb076dc)
2020-06-29 19:55:18 +00:00
Tao Yang
a91d4d612f YARN-8011. TestOpportunisticContainerAllocatorAMService#testContainerPromoteAndDemoteBeforeContainerStart fails intermittently. Contributed by Jim Brennan. 2020-06-12 10:56:24 +08:00
Eric E Payne
af324e3153 YARN-10300: appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat. Contributed by Eric Badger (ebadger).
(cherry picked from commit 56247db3022705635580c4d2f8b0abde109f954f)
(cherry picked from commit 034d458511692341636f0d2ef0574b7516c01ed6)
(cherry picked from commit 2e4892061a2ff1ae99ef5aacffd7b229dc3dac1b)
2020-06-09 22:16:16 +00:00
Jonathan Hung
b9a0f99966 YARN-6492. Generate queue metrics for each partition. Contributed by Manikandan R 2020-06-01 10:48:41 -07:00
Jonathan Hung
b8c88f6968 YARN-10260. Allow transitioning queue from DRAINING to RUNNING state. Contributed by Bilwa S T
(cherry picked from commit fff1d2c1226ec23841b04dd478e8b97f31abbeba)
(cherry picked from commit 564d3211f27c35bf3143a4bd1b3f8eeac2c6b01f)
(cherry picked from commit a7ea55e0156299ec8b80af1f3e681a3a7a31a3b4)
(cherry picked from commit b3e9aff5f7bcafea8b82f9b07719ff53d3ab2f12)
2020-05-12 10:53:37 -07:00
Ahmed Hussein
0f0707fb0d YARN-8959. TestContainerResizing fails randomly (Ahmed Hussein via jeagles)
Signed-off-by: Jonathan Eagles <jeagles@gmail.com>
2020-05-06 12:48:12 -05:00
Jonathan Hung
27ad054696 YARN-8193. YARN RM hangs abruptly (stops allocating resources) when running successive applications. (Zian Chen via wangda) 2020-04-30 12:16:15 -07:00
Wangda Tan
34804679e3 YARN-8369. Javadoc build failed due to 'bad use of >'. (Takanobu Asanuma via wangda)
Change-Id: I79a42154e8f86ab1c3cc939b3745024b8eebe5f4
(cherry picked from commit 17aa40f669f197d43387d67dc00040d14cd00948)
2020-04-19 12:51:56 +09:00
Jonathan Hung
ebce5c74e6 YARN-9954. Configurable max application tags and max tag length. Contributed by Bilwa S T
(cherry picked from commit cd6c10de442fc3a53c9ed5521ac1d944a6ac95c6)
(cherry picked from commit 2c79865b951d0fdea7f576ce31e310b4074ecedd)
2020-04-17 10:35:39 -07:00
Jonathan Hung
c0394c5434 YARN-10212. Create separate configuration for max global AM attempts. Contributed by Bilwa S T
(cherry picked from commit 57659422abbf6d9bf52e6e27fca775254bb77a56)
(cherry picked from commit e3a52804b03d646f15048c078f8c5292d5cbecfa)
(cherry picked from commit 54599b177c46ed511e096909bed0c4f17bca1fe0)
(cherry picked from commit 6271a2852ea70c54589ce554e6bfad2eb703fe86)
2020-04-09 11:07:59 -07:00
Jonathan Hung
a7556f1ec2 YARN-8213. Add Capacity Scheduler performance metrics. (Weiwei Yang via wangda) 2020-03-27 16:10:39 -07:00
Jonathan Hung
1c8529f030 YARN-10200. Add number of containers to RMAppManager summary
(cherry picked from commit 2de0572cdc1c6fdbfaab108b169b2d5b0c077e86)
(cherry picked from commit 5d3fb0ebe9d3f3395320b82a76194ba6fad01e00)
(cherry picked from commit 9c6dd8c83a29183d70cd4a69a8317a9303954cc1)
2020-03-25 10:39:45 -07:00
Jonathan Hung
c34c87b1a8 YARN-8292. Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative. Contributed by Wangda Tan and Eric Payne 2020-02-07 17:29:32 -08:00
Jonathan Hung
4fce8c8023 YARN-10116. Expose diagnostics in RMAppManager summary
(cherry picked from commit 314e2f9d2e888fae1e5bf669aeeead84a928d282)
(cherry picked from commit 147897da4b420b4749f3c7b410f4c329632c3352)
(cherry picked from commit fa35b8370ce14c9b8ee911b73fda380817b964fd)
2020-02-05 11:16:09 -08:00
Eric Badger
21970f6f67 YARN-10084. Allow inheritance of max app lifetime / default app lifetime. Contributed by Eric Payne. 2020-01-30 21:29:33 +00:00
Abhishek Modi
296786a647 YARN-9790. Failed to set default-application-lifetime if maximum-application-lifetime is less than or equal to zero. Contributed by kyungwan nam.
(cherry picked from commit d2d963f3d4819704351c04dbeb90fc8154488f91)
2020-01-23 17:12:25 +00:00
Eric E Payne
5cca5ca81b YARN-7387: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer fails intermittently. Contributed by Jim Brennan (Jim_Brennan)
(cherry picked from commit b1e07d27cc1a26be4e5ebd1ab7b03ef15032bef0)
2020-01-08 19:59:13 +00:00
Eric E Payne
2ae1b3568b YARN-10072: TestCSAllocateCustomResource failures. Contributed by Jim Brennan (Jim_Brennan)
(cherry picked from commit 6899be5a1729e49cff45090acd2cf4f54aeac089)
2020-01-08 18:04:12 +00:00
Eric Badger
cb5b80d6cb YARN-10009. In Capacity Scheduler, DRC can treat minimum user limit percent as a max when custom resource is defined. Contributed by Eric Payne. 2019-12-20 19:40:55 +00:00