1941 Commits

Author SHA1 Message Date
Jason Lowe
17583e690a YARN-8804. resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues. Contributed by Tao Yang
(cherry picked from commit 6b988d821e62d29c118e10a7213583b92c302baf)
2018-09-26 16:54:29 -07:00
Weiwei Yang
4edcde055b Revert "YARN-8771. CapacityScheduler fails to unreserve when cluster resource contains empty resource type. Contributed by Tao Yang."
This reverts commit f56e36935da69b5b12086d80bfe1cdb446213882.
2018-09-19 19:47:06 +08:00
Weiwei Yang
f56e36935d YARN-8771. CapacityScheduler fails to unreserve when cluster resource contains empty resource type. Contributed by Tao Yang.
(cherry picked from commit 0712537e799bc03855d548d1f4bd690dd478b871)
2018-09-19 19:41:07 +08:00
Weiwei Yang
ef17728c58 YARN-8720. CapacityScheduler does not enforce max resource allocation check at queue level. Contributed by Tarun Parimi.
(cherry picked from commit f1a893fdbc2dbe949cae786f08bdb2651b88d673)
2018-09-14 16:43:28 +08:00
Eric E Payne
48dc8de28b YARN-8709: CS preemption monitor always fails since one under-served queue was deleted. Contributed by Tao Yang.
(cherry picked from commit 987d8191ad409298570f7ef981e9bc8fb72ff16c)
2018-09-10 20:18:14 +00:00
Haibo Chen
49590e7c6b YARN-8051. TestRMEmbeddedElector#testCallbackSynchronization is flaky. (Robert Kanter via Haibo Chen)
(cherry picked from commit 93d47a0ed504ee81d4b74d340c1815bdbb3c9b14)
2018-08-24 13:27:54 -05:00
Rohith Sharma K S
675aa2bbc0 YARN-8679. [ATSv2] If HBase cluster is down for long time, high chances that NM ContainerManager dispatcher get blocked. Contributed by Wangda Tan.
(cherry picked from commit 4aacbfff605262aaf3dbd926258afcadc86c72c0)
2018-08-18 11:06:14 +05:30
Jonathan Hung
0420ca5a6f YARN-8559. Expose mutable-conf scheduler's configuration in RM /scheduler-conf endpoint. Contributed by Weiwei Yang.
2018-08-10 15:22:11 -07:00
Sunil G
a3675f382a YARN-8397. Potential thread leak in ActivitiesManager. Contributed by Rohith Sharma K S.
(cherry picked from commit 6310c0d17d6422a595f856a55b4f1fb82be43739)
2018-08-01 08:36:12 +05:30
Jonathan Hung
0d6e1a2aab YARN-7974. Allow updating application tracking url after registration. Contributed by Jonathan Hung
2018-07-30 18:06:25 -07:00
Eric E Payne
299dffc72d YARN-4606. CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps. Contributed by Manikandan R
(cherry picked from commit 9485c9aee6e9bb935c3e6ae4da81d70b621781de)
2018-07-25 16:49:03 +00:00
Sunil G
1d8fce0d2f YARN-7748. TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted fails due to multiple container fail events. Contributed by Weiwei Yang.
(cherry picked from commit 35ce6eb1f526ce3db7e015fb1761eee15604100c)
2018-07-24 22:21:44 +05:30
bibinchundatt
1f713d6c66 YARN-8548. AllocationRespose proto setNMToken initBuilder not done. Contributed by Bilwa S T.
(cherry picked from commit ff7c2eda34c2c40ad71b50df6462a661bd213fbd)
2018-07-24 16:32:21 +05:30
Eric E Payne
5738bd8a10 YARN-8421: when moving app, activeUsers is increased, even though app does not have outstanding request. Contributed by Kyungwan Nam
(cherry picked from commit 937ef39b3ff90f72392b7a319e4346344db34e03)
2018-07-16 16:53:07 +00:00
Junping Du
0a6942d58c yarn.resourcemanager.fail-fast is used inconsistently. Contributed by Yuanbo Liu.
(cherry picked from commit d9ba6f3656e8dc97d2813181e27d12e52dca4328)
(cherry picked from commit 3d6ba2dd4e4f1c42522c837ca96ee5d13f1491a4)
2018-07-09 16:22:58 +08:00
Weiwei Yang
3151e95d27 YARN-8443. Total #VCores in cluster metrics is wrong when CapacityScheduler reserved some containers. Contributed by Tao Yang.
2018-06-25 09:46:32 +08:00
Rohith Sharma K S
6542e31b78 YARN-8155. Improve ATSv2 client logging in RM and NM publisher. Contributed by Abhishek Modi.
(cherry picked from commit 9119b3cf8f883aa2d5df534afc0c50249fed03c6)
2018-06-14 13:51:05 +05:30
Sunil G
d9b00cdd6b YARN-8404. Timeline event publish need to be async to avoid Dispatcher thread leak in case ATS is down. Contributed by Rohith Sharma K S
(cherry picked from commit 6307962b932e0ee69ba61f5796388c175d79195a)
2018-06-13 16:10:24 +05:30
Weiwei Yang
ef105abb70 YARN-8394. Improve data locality documentation for Capacity Scheduler. Contributed by Weiwei Yang.
(Cherry picked from commit 29024a62038c297f11e8992601f2522ffffc7da7)
2018-06-13 13:57:35 +08:00
Rohith Sharma K S
96e7b7e8ae YARN-8405. RM zk-state-store.parent-path ACLs has been changed since HADOOP-14773. Contributed by Íñigo Goiri.
(cherry picked from commit 2df73dace06cfd2b3193a14cd455297f8f989617)
2018-06-12 17:25:22 +05:30
Robert Kanter
d5e6d0d5f4 YARN-4677. RMNodeResourceUpdateEvent update from scheduler can lead to race condition (wilfreds and gphillips via rkanter)
(cherry picked from commit 0cd145a44390bc1a01113dce4be4e629637c3e8a)
2018-06-04 15:42:46 -07:00
Yongjun Zhang
f0de11ba98 Preparing for 3.0.4 development
2018-05-29 23:40:26 -07:00
Sunil G
521ada1a11 YARN-4781. Support intra-queue preemption for fairness ordering policy. Contributed by Eric Payne.
(cherry picked from commit 7c343669baf660df3b70d58987d6e68aec54d6fa)
2018-05-28 16:34:29 +05:30
Wangda Tan
a5a9c8cf0f YARN-8232. RMContainer lost queue name when RM HA happens. (Hu Ziqian via wangda)
Change-Id: Ia21e1da6871570c993bbedde76ce32929e95970f
(cherry picked from commit 6b96a73bb0f0ad1c877a062b19091e3e15a33ec4)
2018-05-22 10:34:15 -05:00
Eric E Payne
1b6c662546 YARN-8179: Preemption does not happen due to natural_termination_factor when DRF is used. Contributed by Kyungwan Nam.
(cherry picked from commit 0b4c44bdeef62945b592d5761666ad026b629c0b)
2018-05-21 20:31:39 +00:00
Weiwei Yang
4e46dc764f YARN-7003. DRAINING state of queues is not recovered after RM restart. Contributed by Tao Yang.
2018-05-11 11:04:50 +08:00
bibinchundatt
c198550cc7 YARN-8201. Skip stacktrace of few exception from ClientRMService. Contributed by Bilwa S T.
(cherry picked from commit cc0310a5266c8b8351f338f5fc8087a203c68cac)
2018-05-10 09:32:45 +05:30
Weiwei Yang
a0b7abf278 YARN-8025. UsersManangers#getComputedResourceLimitForActiveUsers throws NPE due to preComputedActiveUserLimit is empty. Contributed by Tao Yang.
(Cherry picked from commit 67f239c42f676237290d18ddbbc9aec369267692)
2018-05-07 13:59:50 +08:00
Rohith Sharma K S
317deaafc4 YARN-8217. RmAuthenticationFilterInitializer and TimelineAuthenticationFilterInitializer should use Configuration.getPropsWithPrefix instead of iterator. Contributed by Suma Shivaprasad.
(cherry picked from commit ee2ce923a922bfc3e89ad6f0f6a25e776fe91ffb)
2018-05-03 18:21:11 +05:30
Weiwei Yang
c2ed611885 YARN-8222. Fix potential NPE when gets RMApp from RM context. Contributed by Tao Yang.
(Cherry picked from commit 251f528814c4a4647cac0af6effb9a73135db180)
2018-05-02 18:07:34 +08:00
Weiwei Yang
f12c78120e YARN-8212. Pending backlog for async allocation threads should be configurable. Contributed by Tao Yang.
(Cherry picked from commit 2d319e37937c1e20c6a7dc4477ef88defd1f8464)
2018-05-01 10:19:08 +08:00
Rohith Sharma K S
8d3e6f99c1 YARN-8221. RMWebServices also need to honor yarn.resourcemanager.display.per-user-apps. Contributed by Sunil G.
(cherry picked from commit ef3ecc308dbea41c6a88bd4d16739c7bbc10cdda)
2018-04-27 22:59:48 +05:30
Sunil G
f1f3cbb4df YARN-8004. Add unit tests for inter queue preemption for dominant resource calculator. Contributed by Zian Chen.
(cherry picked from commit 71220d218db59cab0b03bbba427e5e9ef5b3003c)
2018-04-27 10:43:24 +05:30
Sunil G
62921db4b9 YARN-8205. Application State is not updated to ATS if AM launching is delayed. Contributed by Rohith Sharma K S.
(cherry picked from commit 1634de0fc1430d86b7688d16259a81462fba482f)
2018-04-27 10:28:41 +05:30
Wangda Tan
baa003035d YARN-8183. Fix ConcurrentModificationException inside RMAppAttemptMetrics#convertAtomicLongMaptoLongMap. (Suma Shivaprasad via wangda)
Change-Id: I347871d672001653a3afe2e99adefd74e0d798cd
(cherry picked from commit bb3c504764f807fccba7f28298a12e2296f284cb)
(cherry picked from commit 3043a93d461fd8b9ccc2ff4b8d17e5430ed77615)
2018-04-24 17:49:32 -07:00
Robert Kanter
101d9005dd HADOOP-15390. Yarn RM logs flooded by DelegationTokenRenewer trying to renew KMS tokens (xiaochen via rkanter)
(cherry picked from commit 7ab08a9c37a76edbe02d556fcfb2e637f45afc21)
2018-04-23 15:53:53 -07:00
Jason Lowe
4bff2df137 YARN-7786. NullPointerException while launching ApplicationMaster. Contributed by lujie
(cherry picked from commit 766544c0b008da9e78bcea6285b2c478653df75a)
2018-04-20 13:26:29 -05:00
Sunil G
070bee57a3 YARN-6827. [ATS1/1.5] NPE exception while publishing recovering applications into ATS during RM restart. Contributed by Rohith Sharma K S.
(cherry picked from commit 7d06806dfdeb3252ac0defe23e8c468eabfa8b5e)
2018-04-20 00:07:12 +05:30
Inigo Goiri
688933e6c7 YARN-8165. Incorrect queue name logging in AbstractContainerAllocator. Contributed by Weiwei Yan.
(cherry picked from commit dd5e18c4aecba56f140c3cc11affc2cb5e61c79d)
2018-04-17 15:29:39 +05:30
Lei Xu
3717df89ee Preparing for 3.0.3 development
2018-04-12 13:57:46 -07:00
Eric E Payne
43d2ee9de6 YARN-8147. TestClientRMService#testGetApplications sporadically fails. Contributed by Jason Lowe
(cherry picked from commit 18844599aef42f79d2af4500aa2eee472dda95cb)
2018-04-12 18:10:42 +00:00
Eric E Payne
081ea1ec39 YARN-8120. JVM can crash with SIGSEGV when exiting due to custom leveldb logger. Contributed by Jason Lowe.
(cherry picked from commit 6bb128dfb893cf0e4aa2d3ecc65440668a1fc8d7)
2018-04-12 16:22:48 +00:00
Vrushali C
ec258b8ed1 YARN-8073 TimelineClientImpl doesn't honor yarn.timeline-service.versions configuration. Contributed by Rohith Sharma K S
(cherry picked from commit 345e7624d58a058a1bad666bd1e5ce4b346a9056)
2018-04-11 09:49:25 +05:30
Wangda Tan
48778eee52 YARN-8068. Application Priority field causes NPE in app timeline publish when Hadoop 2.7 based clients to 2.8+ (Sunil G via wangda)
Change-Id: I7910bd1064a1b4dbbe2084080c060822ea6f3b48
(cherry picked from commit 9eef19b2ad78b8464da252d0e23c08675898b9d8)
2018-03-26 11:33:33 -07:00
Wangda Tan
a465026207 YARN-8062. yarn rmadmin -getGroups returns group from which the user has been removed. (Sunil G via wangda)
Change-Id: I80ed63846502bf7751b890b6c6c6a7c0679e2b4a
(cherry picked from commit 5d381570f83022b411a8740d58486a7f68ab2af6)
2018-03-26 11:33:24 -07:00
Weiwei Yang
7c286fe804 YARN-7636. Re-reservation count may overflow when cluster resource exhausted for a long time. contributed by Tao Yang.
2018-03-16 19:11:54 +08:00
Sunil G
09bb378c10 YARN-7947. Capacity Scheduler intra-queue preemption can NPE for non-schedulable apps. Contributed by Eric Payne.
(cherry picked from commit bdd2a184d78379d99c802a43ebec7d2cef0bbaf7)
2018-02-21 15:06:28 +05:30
Jason Lowe
85c611ad7d YARN-7813. Capacity Scheduler Intra-queue Preemption should be configurable for each queue. Contributed by Eric Payne
2018-02-19 14:38:49 -06:00
Eric Payne
55ded70716 Revert "YARN-7813: Capacity Scheduler Intra-queue Preemption should be configurable for each queue"
This reverts commit 36c9dda07f9d0ac14f0c979c7d519536fc9b3d28.
2018-02-14 14:38:31 -06:00
Eric Payne
36c9dda07f YARN-7813: Capacity Scheduler Intra-queue Preemption should be configurable for each queue
2018-02-14 08:58:28 -06:00