3951 Commits

Author SHA1 Message Date
Jason Lowe
1b0a110501 YARN-8804. resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues. Contributed by Tao Yang
(cherry picked from commit 6b988d821e62d29c118e10a7213583b92c302baf)
2018-09-26 17:06:07 -07:00
Arun Suresh
4d69741a61 YARN-7974. Allow updating application tracking url after registration. (Jonathan Hung via asuresh) 2018-09-26 00:08:10 -07:00
Giovanni Matteo Fumarola
6937925838 YARN-8696. [AMRMProxy] FederationInterceptor upgrade: home sub-cluster heartbeat async. Contributed by Botong Huang. 2018-09-24 11:40:07 -07:00
Giovanni Matteo Fumarola
60565976e1 YARN-8658. [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor. Contributed by Young Chen. 2018-09-21 10:36:36 -07:00
Eric E Payne
121cefd472 YARN-8709: CS preemption monitor always fails since one under-served queue was deleted. Contributed by Tao Yang.
(cherry picked from commit 987d8191ad409298570f7ef981e9bc8fb72ff16c)
2018-09-10 20:32:20 +00:00
Giovanni Matteo Fumarola
6c8fc8f786 HADOOP-15731. TestDistributedShell fails on Windows. Contributed by Botong Huang. 2018-09-07 14:20:28 -07:00
Eric E Payne
21aa7f1d82 YARN-8051: TestRMEmbeddedElector#testCallbackSynchronization is flakey. Contributed by Robert Kanter and Jason Lowe. 2018-08-29 21:30:38 +00:00
Giovanni Matteo Fumarola
bb0e75f72f YARN-8697. LocalityMulticastAMRMProxyPolicy should fallback to random sub-cluster when cannot resolve resource. Contributed by Botong Huang. 2018-08-28 16:07:31 -07:00
Giovanni Matteo Fumarola
d4a3be9591 HADOOP-15699. Fix some of testContainerManager failures in Windows. Contributed by Botong Huang. 2018-08-27 12:28:16 -07:00
Giovanni Matteo Fumarola
548a595027 YARN-8705. Refactor the UAM heartbeat thread in preparation for YARN-8696. Contributed by Botong Huang. 2018-08-27 11:26:31 -07:00
Jason Lowe
c4e3df2261 YARN-8649. NPE in localizer heartbeat processing if a container is killed while localizing. Contributed by lujie
(cherry picked from commit 585ebd873a55bedd2a364d256837f08ada8ba032)
2018-08-23 09:43:03 -05:00
Giovanni Matteo Fumarola
8e6807ef4a YARN-8673. [AMRMProxy] More robust responseId resync after an YarnRM master slave switch. Contributed by Botong Huang. 2018-08-21 13:09:33 -07:00
Giovanni Matteo Fumarola
89da0e9901 YARN-8581. [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy. Contributed by Botong Huang. 2018-08-21 13:04:49 -07:00
Rohith Sharma K S
1a53aab4d6 YARN-8129. Improve error message for invalid value in fields attribute. Contributed by Abhishek Modi.
(cherry picked from commit d3fef7a5c5b83d27e87b5e49928254a7d1b935e5)
2018-08-21 12:11:31 +05:30
Rohith Sharma K S
c68d1d49ca YARN-8679. [ATSv2] If HBase cluster is down for long time, high chances that NM ContainerManager dispatcher get blocked. Contributed by Wangda Tan.
(cherry picked from commit 4aacbfff605262aaf3dbd926258afcadc86c72c0)
2018-08-18 11:04:20 +05:30
Rohith Sharma K S
e2210a5175 YARN-8612. Fix NM Collector Service Port issue in YarnConfiguration. Contributed by Prabha Manepalli.
(cherry picked from commit 1697a0230696e1ed6d9c19471463b44a6d791dfa)
2018-08-17 11:17:30 +05:30
Jason Lowe
a44e53a314 YARN-8640. Restore previous state in container-executor after failure. Contributed by Jim Brennan
(cherry picked from commit d1d129aa9deecebf42261947fcb0b2ca46dacad5)
2018-08-14 10:33:27 -05:00
Jonathan Hung
7abffe4529 YARN-8559. Expose mutable-conf scheduler's configuration in RM /scheduler-conf endpoint. Contributed by Weiwei Yang. 2018-08-10 15:34:21 -07:00
Jason Lowe
2024260af6 YARN-8331. Race condition in NM container launched after done. Contributed by Pradeep Ambati
(cherry picked from commit cd04e954d2db27f0a15b7d1c492b7cdb656a51db)

Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java
2018-08-09 10:35:07 -05:00
Haibo Chen
1991a1d760 YARN-6966. NodeManager metrics may return wrong negative values when NM restart. (Szilard Nemeth via Haibo Chen) 2018-08-02 10:06:16 -07:00
Arun Suresh
e2b82b82e2 YARN-7542. Fix issue that causes some Running Opportunistic Containers to be recovered as PAUSED. (Sampada Dehankar via asuresh)
(cherry picked from commit a55884c68eb175f1c9f61771386c086bf1ee65a9)
(cherry picked from commit bd4dcc7772f9a6786e8ef4ef8fa97dfdd34d64d1)
2018-08-02 09:59:04 -07:00
Rohith Sharma K S
21e416ad27 YARN-8155. Improve ATSv2 client logging in RM and NM publisher. Contributed by Abhishek Modi. 2018-08-01 22:25:53 +05:30
Eric E Payne
3aa17bd737 YARN-4606. CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps. Contributed by Manikandan R
(cherry picked from commit 9485c9aee6e9bb935c3e6ae4da81d70b621781de)
2018-07-25 17:06:47 +00:00
Robert Kanter
edb9d8b554 YARN-8518. test-container-executor test_is_empty() is broken (Jim_Brennan via rkanter)
(cherry picked from commit 1bc106a738a6ce4f7ed025d556bb44c1ede022e3)
(cherry picked from commit 6e0db6fe1a8ce50977175567f2ba1f957e7b9c91)
2018-07-22 05:28:21 +00:00
Robert Kanter
92f02f97fd Only mount non-empty directories for cgroups (miklos.szegedi@cloudera.com via rkanter)
(cherry picked from commit 0838fe833738e04f5e6f6408e97866d77bebbf30)
(cherry picked from commit c1dc4ca2c6080377159157ce97bf5d72fa3285a1)
2018-07-22 05:28:20 +00:00
Robert Kanter
f5fd5aa025 Disable mounting cgroups by default (miklos.szegedi@cloudera.com via rkanter)
(cherry picked from commit 351cf87c92872d90f62c476f85ae4d02e485769c)
(cherry picked from commit d61d84279f7f22867c23dd95e8bfeb70ea7e0690)
2018-07-22 05:28:20 +00:00
Eric E Payne
8ee439d791 YARN-8421: when moving app, activeUsers is increased, even though app does not have outstanding request. Contributed by Kyungwan Nam
(cherry picked from commit 937ef39b3ff90f72392b7a319e4346344db34e03)
2018-07-16 17:01:38 +00:00
Jason Lowe
0e6efe06ea YARN-8515. container-executor can crash with SIGPIPE after nodemanager restart. Contributed by Jim Brennan
(cherry picked from commit 17118f446c2387aa796849da8b69a845d9d307d3)
2018-07-13 10:11:57 -05:00
Sunil G
6cc5d49fa3 YARN-8473. Containers being launched as app tears down can leave containers in NEW state. Contributed by Jason Lowe.
(cherry picked from commit 705e2c1f7cba51496b0d019ecedffbe5fb55c28b)
2018-07-10 20:13:56 +05:30
Giovanni Matteo Fumarola
aab9bfc13c YARN-7899. [AMRMProxy] Stateful FederationInterceptor for pending requests. Contributed by Botong Huang. 2018-07-09 16:47:44 -07:00
Giovanni Matteo Fumarola
64baa9ec89 YARN-8481. AMRMProxyPolicies should accept heartbeat response from new/unknown subclusters. Contributed by Botong Huang. 2018-06-29 11:51:50 -07:00
Jason Lowe
14c7dc3c1e YARN-8451. Multiple NM heartbeat thread created when a slow NM resync with RM. Contributed by Botong Huang
(cherry picked from commit 100470140d86eede0fa240a9aa93226f274ee4f5)
2018-06-29 13:17:14 -05:00
Sunil G
33e6eec7b8 YARN-8401. [UI2] new ui is not accessible without internet connection. Contributed by Bibin A Chundatt.
(cherry picked from commit fbaff369e9b9022723a7b2c6f25e71122a8f8a15)
2018-06-27 10:36:31 -07:00
Rohith Sharma K S
f359b7e5e9 YARN-8457. Compilation is broken with -Pyarn-ui.
(cherry picked from commit 4ffe68a6f70ce01a5654da8991b4cdb35ae0bf1f)
2018-06-25 10:40:40 -07:00
Inigo Goiri
82874e7895 YARN-8412. Move ResourceRequest.clone logic everywhere into a proper API. Contributed by Botong Huang. 2018-06-21 18:25:30 -07:00
Sunil G
96a6798c1d YARN-8404. Timeline event publish need to be async to avoid Dispatcher thread leak in case ATS is down. Contributed by Rohith Sharma K S
(cherry picked from commit 6307962b932e0ee69ba61f5796388c175d79195a)
2018-06-13 16:10:57 +05:30
Inigo Goiri
85f3105e61 HADOOP-15529. ContainerLaunch#testInvalidEnvVariableSubstitutionType is not supported in Windows. Contributed by Giovanni Matteo Fumarola.
(cherry picked from commit 6e756e8a620e4d6dc3192986679060c52063489b)
2018-06-12 10:26:01 -07:00
Rohith Sharma K S
0af3bea05d YARN-8405. RM zk-state-store.parent-path ACLs has been changed since HADOOP-14773. Contributed by Íñigo Goiri.
(cherry picked from commit 2df73dace06cfd2b3193a14cd455297f8f989617)
2018-06-12 17:30:02 +05:30
Inigo Goiri
8be1640bf6 YARN-8370. Some Node Manager tests fail on Windows due to improper path/file separator. Contributed by Anbang Hu.
(cherry picked from commit 2b2f672022547e8c19658213ac5a4090bf5b6c72)
2018-06-11 19:27:34 -07:00
Inigo Goiri
b991b38f51 YARN-8359. Exclude containermanager.linux test classes on Windows. Contributed by Jason Lowe.
(cherry picked from commit 3492a1db2c0654ce5375360caa74a34f928f23be)
2018-06-07 17:11:12 -07:00
Robert Kanter
f97bd6bb7f YARN-4677. RMNodeResourceUpdateEvent update from scheduler can lead to race condition (wilfreds and gphillips via rkanter) 2018-06-04 15:59:27 -07:00
Sunil G
d47a525163 YARN-4781. Support intra-queue preemption for fairness ordering policy. Contributed by Eric Payne. 2018-06-02 08:30:39 +05:30
Miklos Szegedi
81ee9938c5 YARN-8310. Handle old NMTokenIdentifier, AMRMTokenIdentifier, and ContainerTokenIdentifier formats. Contributed by Robert Kanter. 2018-05-24 15:50:17 -07:00
Wangda Tan
911852e932 YARN-8068. Application Priority field causes NPE in app timeline publish when Hadoop 2.7 based clients to 2.8+ (Sunil G via wangda)
Change-Id: I7910bd1064a1b4dbbe2084080c060822ea6f3b48
(cherry picked from commit 9eef19b2ad78b8464da252d0e23c08675898b9d8)
2018-05-24 13:05:23 -05:00
Rohith Sharma K S
b0b32988d9 YARN-8346. Upgrading to 3.1 kills running containers with error 'Opportunistic container queue is full'. Contributed by Jason Lowe.
(cherry picked from commit 4cc0c9b0baa93f5a1c0623eee353874e858a7caa)
2018-05-24 12:27:18 +05:30
Inigo Goiri
aa3b20b762 YARN-8327. Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on Windows. Contributed by Giovanni Matteo Fumarola.
(cherry picked from commit f09dc73001fd5f3319765fa997f4b0ca9e8f2aff)
2018-05-23 16:01:37 -07:00
Inigo Goiri
8f43ade46a YARN-8344. Missing nm.stop() in TestNodeManagerResync to fix testKillContainersOnResync. Contributed by Giovanni Matteo Fumarola.
(cherry picked from commit e99e5bf104e9664bc1b43a2639d87355d47a77e2)
2018-05-23 14:17:20 -07:00
Wangda Tan
777743beb6 YARN-8232. RMContainer lost queue name when RM HA happens. (Hu Ziqian via wangda)
Change-Id: Ia21e1da6871570c993bbedde76ce32929e95970f
(cherry picked from commit 6b96a73bb0f0ad1c877a062b19091e3e15a33ec4)
2018-05-22 10:34:43 -05:00
Arun Suresh
113e2d6801 YARN-7900. [AMRMProxy] AMRMClientRelayer for stateful FederationInterceptor. (Botong Huang via asuresh) 2018-05-21 11:26:32 -07:00
Sunil G
19cf706711 YARN-8249. Few REST api's in RMWebServices are missing static user check. Contributed by Sunil G. 2018-05-16 12:18:25 +05:30