3949 Commits

Author SHA1 Message Date
Giovanni Matteo Fumarola 6937925838 YARN-8696. [AMRMProxy] FederationInterceptor upgrade: home sub-cluster heartbeat async. Contributed by Botong Huang. 2018-09-24 11:40:07 -07:00
Giovanni Matteo Fumarola 60565976e1 YARN-8658. [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor. Contributed by Young Chen. 2018-09-21 10:36:36 -07:00
Eric E Payne 121cefd472 YARN-8709: CS preemption monitor always fails since one under-served queue was deleted. Contributed by Tao Yang.
(cherry picked from commit 987d8191ad)
2018-09-10 20:32:20 +00:00
Giovanni Matteo Fumarola 6c8fc8f786 HADOOP-15731. TestDistributedShell fails on Windows. Contributed by Botong Huang. 2018-09-07 14:20:28 -07:00
Eric E Payne 21aa7f1d82 YARN-8051: TestRMEmbeddedElector#testCallbackSynchronization is flakey. Contributed by Robert Kanter and Jason Lowe. 2018-08-29 21:30:38 +00:00
Giovanni Matteo Fumarola bb0e75f72f YARN-8697. LocalityMulticastAMRMProxyPolicy should fallback to random sub-cluster when cannot resolve resource. Contributed by Botong Huang. 2018-08-28 16:07:31 -07:00
Giovanni Matteo Fumarola d4a3be9591 HADOOP-15699. Fix some of testContainerManager failures in Windows. Contributed by Botong Huang. 2018-08-27 12:28:16 -07:00
Giovanni Matteo Fumarola 548a595027 YARN-8705. Refactor the UAM heartbeat thread in preparation for YARN-8696. Contributed by Botong Huang. 2018-08-27 11:26:31 -07:00
Jason Lowe c4e3df2261 YARN-8649. NPE in localizer hearbeat processing if a container is killed while localizing. Contributed by lujie
(cherry picked from commit 585ebd873a)
2018-08-23 09:43:03 -05:00
Giovanni Matteo Fumarola 8e6807ef4a YARN-8673. [AMRMProxy] More robust responseId resync after an YarnRM master slave switch. Contributed by Botong Huang. 2018-08-21 13:09:33 -07:00
Giovanni Matteo Fumarola 89da0e9901 YARN-8581. [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy. Contributed by Botong Huang. 2018-08-21 13:04:49 -07:00
Rohith Sharma K S 1a53aab4d6 YARN-8129. Improve error message for invalid value in fields attribute. Contributed by Abhishek Modi.
(cherry picked from commit d3fef7a5c5)
2018-08-21 12:11:31 +05:30
Rohith Sharma K S c68d1d49ca YARN-8679. [ATSv2] If HBase cluster is down for long time, high chances that NM ContainerManager dispatcher get blocked. Contributed by Wangda Tan.
(cherry picked from commit 4aacbfff60)
2018-08-18 11:04:20 +05:30
Rohith Sharma K S e2210a5175 YARN-8612. Fix NM Collector Service Port issue in YarnConfiguration. Contributed by Prabha Manepalli.
(cherry picked from commit 1697a02306)
2018-08-17 11:17:30 +05:30
Jason Lowe a44e53a314 YARN-8640. Restore previous state in container-executor after failure. Contributed by Jim Brennan
(cherry picked from commit d1d129aa9d)
2018-08-14 10:33:27 -05:00
Jonathan Hung 7abffe4529 YARN-8559. Expose mutable-conf scheduler's configuration in RM /scheduler-conf endpoint. Contributed by Weiwei Yang. 2018-08-10 15:34:21 -07:00
Jason Lowe 2024260af6 YARN-8331. Race condition in NM container launched after done. Contributed by Pradeep Ambati
(cherry picked from commit cd04e954d2)

Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java
2018-08-09 10:35:07 -05:00
Haibo Chen 1991a1d760 YARN-6966. NodeManager metrics may return wrong negative values when NM restart. (Szilard Nemeth via Haibo Chen) 2018-08-02 10:06:16 -07:00
Arun Suresh e2b82b82e2 YARN-7542. Fix issue that causes some Running Opportunistic Containers to be recovered as PAUSED. (Sampada Dehankar via asuresh)
(cherry picked from commit a55884c68e)
(cherry picked from commit bd4dcc7772)
2018-08-02 09:59:04 -07:00
Rohith Sharma K S 21e416ad27 YARN-8155. Improve ATSv2 client logging in RM and NM publisher. Contributed by Abhishek Modi. 2018-08-01 22:25:53 +05:30
Eric E Payne 3aa17bd737 YARN-4606. CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps. Contributed by Manikandan R
(cherry picked from commit 9485c9aee6)
2018-07-25 17:06:47 +00:00
Robert Kanter edb9d8b554 YARN-8518. test-container-executor test_is_empty() is broken (Jim_Brennan via rkanter)
(cherry picked from commit 1bc106a738)
(cherry picked from commit 6e0db6fe1a)
2018-07-22 05:28:21 +00:00
Robert Kanter 92f02f97fd Only mount non-empty directories for cgroups (miklos.szegedi@cloudera.com via rkanter)
(cherry picked from commit 0838fe8337)
(cherry picked from commit c1dc4ca2c6)
2018-07-22 05:28:20 +00:00
Robert Kanter f5fd5aa025 Disable mounting cgroups by default (miklos.szegedi@cloudera.com via rkanter)
(cherry picked from commit 351cf87c92)
(cherry picked from commit d61d84279f)
2018-07-22 05:28:20 +00:00
Eric E Payne 8ee439d791 YARN-8421: when moving app, activeUsers is increased, even though app does not have outstanding request. Contributed by Kyungwan Nam
(cherry picked from commit 937ef39b3f)
2018-07-16 17:01:38 +00:00
Jason Lowe 0e6efe06ea YARN-8515. container-executor can crash with SIGPIPE after nodemanager restart. Contributed by Jim Brennan
(cherry picked from commit 17118f446c)
2018-07-13 10:11:57 -05:00
Sunil G 6cc5d49fa3 YARN-8473. Containers being launched as app tears down can leave containers in NEW state. Contributed by Jason Lowe.
(cherry picked from commit 705e2c1f7c)
2018-07-10 20:13:56 +05:30
Giovanni Matteo Fumarola aab9bfc13c YARN-7899. [AMRMProxy] Stateful FederationInterceptor for pending requests. Contributed by Botong Huang. 2018-07-09 16:47:44 -07:00
Giovanni Matteo Fumarola 64baa9ec89 YARN-8481. AMRMProxyPolicies should accept heartbeat response from new/unknown subclusters. Contributed by Botong Huang. 2018-06-29 11:51:50 -07:00
Jason Lowe 14c7dc3c1e YARN-8451. Multiple NM heartbeat thread created when a slow NM resync with RM. Contributed by Botong Huang
(cherry picked from commit 100470140d)
2018-06-29 13:17:14 -05:00
Sunil G 33e6eec7b8 YARN-8401. [UI2] new ui is not accessible with out internet connection. Contributed by Bibin A Chundatt.
(cherry picked from commit fbaff369e9)
2018-06-27 10:36:31 -07:00
Rohith Sharma K S f359b7e5e9 YARN-8457. Compilation is broken with -Pyarn-ui.
(cherry picked from commit 4ffe68a6f7)
2018-06-25 10:40:40 -07:00
Inigo Goiri 82874e7895 YARN-8412. Move ResourceRequest.clone logic everywhere into a proper API. Contributed by Botong Huang. 2018-06-21 18:25:30 -07:00
Sunil G 96a6798c1d YARN-8404. Timeline event publish need to be async to avoid Dispatcher thread leak in case ATS is down. Contributed by Rohith Sharma K S
(cherry picked from commit 6307962b93)
2018-06-13 16:10:57 +05:30
Inigo Goiri 85f3105e61 HADOOP-15529. ContainerLaunch#testInvalidEnvVariableSubstitutionType is not supported in Windows. Contributed by Giovanni Matteo Fumarola.
(cherry picked from commit 6e756e8a62)
2018-06-12 10:26:01 -07:00
Rohith Sharma K S 0af3bea05d YARN-8405. RM zk-state-store.parent-path ACLs has been changed since HADOOP-14773. Contributed by Íñigo Goiri.
(cherry picked from commit 2df73dace0)
2018-06-12 17:30:02 +05:30
Inigo Goiri 8be1640bf6 YARN-8370. Some Node Manager tests fail on Windows due to improper path/file separator. Contributed by Anbang Hu.
(cherry picked from commit 2b2f672022)
2018-06-11 19:27:34 -07:00
Inigo Goiri b991b38f51 YARN-8359. Exclude containermanager.linux test classes on Windows. Contributed by Jason Lowe.
(cherry picked from commit 3492a1db2c0654ce5375360caa74a34f928f23be)
2018-06-07 17:11:12 -07:00
Robert Kanter f97bd6bb7f YARN-4677. RMNodeResourceUpdateEvent update from scheduler can lead to race condition (wilfreds and gphillips via rkanter) 2018-06-04 15:59:27 -07:00
Sunil G d47a525163 YARN-4781. Support intra-queue preemption for fairness ordering policy. Contributed by Eric Payne. 2018-06-02 08:30:39 +05:30
Miklos Szegedi 81ee9938c5 YARN-8310. Handle old NMTokenIdentifier, AMRMTokenIdentifier, and ContainerTokenIdentifier formats. Contributed by Robert Kanter. 2018-05-24 15:50:17 -07:00
Wangda Tan 911852e932 YARN-8068. Application Priority field causes NPE in app timeline publish when Hadoop 2.7 based clients to 2.8+ (Sunil G via wangda)
Change-Id: I7910bd1064a1b4dbbe2084080c060822ea6f3b48
(cherry picked from commit 9eef19b2ad)
2018-05-24 13:05:23 -05:00
Rohith Sharma K S b0b32988d9 YARN-8346. Upgrading to 3.1 kills running containers with error 'Opportunistic container queue is full'. Contributed by Jason Lowe.
(cherry picked from commit 4cc0c9b0ba)
2018-05-24 12:27:18 +05:30
Inigo Goiri aa3b20b762 YARN-8327. Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on Windows. Contributed by Giovanni Matteo Fumarola.
(cherry picked from commit f09dc73001)
2018-05-23 16:01:37 -07:00
Inigo Goiri 8f43ade46a YARN-8344. Missing nm.stop() in TestNodeManagerResync to fix testKillContainersOnResync. Contributed by Giovanni Matteo Fumarola.
(cherry picked from commit e99e5bf104)
2018-05-23 14:17:20 -07:00
Wangda Tan 777743beb6 YARN-8232. RMContainer lost queue name when RM HA happens. (Hu Ziqian via wangda)
Change-Id: Ia21e1da6871570c993bbedde76ce32929e95970f
(cherry picked from commit 6b96a73bb0)
2018-05-22 10:34:43 -05:00
Arun Suresh 113e2d6801 YARN-7900. [AMRMProxy] AMRMClientRelayer for stateful FederationInterceptor. (Botong Huang via asuresh) 2018-05-21 11:26:32 -07:00
Sunil G 19cf706711 YARN-8249. Few REST api's in RMWebServices are missing static user check. Contributed by Sunil G. 2018-05-16 12:18:25 +05:30
Haibo Chen e28af8b0eb YARN-8130 Race condition when container events are published for KILLED applications. (Rohith Sharma K S via Haibo Chen)
(cherry picked from commit 2d00a0c71b)
2018-05-15 11:58:56 +05:30
Vrushali C c2b05339cf YARN-8247 Incorrect HTTP status code returned by ATSv2 for non-whitelisted users. Contributed by Rohith Sharma K S
(cherry picked from commit 3c95ca4f21)
2018-05-14 11:30:00 +05:30