1140 Commits

Author SHA1 Message Date
Arun Suresh c53d45a687 YARN-3535. Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED (rohithsharma and peng.zhang via asuresh)
(cherry picked from commit 9b272ccae7)

Conflicts:
	hadoop-yarn-project/CHANGES.txt
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerImpl.java
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/SchedulerEventType.java
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
2015-12-14 23:50:55 -08:00
Zhihai Xu 70289432f7 YARN-3857: Memory leak in ResourceManager with SIMPLE mode. Contributed by mujunchao.
(cherry picked from commit 3a76a010b8)

Conflicts:
	hadoop-yarn-project/CHANGES.txt
2015-12-14 22:17:23 -08:00
Karthik Kambatla 843dac5353 YARN-2975. FSLeafQueue app lists are accessed without required locks. (kasha)
(cherry picked from commit 2abec14ec6)
2015-12-09 10:53:26 -08:00
Wangda Tan 5b063d6b7f YARN-4424. Fix deadlock in RMAppImpl. (Jian he via wangda)
(cherry picked from commit 7e4715186d)

Conflicts:
	hadoop-yarn-project/CHANGES.txt

(cherry picked from commit 7013f9d6cd)

Conflicts:
	hadoop-yarn-project/CHANGES.txt
2015-12-08 14:34:53 -08:00
Tsuyoshi Ozawa b345ffd7df YARN-4348. ZKRMStateStore.syncInternal shouldn't wait for sync completion for avoiding blocking ZK's event thread. (ozawa)
(cherry picked from commit 0460b8a8a3)
2015-12-08 13:41:17 +09:00
Jason Lowe 271875a426 YARN-3925. ContainerLogsUtils#getContainerLogFile fails to read container log files from full disks. Contributed by zhihai xu
(cherry picked from commit ff9c13e0a7)

Conflicts:

	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java
(cherry picked from commit e8410c0175)

Conflicts:

	hadoop-yarn-project/CHANGES.txt
2015-11-23 20:57:57 +00:00
Jason Lowe 5f05e5e5ba YARN-4344. NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations. Contributed by Varun Vasudev 2015-11-23 20:27:00 +00:00
Xuan 11c2326acb YARN-2859-addendum: fix the remaining issue from the previous patch. 2015-11-19 10:29:58 -08:00
Tsuyoshi Ozawa 6b27de0f36 YARN-4320. TestJobHistoryEventHandler fails as AHS in MiniYarnCluster no longer binds to default port 8188. Contributed by Varun Saxena.
(cherry picked from commit ce31b22739)
2015-11-06 00:19:41 -08:00
Tsuyoshi Ozawa 5a00b23106 YARN-4312. TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6 as some of the test cases time out. Contributed by Varun Saxena.
(cherry picked from commit 6636441911)
2015-11-05 10:40:23 -08:00
Xuan 9a97ff54e5 YARN-2859. ApplicationHistoryServer binds to default port 8188 in
MiniYARNCluster. Contributed by Vinod Kumar Vavilapalli

(cherry picked from commit 27414dac66)
(cherry picked from commit 9ce5069d16)
(cherry picked from commit 336be63dad)
2015-10-28 10:55:31 -07:00
Sangjin Lee 6466ead9e0 Preparing for 2.6.3 development 2015-10-21 10:58:06 -07:00
Tsuyoshi Ozawa b898f8014f YARN-3798. ZKRMStateStore shouldn't create new session without occurrance of SESSIONEXPIED. (ozawa and Varun Saxena) 2015-10-21 23:08:02 +09:00
Jason Lowe ac865de725 YARN-3896. RMNode transitioned from RUNNING to REBOOTED because its response id has not been reset synchronously. (Jun Gong via rohithsharmaks)
(cherry picked from commit feaf034994)

Conflicts:

	hadoop-yarn-project/CHANGES.txt
2015-10-08 16:39:46 +00:00
Jason Lowe 528b809d2d YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith
(cherry picked from commit a64dd3d24b)
2015-10-08 16:33:34 +00:00
Jason Lowe 1484ebb602 YARN-3802. Two RMNodes for the same NodeId are used in RM sometimes
after NM is reconnected. Contributed by zhihai xu
(cherry picked from commit 5b5bb8dcdc)

Conflicts:

	hadoop-yarn-project/CHANGES.txt
2015-10-08 16:01:20 +00:00
Jason Lowe 4770f190b8 YARN-3780. Should use equals when compare Resource in
RMNodeImpl#ReconnectNodeTransition. Contributed by zhihai xu.
(cherry picked from commit c7ee6c151c)

Conflicts:

	hadoop-yarn-project/CHANGES.txt
2015-10-08 15:37:08 +00:00
Jason Lowe 2ecd173426 YARN-4005. Completed container whose app is finished is possibly not removed from NMStateStore. Contributed by Jun Gong
(cherry picked from commit 38aed1a94e)

Conflicts:

	hadoop-yarn-project/CHANGES.txt
2015-10-08 15:16:11 +00:00
Jason Lowe 49335d9b2b YARN-3727. For better error recovery, check if the directory exists before using it for localization. Contributed by Zhihai Xu
(cherry picked from commit 854d25b0c3)

Conflicts:

	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestLocalResourcesTrackerImpl.java
(cherry picked from commit 493f072008)

Conflicts:

	hadoop-yarn-project/CHANGES.txt
2015-09-30 16:12:09 +00:00
Xuan 1828ba00be YARN-4087. Followup fixes after YARN-2019 regarding RM behavior when state-store error occurs. Contributed by Jian He
(cherry picked from commit 9f7fcb54e798cf4fda1ea7972dd96491976e1857)
2015-09-25 16:43:06 -07:00
Xuan d27f09c936 YARN-2019. Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore. Contributed by Jian He.
(cherry picked from commit db57d91ac91e895bcb9a23fa50af0b2fbcb1db5a)
2015-09-25 16:30:49 -07:00
Jian He c09bb46579 YARN-4101. RM should print alert messages if Zookeeper and Resourcemanager gets connection issue. Contributed by Xuan Gong
(cherry picked from commit 214fd1408c21f596d1d15217c11b58b34561aab7)
2015-09-25 16:26:18 -07:00
Jian He cc30002bc8 YARN-4092. Fixed UI redirection to print useful messages when both RMs are in standby mode. Contributed by Xuan Gong
(cherry picked from commit 6b3b487d3f4883a6e849c71886da52c4c4d9f0bf)
2015-09-25 16:25:13 -07:00
Sangjin Lee 4cb7dbaead Preparing for 2.6.2 development: mvn versions:set -DnewVersion=2.6.2 2015-09-25 15:51:13 -07:00
Hitesh Shah dba2b60fdd YARN-2890. MiniYarnCluster should turn on timeline service if configured to do so. Contributed by Mit Desai.
(cherry picked from commit 265ed1fe80)
(cherry picked from commit 55b794e7fa)
2015-09-15 17:30:15 -07:00
Vinod Kumar Vavilapalli 9c4a6e1270 Revert "YARN-2890. MiniYARNCluster should start the timeline server based on the configuration. Contributed by Mit Desai."
This reverts commit 8a47d1aa55.
2015-09-15 17:30:15 -07:00
Zhijie Shen d57c3f0a26 YARN-3544. Got back AM logs link on the RM web UI for a completed app. Contributed by Xuan Gong.
(cherry picked from commit 21bf2cdcb77f69abc906e6cd401a8fb221f250e9)
(cherry picked from commit c9ee316045)
2015-09-15 17:30:15 -07:00
Xuan 7af5d6b4ba YARN-3248. Display count of nodes blacklisted by apps in the web UI.
Contributed by Varun Vasudev

(cherry picked from commit 4728bdfa15)
(cherry picked from commit e26b6e55e9)
2015-09-15 17:30:06 -07:00
Jian He e914220ab9 YARN-3379. Fixed missing data in localityTable and ResourceRequests table in RM WebUI. Contributed by Xuan Gong
(cherry picked from commit 4e886eb9cb)

(cherry picked from commit 3f0c9e5fe3)
2015-09-14 12:54:01 -07:00
Zhijie Shen 3ab820e696 YARN-3740. Fixed the typo in the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS. Contributed by Xuan Gong.
(cherry picked from commit eb6bf91eea)
(cherry picked from commit 68cddb894a)
2015-09-11 11:53:51 -07:00
Xuan 85ec6eb37a YARN-3171. Sort by Application id, AppAttempt and ContainerID doesn't
work in ATS / RM web ui. Contributed by Naganarasimha G R

(cherry picked from commit 3ff1ba2a7b)
(cherry picked from commit b6eb36dbdc)
2015-09-11 11:49:11 -07:00
Zhijie Shen f4154bdee8 YARN-1884. Added nodeHttpAddress into ContainerReport and fixed the link to NM web page. Contributed by Xuan Gong.
(cherry picked from commit 85f6d67fa7)
(cherry picked from commit 426535007b)
2015-09-11 11:45:29 -07:00
Vinod Kumar Vavilapalli 3462a00dd2 Preparing for release 2.6.1: mvn versions:set -DnewVersion=2.6.1 2015-09-09 15:29:57 -07:00
Jian He 2b526ba757 YARN-4047. ClientRMService getApplications has high scheduler lock contention. Contributed by Jason Lowe
(cherry picked from commit 7a445fcfab)

(cherry picked from commit 703fa1b141)
2015-09-08 22:57:35 -07:00
Xuan d59bf81e08 YARN-3999. RM hangs on draing events. Contributed by Jian He
(cherry picked from commit 3ae716fa69)
(cherry picked from commit 2ebdf5bfce)
2015-09-08 22:57:35 -07:00
Jonathan Eagles 6ed2486c7e YARN-3978. Configurably turn off the saving of container info in Generic AHS (Eric Payne via jeagles)
(cherry picked from commit 3cd02b9522)
(cherry picked from commit 899df5bce0)
2015-09-08 22:57:34 -07:00
Jian He 92742b4402 YARN-2301. Improved yarn container command. Contributed by Naganarasimha G R
(cherry picked from commit 258623ff8b)

(cherry picked from commit 1d1e7682c9)
2015-09-08 22:57:28 -07:00
Jian He 2336264900 YARN-2918. RM should not fail on startup if queue's configured labels do not exist in cluster-node-labels. Contributed by Wangda Tan
(cherry picked from commit f489a4ec96)

(cherry picked from commit d817fbb34d)
2015-09-06 14:15:33 -07:00
Jian He ee2b6bc248 YARN-3124. Fixed CS LeafQueue/ParentQueue to use QueueCapacities to track capacities-by-label. Contributed by Wangda Tan
(cherry picked from commit 18a594257e)

(cherry picked from commit 1be2d64ddd)
2015-09-06 11:54:40 -07:00
Jian He 637e7f9e39 YARN-2694. Ensure only single node label specified in ResourceRequest. Contributed by Wangda Tan
(cherry picked from commit c1957fef29)

(cherry picked from commit 3ddafaa7c8)
2015-09-05 21:07:51 -07:00
Jian He 4c94f07140 YARN-3098. Created common QueueCapacities class in Capacity Scheduler to track capacities-by-labels of queues. Contributed by Wangda Tan
(cherry picked from commit 21d80b3dd9)

(cherry picked from commit c0b1311a93)
2015-09-05 20:54:20 -07:00
Jian He d9281fbbab YARN-3099. Capacity Scheduler LeafQueue/ParentQueue should use ResourceUsage to track used-resources-by-label. Contributed by Wangda Tan
(cherry picked from commit 86358221fc)

(cherry picked from commit cabf97ae4f)
2015-09-05 20:54:20 -07:00
Jian He b0ad553841 YARN-3092. Created a common ResourceUsage class to track labeled resource usages in Capacity Scheduler. Contributed by Wangda Tan
(cherry picked from commit 6f9fe76918)

(cherry picked from commit 61b4116b4b)
2015-09-05 20:54:19 -07:00
Jian He 419e18cb37 YARN-2978. Fixed potential NPE while getting queue info. Contributed by Varun Saxena
(cherry picked from commit dd57c2047b)

(cherry picked from commit c61e8a7bfa)
2015-09-05 20:54:19 -07:00
Jian He 88f022da24 YARN-2920. Changed CapacityScheduler to kill containers on nodes where node labels are changed. Contributed by Wangda Tan
(cherry picked from commit fdf042dfff)

(cherry picked from commit 411836b74c)
2015-09-05 20:54:18 -07:00
Wangda Tan 85d92721a4 YARN-3733. Fix DominantRC#compare() does not work as expected if cluster resource is empty. (Rohith Sharmaks via wangda)
(cherry picked from commit ebd797c48f)
(cherry picked from commit 78d626fa89)
2015-09-03 17:43:01 -07:00
Jian He f1b35ffd4c YARN-2637. Fixed max-am-resource-percent calculation in CapacityScheduler when activating applications. Contributed by Craig Welch
(cherry picked from commit c53420f583)

(cherry picked from commit 4931600030)
2015-09-03 17:40:24 -07:00
Jason Lowe ca7fe71000 YARN-3990. AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected. Contributed by Bibin A Chundatt
(cherry picked from commit 32e490b6c0)

(cherry picked from commit c31e3ba921)
(cherry picked from commit 07d31d4c0808a169f4770187d655f38aa105255c)
2015-09-03 14:40:20 -07:00
Jason Lowe fe5877a49e YARN-3850. NM fails to read files from full disks which can lead to container logs being lost and other issues. Contributed by Varun Saxena
(cherry picked from commit 40b256949a)

(cherry picked from commit 0221d19f4e)
(cherry picked from commit 87d2204f28f192a964c04a5fa1e2e31644d74b59)
2015-09-03 14:35:01 -07:00
Jason Lowe f21fb808f1 YARN-3832. Resource Localization fails on a cluster due to existing cache directories. Contributed by Brahma Reddy Battula
(cherry picked from commit 8d58512d6e)

(cherry picked from commit 15b1800b12)
(cherry picked from commit 38400507e3352d83c2a1f364de137366249b7983)
2015-09-03 14:26:43 -07:00