Commit Graph

334 Commits

Author SHA1 Message Date
Jian He 613a783380 YARN-3503. Expose disk utilization percentage and bad local and log dir counts in NM metrics. Contributed by Varun Vasudev
(cherry picked from commit 674c7ef649)

Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
2015-04-21 21:06:06 -07:00
Jian He 6fed2c2a79 YARN-3354. Add node label expression in ContainerTokenIdentifier to support RM recovery. Contributed by Wangda Tan
(cherry picked from commit 1b89a3e173)
2015-04-15 14:03:29 -07:00
Junping Du 7c072bf092 YARN-3443. Create a 'ResourceHandler' subsystem to ease addition of support for new resource types on the NM. Contributed by Sidharta Seethana.
(cherry picked from commit 838b06ac87)
2015-04-13 18:37:39 -07:00
Vinod Kumar Vavilapalli d8e17c58bc YARN-3365. Enhanced NodeManager to support using the 'tc' tool via container-executor for outbound network traffic control. Contributed by Sidharta Seethana.
(cherry picked from commit b21c72777a)
2015-04-02 16:55:00 -07:00
Wangda Tan cba4ed1678 YARN-2495. Allow admin to specify labels from each NM (Distributed configuration for node label). (Naganarasimha G R via wangda)
(cherry picked from commit 2a945d24f7)
2015-03-30 12:05:54 -07:00
Ravi Prakash b1b4951452 YARN-3288. Document and fix indentation in the DockerContainerExecutor code
(cherry picked from commit e0ccea33c9)
2015-03-28 08:01:26 -07:00
Tsuyoshi Ozawa cbacf20755 YARN-3384. TestLogAggregationService.verifyContainerLogs fails after YARN-2777. Contributed by Naganarasimha G R.
(cherry picked from commit 82eda771e0)
2015-03-24 00:25:52 +09:00
Junping Du f40f17489c YARN-3269. Yarn.nodemanager.remote-app-log-dir could not be configured to fully qualified path. Contributed by Xuan Gong
(cherry picked from commit d81109e588)
2015-03-20 13:42:31 -07:00
Ravi Prakash 9f227ad696 YARN-3339. TestDockerContainerExecutor should pull a single image and not the entire centos repository. (Ravindra Kumar Naik via raviprak)
(cherry picked from commit 56085203c4)
2015-03-16 16:18:42 -07:00
Vinod Kumar Vavilapalli 53aa3a4d1f YARN-3154. Added additional APIs in LogAggregationContext to avoid aggregating running logs of application when rolling is enabled. Contributed by Xuan Gong.
(cherry picked from commit 863079bb87)
2015-03-12 13:33:42 -07:00
Jian He 6cef2c16de YARN-2190. Added CPU and memory limit options to the default container executor for Windows containers. Contributed by Chuan Liu
(cherry picked from commit 21101c01f2)
2015-03-06 14:18:56 -08:00
Junping Du 4a87a61fe9 YARN-2799. Cleanup TestLogAggregationService based on the change in YARN-90. Contributed by Zhihai Xu
(cherry picked from commit c33ae271c2)
2015-02-20 09:44:31 -08:00
cnauroth 48302e687a YARN-2899. Run TestDockerContainerExecutorWithMocks on Linux only. Contributed by Ming Ma.
(cherry picked from commit 6804d68901)
2015-02-13 21:59:14 -08:00
Junping Du 380cc4dbed YARN-2079. Recover NonAggregatingLogHandler state upon nodemanager restart. (Contributed by Jason Lowe)
(cherry picked from commit 04f5ef18f7)
2015-02-12 11:48:24 -08:00
Jason Lowe 38333c8f29 YARN-3074. Nodemanager dies when localizer runner tries to write to a full disk. Contributed by Varun Saxena
(cherry picked from commit b379972ab3)
2015-02-11 16:34:42 +00:00
Jason Lowe ca11ffa5de YARN-2809. Implement workaround for linux kernel panic when removing cgroup. Contributed by Nathan Roberts
(cherry picked from commit 3f5431a22f)
2015-02-10 17:28:18 +00:00
Arun C. Murthy 92ff524182 YARN-1537. Fix race condition in TestLocalResourcesTrackerImpl.testLocalResourceCache. Contributed by Xuan Gong. 2015-02-05 23:59:34 -08:00
Xuan c22dcdd191 YARN-3056. Add verification for containerLaunchDuration in TestNodeManagerMetrics. Contributed by Zhihai Xu

(cherry picked from commit b73e776abc)
2015-02-03 15:14:23 -08:00
Robert Kanter 410830fe8c YARN-3022. Expose Container resource information from NodeManager for monitoring (adhoot via ranter)
(cherry picked from commit f7a77819a1)
2015-02-03 10:39:51 -08:00
Jian He 8100c8a68c YARN-3011. Possible IllegalArgumentException in ResourceLocalizationService might lead NM to crash. Contributed by Varun Saxena
(cherry picked from commit 4e15fc0841)
2015-01-27 13:31:48 -08:00
Jason Lowe 07fe6a36cb YARN-3088. LinuxContainerExecutor.deleteAsUser can throw NPE if native executor returns an error. Contributed by Eric Payne
(cherry picked from commit 902c6ea7e4)
2015-01-26 15:41:23 +00:00
Xuan a7696b3fbf YARN-3024. LocalizerRunner should give DIE action when all resources are localized. Contributed by Chengbing Liu

(cherry picked from commit 0d6bd62102)
2015-01-25 19:39:52 -08:00
Tsuyoshi Ozawa ff627d94e7 YARN-3082. Non thread safe access to systemCredentials in NodeHeartbeatResponse processing. Contributed by Anubhav Dhoot.
(cherry picked from commit 3aab354e66)
2015-01-23 16:05:05 +09:00
Karthik Kambatla 4d8fa9615f YARN-2984. Metrics for container's actual memory usage. (kasha)
(cherry picked from commit 84198564ba)
2015-01-17 06:26:47 +05:30
Junping Du 7cddec31d7 YARN-3064. TestRMRestart/TestContainerResourceUsage/TestNodeManagerResync failure with allocation timeout. (Contributed by Jian He)
(cherry picked from commit 5d1cca34fa)
2015-01-16 00:10:36 -08:00
Jian He e7e6173049 YARN-2997. Fixed NodeStatusUpdater to not send already-sent completed container statuses on heartbeat. Contributed by Chengbing Liu
(cherry picked from commit cc2a745f7e)
2015-01-08 11:28:24 -08:00
Karthik Kambatla b4e8ae591d YARN-2675. containersKilled metrics is not updated when the container is killed during localization. (Zhihai Xu via kasha)
(cherry picked from commit 954fb8581e)
2014-12-19 16:03:02 -08:00
cnauroth 36068768d8 HADOOP-11321. copyToLocal cannot save a file to an SMB share unless the user has Full Control permissions. Contributed by Chris Nauroth.
(cherry picked from commit e996a1bfd4)
2014-12-16 15:32:23 -08:00
Steve Loughran a858d726c8 YARN-2912 Jersey Tests failing with port in use. (varun saxena via stevel) 2014-12-12 17:10:54 +00:00
Karthik Kambatla 9d72b0282f YARN-2931. PublicLocalizer may fail until directory is initialized by LocalizeRunner. (Anubhav Dhoot via kasha)
(cherry picked from commit db73cc9124)
2014-12-08 22:26:44 -08:00
Junping Du 86535ff65f YARN-1156. Enhance NodeManager AllocatedGB and AvailableGB metrics for aggregation of decimal values. (Contributed by Tsuyoshi OZAWA)
(cherry picked from commit e65b7c5ff6)
2014-12-03 04:12:35 -08:00
Karthik Kambatla af0b54a4ee YARN-2679. Add metric for container launch duration. (Zhihai Xu via kasha)
(cherry picked from commit 233b61e495)
2014-11-21 14:22:53 -08:00
Jason Lowe ad140d1fc8 YARN-2816. NM fail to start with NPE during container recovery. Contributed by Zhihai Xu
(cherry picked from commit 49c38898b0)
2014-11-14 21:27:16 +00:00
Karthik Kambatla ff1b13ded5 YARN-2236. [YARN-1492] Shared Cache uploader service on the Node Manager. (Chris Trezzo and Sanjin Lee via kasha)
(cherry picked from commit a04143039e)
2014-11-12 09:31:30 -08:00
Ravi Prakash d863f54f57 YARN-1964. Create Docker analog of the LinuxContainerExecutor in YARN 2014-11-11 21:29:27 -08:00
Xuan b3badf935a YARN-2841. RMProxy should retry EOFException. Contributed by Jian He
(cherry picked from commit 5c9a51f140)
2014-11-10 18:26:32 -08:00
Arun C. Murthy 175d222bfc YARN-2830. Add backwards compatible ContainerId.newInstance constructor. Contributed by Jonathan Eagles.
(cherry picked from commit 43cd07b408)
2014-11-09 15:03:59 -08:00
Jason Lowe a5764cb783 YARN-2825. Container leak on NM. Contributed by Jian He
(cherry picked from commit c3d475070a)
2014-11-07 23:17:34 +00:00
Zhijie Shen e06c23a6c9 YARN-2752. Made ContainerExecutor append "nice -n" arg only when priority adjustment flag is set. Contributed by Xuan Gong. 2014-11-04 15:50:10 -08:00
Vinod Kumar Vavilapalli 9c76dcadaf YARN-1922. Fixed NodeManager to kill process-trees correctly in the presence of races between the launch and the stop-container call and when root processes crash. Contributed by Billie Rinaldi.
(cherry picked from commit c5a46d4c8c)
2014-11-03 16:40:37 -08:00
Vinod Kumar Vavilapalli 715c81ef6d YARN-2788. Fixed backwards compatibility issues with log-aggregation feature that were caused when adding log-upload-time via YARN-2703. Contributed by Xuan Gong.
(cherry picked from commit 58e9f24e0f)
2014-11-03 13:19:34 -08:00
Vinod Kumar Vavilapalli 6627f67bf5 YARN-2790. Fixed a NodeManager bug that was causing log-aggregation to fail beyond HDFS delegation-token expiry even when RM is a proxy-user (YARN-2704). Contributed by Jian He.
(cherry picked from commit 5c0381c96a)
2014-11-01 16:33:35 -07:00
Zhijie Shen d9ac25454c YARN-2711. Fixed TestDefaultContainerExecutor#testContainerLaunchError failure on Windows. Contributed by Varun Vasudev.
(cherry picked from commit 1cd088fd9d)
2014-10-31 17:45:05 -07:00
Jason Lowe 3e8544c5f2 YARN-2755. NM fails to clean up usercache_DEL_<timestamp> dirs after YARN-661. Contributed by Siqi Li
(cherry picked from commit 73e626ad91)
2014-10-30 15:11:57 +00:00
Zhijie Shen f40389ae08 YARN-2741. Made NM web UI serve logs on the drive other than C: on Windows. Contributed by Craig Welch.
(cherry picked from commit 8984e9b177)
2014-10-28 14:12:09 -07:00
Vinod Kumar Vavilapalli 0ad33e1483 YARN-2704. Changed ResourceManager to optionally obtain tokens itself for the sake of localization and log-aggregation for long-running services. Contributed by Jian He.
(cherry picked from commit a16d022ca4)
2014-10-27 15:50:51 -07:00
Zhijie Shen 1b81105143 YARN-2703. Added logUploadedTime into LogValue for better display. Contributed by Xuan Gong.
(cherry picked from commit f81dc3f995)
2014-10-24 14:12:17 -07:00
Jian He 1c235a4448 YARN-2198. Remove the need to run NodeManager as privileged account for Windows Secure Container Executor. Contributed by Remus Rusanu
(cherry picked from commit 3b12fd6cfb)
2014-10-22 15:58:26 -07:00
Jason Lowe 3820bf055e YARN-90. NodeManager should identify failed disks becoming good again. Contributed by Varun Vasudev
(cherry picked from commit 6f2028bd15)
2014-10-21 17:33:34 +00:00
Jian He e9564e729f Missing file for YARN-2701
(cherry picked from commit 4fa1fb3193)
2014-10-20 19:58:21 -07:00
Jian He 3c8ae89050 YARN-2701. Potential race condition in startLocalizer when using LinuxContainerExecutor. Contributed by Xuan Gong
(cherry picked from commit 2839365f23)
2014-10-20 19:54:10 -07:00
Jian He f93d2ea27e YARN-2312. Deprecated old ContainerId#getId API and updated MapReduce to use ContainerId#getContainerId instead. Contributed by Tsuyoshi OZAWA 2014-10-15 15:28:26 -07:00
Karthik Kambatla 88455173e8 YARN-2566. DefaultContainerExecutor should pick a working directory randomly. (Zhihai Xu via kasha)
(cherry picked from commit cc93e7e683)
2014-10-13 16:32:42 -07:00
Zhijie Shen e51ae64761 YARN-2651. Spun off LogRollingInterval from LogAggregationContext. Contributed by Xuan Gong.
(cherry picked from commit 4aed2d8e91)
2014-10-13 10:55:09 -07:00
Zhijie Shen 1e6d81a886 YARN-2583. Modified AggregatedLogDeletionService to be able to delete rolling aggregated logs. Contributed by Xuan Gong.
(cherry picked from commit cb81bac002)
2014-10-10 00:16:34 -07:00
cnauroth b81641a310 YARN-2662. TestCgroupsLCEResourcesHandler leaks file descriptors. Contributed by Chris Nauroth.
(cherry picked from commit d3afd730ac)
2014-10-09 22:47:04 -07:00
Vinod Kumar Vavilapalli 7ed61e150c YARN-2468. Enhanced NodeManager to support log handling APIs (YARN-2569) for use by long running services. Contributed by Xuan Gong.
(cherry picked from commit 34cdcaad71)
2014-10-03 12:17:03 -07:00
Jason Lowe 531c1fd00a YARN-2624. Resource Localization fails on a cluster due to existing cache directories. Contributed by Anubhav Dhoot
(cherry picked from commit 29f520052e)
2014-10-02 17:40:44 +00:00
Jian He 61c7ceaf82 YARN-2617. Fixed NM to not send duplicate container status whose app is not running. Contributed by Jun Gong
(cherry picked from commit 3ef1cf187f)
2014-10-02 10:04:42 -07:00
junping_du 6483342a61 YARN-1979. TestDirectoryCollection fails when the umask is unusual. (Contributed by Vinod Kumar Vavilapalli and Tsuyoshi OZAWA)
(cherry picked from commit c7cee9b455)
2014-10-02 08:04:25 -07:00
Vinod Kumar Vavilapalli 3326fba382 YARN-1972. Added a secure container-executor for Windows. Contributed by Remus Rusanu.
commit ba7f31c2ee is the corresponding trunk commit, this is a slightly different patch for branch-2.
2014-10-01 17:07:21 -07:00
junping_du 625456746c YARN-2613. Support retry in NMClient for rolling-upgrades. (Contributed by Jian He) 2014-10-01 17:08:55 -07:00
Zhijie Shen 4b50e23271 YARN-2630. Prevented previous AM container status from being acquired by the current restarted AM. Contributed by Jian He.
(cherry picked from commit 52bbe0f11b)
2014-10-01 15:39:36 -07:00
Jian He cb08ed1484 YARN-668. Changed NMTokenIdentifier/AMRMTokenIdentifier/ContainerTokenIdentifier to use protobuf object as the payload. Contributed by Junping Du.
(cherry picked from commit 5391919b09)
2014-09-26 17:53:35 -07:00
Zhijie Shen 3a2e400377 YARN-2581. Passed LogAggregationContext to NM via ContainerTokenIdentifier. Contributed by Xuan Gong.
(cherry picked from commit c86674a3a4)
2014-09-24 17:51:54 -07:00
Jian He 3ce97a9efd YARN-1372. Ensure all completed containers are reported to the AMs across RM restart. Contributed by Anubhav Dhoot
(cherry picked from commit 0a641496c7)
2014-09-22 10:32:44 -07:00
Vinod Kumar Vavilapalli 9d34dc87e1 YARN-2531. Added a configuration for admins to be able to override app-configs and enforce/not-enforce strict control of per-container cpu usage. Contributed by Varun Vasudev.
(cherry picked from commit 9f6891d9ef)
2014-09-16 10:15:37 -07:00
Vinod Kumar Vavilapalli a2a61eec6d YARN-2440. Enabled Nodemanagers to limit the aggregate cpu usage across all containers to a preconfigured limit. Contributed by Varun Vasudev.
(cherry picked from commit 4be95175cd)
2014-09-10 19:24:14 -07:00
Jason Lowe 04d325afff YARN-2431. NM restart: cgroup is not removed for reacquired containers. Contributed by Jason Lowe
(cherry picked from commit 3fa5f728c4)
2014-09-04 21:14:20 +00:00
Jason Lowe b61b78e5c6 YARN-2462. TestNodeManagerResync#testBlockNewContainerRequestsOnStartAndResync should have a test timeout. Contributed by Eric Payne
(cherry picked from commit 9ecda8f4c7e10d825b884e35c994d241b9fc8907)
2014-08-29 20:18:49 +00:00
Allen Wittenauer 5d965f2f3c YARN-2424. LCE should support non-cgroups, non-secure mode (Chris Douglas via aw)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1619424 13f79535-47bb-0310-9956-ffa450edef68
2014-08-21 14:57:53 +00:00
Junping Du e8d20ad77c Merge r1617448 from trunk: YARN-1337. Recover containers upon nodemanager restart. (Contributed by Jason Lowe)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1617450 13f79535-47bb-0310-9956-ffa450edef68
2014-08-12 11:02:38 +00:00
Junping Du fc5bb235f2 Merge r1615550 from trunk: YARN-1354. Recover applications upon nodemanager restart. (Contributed by Jason Lowe)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1615554 13f79535-47bb-0310-9956-ffa450edef68
2014-08-04 13:35:49 +00:00
Zhijie Shen f52092be46 YARN-2347. Consolidated RMStateVersion and NMDBSchemaVersion into Version in yarn-server-common. Contributed by Junping Du.
svn merge --ignore-ancestry -c 1614838 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1614839 13f79535-47bb-0310-9956-ffa450edef68
2014-07-31 09:31:22 +00:00
Devarajulu K 087a2acb8b YARN-1342. Recover container tokens upon nodemanager restart. Contributed by Jason Lowe.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1612997 13f79535-47bb-0310-9956-ffa450edef68
2014-07-24 05:02:46 +00:00
Junping Du f6b932fe48 Merge r1612449 from trunk: YARN-2013. The diagnostics is always the ExitCodeException stack when the container crashes. (Contributed by Tsuyoshi OZAWA)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1612450 13f79535-47bb-0310-9956-ffa450edef68
2014-07-22 03:04:22 +00:00
Jason Darrell Lowe f57b6946d7 svn merge -c 1612285 FIXES: YARN-2045. Data persisted in NM should be versioned. Contributed by Junping Du
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1612289 13f79535-47bb-0310-9956-ffa450edef68
2014-07-21 14:49:38 +00:00
Junping Du f81b04df50 Merge r1611512 from trunk: YARN-1341. Recover NMTokens upon nodemanager restart. (Contributed by Jason Lowe)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1611514 13f79535-47bb-0310-9956-ffa450edef68
2014-07-17 23:38:36 +00:00
Jian He 77a94b73b2 Merge r1608334 from trunk. YARN-1367. Changed NM to not kill containers on NM resync if RM work-preserving restart is enabled. Contributed by Anubhav Dhoot
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1608336 13f79535-47bb-0310-9956-ffa450edef68
2014-07-07 04:40:36 +00:00
Karthik Kambatla 14858cd6f7 YARN-2204. Explicitly enable vmem check in TestContainersMonitor#testContainerKillOnMemoryOverflow. (Anubhav Dhoot via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1607233 13f79535-47bb-0310-9956-ffa450edef68
2014-07-02 02:07:48 +00:00
Vinod Kumar Vavilapalli a2e2c8ad97 YARN-2152. Added missing information into ContainerTokenIdentifier so that NodeManagers can report the same to RM when RM restarts. Contributed by Jian He.
svn merge --ignore-ancestry -c 1605205 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1605206 13f79535-47bb-0310-9956-ffa450edef68
2014-06-24 21:44:00 +00:00
Thomas Graves 1c2052e200 YARN-2072. RM/NM UIs and webservices are missing vcore information. (Nathan Roberts via tgraves)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1605166 13f79535-47bb-0310-9956-ffa450edef68
2014-06-24 19:41:56 +00:00
Junping Du 771e157b66 Merge r1603036 from trunk: YARN-1339. Recover DeletionService state upon nodemanager restart. (Contributed by Jason Lowe)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1603037 13f79535-47bb-0310-9956-ffa450edef68
2014-06-17 01:10:49 +00:00
Bikas Saha dc5ee5ff7c Merge 1601762 from trunk to branch-2 for YARN-2091. Add more values to ContainerExitStatus and pass it from NM to RM and then to app masters (Tsuyoshi OZAWA via bikas)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1601763 13f79535-47bb-0310-9956-ffa450edef68
2014-06-10 20:13:23 +00:00
Vinod Kumar Vavilapalli a73447fa07 YARN-2115. Replaced RegisterNodeManagerRequest's ContainerStatus with a new NMContainerStatus which has more information that is needed for work-preserving RM-restart. Contributed by Jian He.
svn merge --ignore-ancestry -c 1598790 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1598791 13f79535-47bb-0310-9956-ffa450edef68
2014-05-31 00:21:47 +00:00
Junping Du 9f76296358 Merge r1598640 from trunk: YARN-1338. Recover localized resource cache state upon nodemanager restart (Contributed by Jason Lowe)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1598652 13f79535-47bb-0310-9956-ffa450edef68
2014-05-30 16:09:36 +00:00
Junping Du ffb0d24fef Merge r1594421 from trunk: YARN-1362. Distinguish between nodemanager shutdown for decommission vs shutdown for restart. (Contributed by Jason Lowe)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1594422 13f79535-47bb-0310-9956-ffa450edef68
2014-05-14 00:25:08 +00:00
Junping Du 4b27c6882a Merge r1593660 from trunk: YARN-766. TestNodeManagerShutdown in branch-2 should use Shell to form the output path and a format issue in trunk. (Contributed by Siddharth Seth)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1593661 13f79535-47bb-0310-9956-ffa450edef68
2014-05-10 03:47:43 +00:00
Ivan Mitic 2fb649a668 YARN-1865 Merging change r1588693 from trunk.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1588696 13f79535-47bb-0310-9956-ffa450edef68
2014-04-19 19:00:42 +00:00
Junping Du 8c23c3295a Merge r1588343 from trunk: YARN-1750. TestNodeStatusUpdater#testNMRegistration is incorrect in test case. (Wangda Tan via junping_du)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1588347 13f79535-47bb-0310-9956-ffa450edef68
2014-04-17 19:13:17 +00:00
Vinod Kumar Vavilapalli 2595a27092 YARN-1933. Fixed test issues with TestAMRestart and TestNodeHealthService. Contributed by Jian He.
svn merge --ignore-ancestry -c 1587104 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1587105 13f79535-47bb-0310-9956-ffa450edef68
2014-04-13 21:52:09 +00:00
Jian He 9df6ddd282 Merge r1586522 from trunk. YARN-1903. Set exit code and diagnostics when container is killed at NEW/LOCALIZING state. Contributed by Zhijie Shen
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1586523 13f79535-47bb-0310-9956-ffa450edef68
2014-04-11 01:28:22 +00:00
Karthik Kambatla a2cdf208dd YARN-1757. NM Recovery. Auxiliary service support. (Jason Lowe via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1585784 13f79535-47bb-0310-9956-ffa450edef68
2014-04-08 17:17:59 +00:00
Vinod Kumar Vavilapalli 08a194fb55 YARN-1775. Enhanced ProcfsBasedProcessTree to optionally add the ability to use smaps for obtaining used memory information. Contributed by Rajesh Balamohan.
svn merge --ignore-ancestry -c 1580087 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1580088 13f79535-47bb-0310-9956-ffa450edef68
2014-03-22 00:02:26 +00:00
Jian He 69835b9651 Merge r1578614 from trunk. Fixed AM container log to show on NM web page after application finishes if log-aggregation is disabled. Contributed by Rohith Sharmaks
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1578618 13f79535-47bb-0310-9956-ffa450edef68
2014-03-17 21:52:41 +00:00
Jonathan Turner Eagles 8e38068076 YARN-1136. Replace junit.framework.Assert with org.junit.Assert (Chen He via jeagles)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1578546 13f79535-47bb-0310-9956-ffa450edef68
2014-03-17 20:20:22 +00:00
Vinod Kumar Vavilapalli d470c7b71a YARN-1824. Improved NodeManager and clients to be able to handle cross platform application submissions. Contributed by Jian He.
MAPREDUCE-4052. Improved MapReduce clients to use NodeManagers' ability to handle cross platform application submissions. Contributed by Jian He.
svn merge --ignore-ancestry -c 1578135 ../../trunk/ with a couple of minor edits for working in branch-2.


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1578139 13f79535-47bb-0310-9956-ffa450edef68
2014-03-16 19:13:16 +00:00
Vinod Kumar Vavilapalli d5120ccc6b YARN-1800. Fixed NodeManager to gracefully handle RejectedExecutionException in the public-localizer thread-pool. Contributed by Varun Vasudev.
svn merge --ignore-ancestry -c 1576545 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1576546 13f79535-47bb-0310-9956-ffa450edef68
2014-03-11 23:34:20 +00:00
Vinod Kumar Vavilapalli 2fbec50fed YARN-1781. Modified NodeManagers to allow admins to specify max disk utilization for local disks so as to be able to offline full disks. Contributed by Varun Vasudev.
svn merge --ignore-ancestry -c 1575463 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1575464 13f79535-47bb-0310-9956-ffa450edef68
2014-03-08 00:52:32 +00:00
Vinod Kumar Vavilapalli 78f1a475c8 YARN-1783. Fixed a bug in NodeManager's status-updater that was losing completed container statuses when NodeManager is forced to resync by the ResourceManager. Contributed by Jian He.
svn merge --ignore-ancestry -c 1575437 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1575438 13f79535-47bb-0310-9956-ffa450edef68
2014-03-07 22:37:12 +00:00
Vinod Kumar Vavilapalli ae456f408a YARN-1686. Fixed NodeManager to properly handle any errors during re-registration after a RESYNC and thus avoid hanging. Contributed by Rohith Sharma.
svn merge --ignore-ancestry -c 1571474 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1571475 13f79535-47bb-0310-9956-ffa450edef68
2014-02-24 22:42:00 +00:00
Sanford Ryza 5bc592d88d YARN-1697. NodeManager reports negative running containers (Sandy Ryza)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1567381 13f79535-47bb-0310-9956-ffa450edef68
2014-02-11 20:50:22 +00:00
Karthik Kambatla 49389403d6 YARN-1672. YarnConfiguration is missing a default for yarn.nodemanager.log.retain-seconds (Naren Koneru via kasha)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1565867 13f79535-47bb-0310-9956-ffa450edef68
2014-02-08 01:56:10 +00:00
Jian He 5c47b8d78a Merge r1556318 from trunk. YARN-1293. Fixed TestContainerLaunch#testInvalidEnvSyntaxDiagnostics failure caused by non-English system locale. Contributed by Tsuyoshi OZAWA.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1556319 13f79535-47bb-0310-9956-ffa450edef68
2014-01-07 19:03:50 +00:00
Jason Darrell Lowe b8f59ebeaa svn merge -c 1556282 FIXES: YARN-1409. NonAggregatingLogHandler can throw RejectedExecutionException. Contributed by Tsuyoshi OZAWA
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1556284 13f79535-47bb-0310-9956-ffa450edef68
2014-01-07 17:23:50 +00:00
Bikas Saha eb97badbcd Merge r1543973 from trunk to branch-2 for YARN-1053. Diagnostic message from ContainerExitEvent is ignored in ContainerImpl (Omkar Vinit Joshi via bikas)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1543974 13f79535-47bb-0310-9956-ffa450edef68
2013-11-20 22:29:16 +00:00
Sanford Ryza cfcb4a716f YARN-1401. With zero sleep-delay-before-sigkill.ms, no signal is ever sent (Gera Shegalov via Sandy Ryza)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1542047 13f79535-47bb-0310-9956-ffa450edef68
2013-11-14 19:59:05 +00:00
Jonathan Turner Eagles 1884c3a0b9 YARN-1386. NodeManager mistakenly loses resources and relocalizes them (Jason Lowe via jeagles)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1541376 13f79535-47bb-0310-9956-ffa450edef68
2013-11-13 03:26:20 +00:00
Chris Nauroth f6bb3ce621 YARN-1357. Merging change r1537293 from trunk to branch-2.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1537296 13f79535-47bb-0310-9956-ffa450edef68
2013-10-30 20:53:16 +00:00
Sanford Ryza 86bacb6b43 Add missing file TestCgroupsLCEResourcesHandler for YARN-1284.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1530495 13f79535-47bb-0310-9956-ffa450edef68
2013-10-09 05:12:26 +00:00
Vinod Kumar Vavilapalli 763521305d YARN-1278. Fixed NodeManager to not delete local resources for apps on resync command from RM - a bug caused by YARN-1149. Contributed by Hitesh Shah.
svn merge --ignore-ancestry -c 1529657 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1529658 13f79535-47bb-0310-9956-ffa450edef68
2013-10-06 18:33:37 +00:00
Vinod Kumar Vavilapalli ea3c2d28f7 YARN-1254. Fixed NodeManager to not pollute container's credentials. Contributed by Omkar Vinit Joshi.
svn merge --ignore-ancestry -c 1529382 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1529383 13f79535-47bb-0310-9956-ffa450edef68
2013-10-05 04:26:37 +00:00
Alejandro Abdelnur 4de1616b82 YARN-1253. Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode. (rvs via tucu)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1529326 13f79535-47bb-0310-9956-ffa450edef68
2013-10-04 22:00:01 +00:00
Bikas Saha 7b2cd4b411 Merge r1529039 and r1529048 from trunk to branch-2 for YARN-1256. NM silently ignores non-existent service in StartContainerRequest (Xuan Gong via bikas)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1529049 13f79535-47bb-0310-9956-ffa450edef68
2013-10-04 01:04:26 +00:00
Hitesh Shah 75aeb6070a Merge 1529043 from trunk to branch-2 for YARN-1149. NM throws InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING. Contributed by Xuan Gong.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1529044 13f79535-47bb-0310-9956-ffa450edef68
2013-10-04 00:46:18 +00:00
Vinod Kumar Vavilapalli 80dd306450 YARN-1070. Fixed race conditions in NodeManager during container-kill. Contributed by Zhijie Shen.
svn merge --ignore-ancestry -c 1527827 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1527828 13f79535-47bb-0310-9956-ffa450edef68
2013-10-01 00:18:42 +00:00
Jonathan Turner Eagles e33f5dc85d YARN-819. ResourceManager and NodeManager should check for a minimum allowed version (Robert Parker via jeagles)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1526665 13f79535-47bb-0310-9956-ffa450edef68
2013-09-26 20:06:47 +00:00
Siddharth Seth e23d5672a7 merge YARN-1229 from trunk. Define constraints on Auxiliary Service names. Change ShuffleHandler service name from mapreduce.shuffle to mapreduce_shuffle. Contributed by Xuan Gong.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1526066 13f79535-47bb-0310-9956-ffa450edef68
2013-09-25 00:37:14 +00:00
Jason Darrell Lowe d5b84644f0 svn merge -c 1523158 FIXES: YARN-1189. NMTokenSecretManagerInNM is not being told when applications have finished. Contributed by Omkar Vinit Joshi
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1523159 13f79535-47bb-0310-9956-ffa450edef68
2013-09-14 00:23:20 +00:00
Jason Darrell Lowe 039c10e0d9 svn merge -c 1522968 FIXES: YARN-1194. TestContainerLogsPage fails with native builds. Contributed by Roman Shaposhnik
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1522971 13f79535-47bb-0310-9956-ffa450edef68
2013-09-13 15:20:38 +00:00
Chris Nauroth 399df2cb08 YARN-1078. Merging change r1522644 from trunk to branch-2.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1522645 13f79535-47bb-0310-9956-ffa450edef68
2013-09-12 16:04:41 +00:00
Vinod Kumar Vavilapalli e5ee67cbea YARN-910. Augmented auxiliary services to listen for container starts and completions in addition to application events. Contributed by Alejandro Abdelnur.
svn merge --ignore-ancestry -c 1521298 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1521299 13f79535-47bb-0310-9956-ffa450edef68
2013-09-09 21:49:00 +00:00
Bikas Saha a84e9321c4 Merge 1520135 from trunk to branch-2 for YARN-1065. NM should provide AuxillaryService data to the container (Xuan Gong via bikas)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1520137 13f79535-47bb-0310-9956-ffa450edef68
2013-09-04 20:51:35 +00:00
Vinod Kumar Vavilapalli 57220dd1d0 YARN-1077. Fixed TestContainerLaunch test failure on Windows. Contributed by Chuan Liu.
svn merge --ignore-ancestry -c 1519333 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1519334 13f79535-47bb-0310-9956-ffa450edef68
2013-09-02 03:11:06 +00:00
Vinod Kumar Vavilapalli 0246b60e93 YARN-649. Added a new NM web-service to serve container logs in plain text over HTTP. Contributed by Sandy Ryza.
svn merge --ignore-ancestry -c 1519326 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1519327 13f79535-47bb-0310-9956-ffa450edef68
2013-09-02 00:09:53 +00:00
Vinod Kumar Vavilapalli 0fb55f1aec YARN-602. Fixed NodeManager to not let users override some mandatory environmental variables. Contributed by Kenji Kikushima.
svn merge --ignore-ancestry -c 1518077 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1518078 13f79535-47bb-0310-9956-ffa450edef68
2013-08-28 05:14:15 +00:00
Arun Murthy 77a60701c1 Merge -c 1514135 from trunk to branch-2 to fix YARN-1056. Remove dual use of string 'resourcemanager' in yarn.resourcemanager.connect.{max.wait.secs|retry_interval.secs}. Contributed by Karthik Kambatla.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1514136 13f79535-47bb-0310-9956-ffa450edef68
2013-08-15 02:36:58 +00:00
Vinod Kumar Vavilapalli 9b05d132bf YARN-906. Fixed a bug in NodeManager where cancelling ContainerLaunch at KILLING state causes the container to hang. Contributed by Zhijie Shen.
svn merge --ignore-ancestry -c 1509924 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1509925 13f79535-47bb-0310-9956-ffa450edef68
2013-08-03 00:49:32 +00:00
Vinod Kumar Vavilapalli a416371e7d YARN-903. Changed ContainerManager to suppress unnecessary warnings when stopping already stopped containers. Contributed by Omkar Vinit Joshi.
svn merge --ignore-ancestry -c 1509560 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1509561 13f79535-47bb-0310-9956-ffa450edef68
2013-08-02 06:55:21 +00:00
Vinod Kumar Vavilapalli 4120e3b4ef YARN-966. Fixed ContainerLaunch to not fail quietly when there are no localized resources due to some other failure. Contributed by Zhijie Shen.
svn merge --ignore-ancestry -c 1508688 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1508689 13f79535-47bb-0310-9956-ffa450edef68
2013-07-31 00:00:28 +00:00
Vinod Kumar Vavilapalli 22e0a210b6 Reverting YARN-245 to fix a critical bug.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1508278 13f79535-47bb-0310-9956-ffa450edef68
2013-07-30 03:07:26 +00:00
Sanford Ryza 7642501864 YARN-932. TestResourceLocalizationService.testLocalizationInit can fail on JDK7. (Karthik Kambatla via Sandy Ryza)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1508211 13f79535-47bb-0310-9956-ffa450edef68
2013-07-29 22:15:32 +00:00
Vinod Kumar Vavilapalli 9f37464e71 YARN-245. Fixed NodeManager to handle duplicate responses from ResourceManager. Contributed by Mayank Bansal.
svn merge --ignore-ancestry -c 1508157 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1508159 13f79535-47bb-0310-9956-ffa450edef68
2013-07-29 18:15:36 +00:00
Tsz-wo Sze 4c045ebafd svn merge -c 1380921 from trunk for YARN-84. Use Builder to build RPC server.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1507582 13f79535-47bb-0310-9956-ffa450edef68
2013-07-27 06:47:40 +00:00
Vinod Kumar Vavilapalli f6663a1198 YARN-688. Fixed NodeManager to properly cleanup containers when it is shut down. Contributed by Jian He.
svn merge --ignore-ancestry -c 1506814 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1506815 13f79535-47bb-0310-9956-ffa450edef68
2013-07-25 04:15:16 +00:00
Vinod Kumar Vavilapalli f487f4eb19 YARN-926. Modified ContainerManagerProtocol APIs to take in requests for multiple containers. Contributed by Jian He.
MAPREDUCE-5412. Update MR app to use multiple containers API of ContainerManager after YARN-926. Contributed by Jian He.
svn merge --ignore-ancestry -c 1506391 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1506392 13f79535-47bb-0310-9956-ffa450edef68
2013-07-24 03:42:52 +00:00
Vinod Kumar Vavilapalli 384c86f6e1 YARN-814. Improving diagnostics when containers fail during launch due to various reasons like invalid env etc. Contributed by Jian He.
svn merge --ignore-ancestry -c 1504732 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1504733 13f79535-47bb-0310-9956-ffa450edef68
2013-07-19 00:28:53 +00:00
Vinod Kumar Vavilapalli 9b3b4fe8bb YARN-912. Move client facing exceptions to yarn-api module. Contributed by Mayank Bansal.
svn merge --ignore-ancestry -c 1504032 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1504033 13f79535-47bb-0310-9956-ffa450edef68
2013-07-17 07:32:23 +00:00
Vinod Kumar Vavilapalli a6865504d7 YARN-62. Modified NodeManagers to avoid AMs from abusing container tokens for repetitive container launches. Contributed by Omkar Vinit Joshi.
svn merge --ignore-ancestry -c 1503986 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1503987 13f79535-47bb-0310-9956-ffa450edef68
2013-07-17 04:25:17 +00:00
Vinod Kumar Vavilapalli 0623ee954a YARN-820. Fixed an invalid state transition in NodeManager caused by failing resource localization. Contributed by Mayank Bansal.
svn merge --ignore-ancestry -c 1503947 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1503948 13f79535-47bb-0310-9956-ffa450edef68
2013-07-16 23:45:29 +00:00
Vinod Kumar Vavilapalli 14adc33eb4 YARN-661. Fixed NM to cleanup users' local directories correctly when starting up. Contributed by Omkar Vinit Joshi.
svn merge --ignore-ancestry -c 1503942 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1503943 13f79535-47bb-0310-9956-ffa450edef68
2013-07-16 23:31:29 +00:00
Bikas Saha 3990e8b478 Merge r1503933 from trunk to branch-2 for YARN-513. Create common proxy client for communicating with RM (Xuan Gong & Jian He via bikas)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1503935 13f79535-47bb-0310-9956-ffa450edef68
2013-07-16 22:54:55 +00:00
Vinod Kumar Vavilapalli f4f14c079e YARN-523. Modified a test-case to validate container diagnostics on localization failures. Contributed by Jian He.
svn merge --ignore-ancestry -c 1503532 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1503533 13f79535-47bb-0310-9956-ffa450edef68
2013-07-16 01:04:10 +00:00
Chris Nauroth 4f7b540d14 YARN-909. Merging change r1503357 from trunk to branch-2.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1503362 13f79535-47bb-0310-9956-ffa450edef68
2013-07-15 17:36:16 +00:00
Chris Nauroth e9fff0bbba YARN-894. Merging change r1501016 from trunk to branch-2.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1501020 13f79535-47bb-0310-9956-ffa450edef68
2013-07-08 23:44:30 +00:00
Hitesh Shah bf433c06c7 Merge r1495160 from trunk to branch-2 for YARN-861. TestContainerManager is failing. Contributed by Vinod Kumar Vavilapalli.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1495162 13f79535-47bb-0310-9956-ffa450edef68
2013-06-20 20:22:02 +00:00
Siddharth Seth 5de952dca0 merge YARN-848 from trunk. Fix NodeManager to register with RM using the fully qualified hostname. Contributed by Hitesh Shah.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1494389 13f79535-47bb-0310-9956-ffa450edef68
2013-06-18 23:50:32 +00:00
Vinod Kumar Vavilapalli 368c7ae735 YARN-694. Starting to use NMTokens to authenticate all communication with NodeManagers. Contributed by Omkar Vinit Joshi.
svn merge --ignore-ancestry -c 1494369 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1494370 13f79535-47bb-0310-9956-ffa450edef68
2013-06-18 23:20:34 +00:00
Vinod Kumar Vavilapalli af90aa5265 YARN-841. Move Auxiliary service to yarn-api, annotate and document it. Contributed by Vinod Kumar Vavilapalli.
svn merge --ignore-ancestry -c 1494031 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1494032 13f79535-47bb-0310-9956-ffa450edef68
2013-06-18 06:23:54 +00:00
Chris Nauroth b2f8aca282 YARN-839. TestContainerLaunch.testContainerEnvVariables fails on Windows. Contributed by Chuan Liu.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1493943 13f79535-47bb-0310-9956-ffa450edef68
2013-06-17 21:19:57 +00:00