543 Commits

Author SHA1 Message Date
Jason Lowe
74503ed6b4 YARN-5859. TestResourceLocalizationService#testParallelDownloadAttemptsForPublicResource sometimes fails. Contributed by Eric Badger
(cherry picked from commit 009452bb6dbe5dffb0b304d67a2f360fe0eee1e2)
2016-11-21 16:40:00 +00:00
Jason Lowe
cb0fccad19 YARN-5836. Malicious AM can kill containers of other apps running in any node its containers are running. Contributed by Botong Huang
(cherry picked from commit 59bfcbf3579e45ddf96db3aafccf669c8e03648f)

Conflicts:

	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
2016-11-16 22:43:56 +00:00
Naganarasimha
5c13bb5008 YARN-4355. NPE while processing localizer heartbeat. Contributed by Varun Saxena & Jonathan Hung.
(cherry picked from commit 7ffb9943b8838a3bb56684e0722db40d800743a2)
2016-11-15 15:52:05 +05:30
Naganarasimha
59a3d9a576 Reverting it because of YARN-5287, blocker issue has been reported.
Revert "YARN-5287. LinuxContainerExecutor fails to set proper permission. Contributed by Ying Zhang."

This reverts commit 16c8fd9dca75e8d4f36e63ef0940aed83f8ebecf.
2016-11-13 17:43:01 +05:30
Jason Lowe
4a70bd86da YARN-5001. Aggregated Logs root directory is created with wrong group if nonexistent. Contributed by Haibo Chen
(cherry picked from commit 76893a41003d57d94eb1a5f486010815266046af)
2016-11-01 20:24:44 +00:00
Jason Lowe
9cc7ab4e96 YARN-5767. Fix the order that resources are cleaned up from the local Public/Private caches. Contributed by Chris Trezzo
(cherry picked from commit 1b79c417dca17bcd2e031864bc6ca07254c61b47)

Conflicts:

	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
2016-10-28 16:11:59 +00:00
Jason Lowe
83bb428e6c YARN-5027. NM should clean up app log dirs after NM restart. Contributed by sandflee
(cherry picked from commit 7146359bfd436a76585fb1f3ea93716795308cec)
2016-10-28 15:50:22 +00:00
Jason Lowe
558e53b10b HADOOP-13770. Shell.checkIsBashSupported swallowed an interrupted exception. Contributed by Wei-Chiu Chuang
(cherry picked from commit c017171da00a6cd71a2901c84a0298ce14a49e23)
2016-10-28 15:06:57 +00:00
Jason Lowe
d21dc8aa6c YARN-4831. Recovered containers will be killed after NM stateful restart. Contributed by Siqi Li
(cherry picked from commit 7e3c327d316b33d6a09bfd4e65e7e5384943bb1d)
2016-10-27 20:45:18 +00:00
Varun Vasudev
d1841b4750 YARN-5704. Provide config knobs to control enabling/disabling new/work in progress features in container-executor. Contributed by Sidharta Seethana.
2016-10-09 13:22:01 +05:30
Varun Vasudev
50409404bf YARN-4245. Generalize config file handling in container-executor. Contributed by Sidharta Seethana.
(cherry picked from commit 8ed2e060e80c0def3fcb7604e0bd27c1c24d291e)
(cherry picked from commit 78919f8c341ec645cf9134991e3ae89a929b9184)
2016-10-09 11:53:03 +05:30
Jason Lowe
515727f46f YARN-4543. Fix random test failure in TestNodeStatusUpdater.testStopReentrant. (Akihiro Suda via rohithsharmaks)
(cherry picked from commit ac686668031ee9837deed3f3566f09f33c437870)
2016-10-03 16:06:17 +00:00
Jason Lowe
1ac8206a85 YARN-5630. NM fails to start after downgrade from 2.8 to 2.7. Contributed by Jason Lowe
(cherry picked from commit e7933097354a246b080b46f1a4ca2ef0f39f3b38)
2016-09-13 14:43:55 +00:00
Arun Suresh
979b29a03c YARN-5221. Expose UpdateResourceRequest API to allow AM to request for change in container properties. (asuresh)
(cherry picked from commit d6d9cff21b7b6141ed88359652cf22e8973c0661)
(cherry picked from commit b279f42d79175bef6529dc1ac4216198a3aaee4d)
2016-08-31 20:06:49 -07:00
Junping Du
1a38dd9cee YARN-4916. TestNMProxy.tesNMProxyRPCRetry fails. Contributed by Tibor Kiss.
(cherry picked from commit 00058167431475c6e63c80207424f1d365569e3a)
2016-08-15 19:18:36 +00:00
Naganarasimha
16c8fd9dca YARN-5287. LinuxContainerExecutor fails to set proper permission. Contributed by Ying Zhang.
2016-08-09 17:34:48 +05:30
Jason Lowe
5883718eea YARN-4717. TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails Intermittently due to IllegalArgumentException from cleanup (templedf via rkanter)
(cherry picked from commit 22ca176dfe125a4f7bf38cc63ab8106c40a7a7ba)
2016-08-03 19:33:58 +00:00
Jason Lowe
c972498b74 YARN-5462. TestNodeStatusUpdater.testNodeStatusUpdaterRetryAndNMShutdown fails intermittently. Contributed by Eric Badger
(cherry picked from commit db646540f094077941b56ed681a4f3e5853f5b7f)
2016-08-03 19:19:47 +00:00
Jason Lowe
9d08ca1ed7 YARN-4393. Fix intermittent test failure for TestResourceLocalizationService#testFailedDirsResourceRelease (Varun Saxana via rohithsharmaks)
(cherry picked from commit 791c1639ae0b351e0bf0b2ecec854dc72ab07935)

Conflicts:

	hadoop-yarn-project/CHANGES.txt
2016-07-12 19:25:07 +00:00
Jian He
23eb3c7ceb YARN-5270. Solve miscellaneous issues caused by YARN-4844. Contributed by Wangda Tan
2016-07-11 22:38:35 -07:00
Vinod Kumar Vavilapalli
4ea87cb38c YARN-5214. Fixed locking in DirectoryCollection to avoid hanging NMs when various code-paths hit slow disks. Contributed by Junping Du.
(cherry picked from commit ce9c006430d13a28bc1ca57c5c70cc1b7cba1692)
2016-07-05 17:12:37 -07:00
Junping Du
810470508b YARN-5237. Fix missing log files issue in rolling log aggregation. Contributed by Xuan Gong.
2016-06-16 07:18:36 -07:00
Wangda Tan
d838c6443d YARN-1942. Deprecate toString/fromString methods from ConverterUtils and move them to records classes like ContainerId/ApplicationId, etc. (wangda)
2016-06-14 15:21:41 -07:00
Junping Du
2be48e7d15 YARN-5199. Close LogReader in NMWebServices#getLogs. Contributed by Xuan Gong.
2016-06-09 12:29:25 -07:00
Xuan
11b4d1e486 Revert "YARN-4920. ATS/NM should support a link to dowload/get the logs in text format. Contributed by Xuan Gong."
This reverts commit 22ac37615a933f9cee8cf19ad0182586a037b690.
2016-06-08 11:23:12 -07:00
Wangda Tan
19e578870d YARN-4844. Add getMemorySize/getVirtualCoresSize to o.a.h.y.api.records.Resource. (wangda)
2016-06-07 12:41:50 -07:00
Ming Ma
ec4f9a14f9 MAPREDUCE-5044. Have AM trigger jstack on task attempts that timeout before killing them. (Eric Payne and Gera Shegalov via mingma)
(cherry picked from commit 4a1cedc010d3fa1d8ef3f2773ca12acadfee5ba5)
(cherry picked from commit 74e2b5efa26f27027fed212b4b2108f0e95587fb)
2016-06-06 14:49:43 -07:00
Jian He
3c2bd19fa5 YARN-5190. Registering/unregistering container metrics in ContainerMonitorImpl and ContainerImpl causing uncaught exception in ContainerMonitorImpl. Contributed by Junping Du
(cherry picked from commit 99cc439e29794f8e61bebe03b2a7ca4b6743ec92)
2016-06-03 11:11:49 -07:00
Wangda Tan
2f3e1d965d Revert "YARn-4844. Add getMemoryLong/getVirtualCoreLong to o.a.h.y.api.records.Resource. Contributed by Wangda Tan."
This reverts commit 457884737f75c796413ce860b1859a31cc5292ca.
2016-05-31 22:16:53 -07:00
Varun Vasudev
457884737f YARn-4844. Add getMemoryLong/getVirtualCoreLong to o.a.h.y.api.records.Resource. Contributed by Wangda Tan.
2016-05-29 20:57:56 +05:30
Carlo Curino
7982933c09 YARN-4957. Add getNewReservation in ApplicationClientProtocol (Sean Po via curino)
(cherry picked from commit 013532a95e63d7c53e601be530021d6d5a15ab7f)
(cherry picked from commit c656977961e2ba0f9dfd349ed59bf1d0d41c57f5)
2016-05-25 17:02:22 -07:00
Jason Lowe
fe10caee8d YARN-4459. container-executor should only kill process groups. Contributed by Jun Gong
(cherry picked from commit 1ba31fe9e906dbd093afd4b254216601967a4a7b)

Conflicts:

	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c
2016-05-25 21:37:31 +00:00
Varun Vasudev
35456bb7c9 YARN-857. Localization failures should be available in container diagnostics. Contributed by Vinod Kumar Vavilapalli.
(cherry picked from commit f440a9d8c4a177bc5062d21d4b4bc4d9b2944344)
(cherry picked from commit 36f2ae0692d73a865a5c0c520d1346b6d4498c25)
2016-05-25 19:03:33 +05:30
Jason Lowe
8e404b4321 YARN-5103. With NM recovery enabled, restarting NM multiple times results in AM restart. Contributed by Junping Du
(cherry picked from commit d1df0266cf4e9ff0ec70813c156556ca4e74f791)
2016-05-23 15:17:26 +00:00
Rohith Sharma K S
726c1f14b8 YARN-3840. Resource Manager web ui issue when sorting application by id (with application having id > 9999). Contributed by Mohammad Shahid Khan and Varun Saxena
2016-05-19 10:50:32 +05:30
Jason Lowe
70faa87ccf YARN-4325. Nodemanager log handlers fail to send finished/failed events in some cases. Contributed by Junping Du
(cherry picked from commit 81effb7dcde2b31423438d6f1b8b8204d4ca05b3)
2016-05-16 15:43:42 +00:00
Wangda Tan
3620d0e623 YARN-4984. LogAggregationService shouldn't swallow exception in handling createAppDir() which cause thread leak. (Junping Du via wangda)
(cherry picked from commit 7bd418e48c71590fc8026d69f9b8f8ad42f2aade)
2016-05-05 10:08:37 -07:00
Junping Du
22ac37615a YARN-4920. ATS/NM should support a link to dowload/get the logs in text format. Contributed by Xuan Gong.
(cherry picked from commit 3cf223166d452a0f58f92676837a9edb8ddc1139)
(cherry picked from commit c79dc07dc193904f2586a5d64ea2f4e56d2396b8)
2016-05-04 09:49:08 -07:00
Robert Kanter
3c3d003402 Remove parent's env vars from child processes
(cherry picked from commit ac8fb579c6058fec60caf30682f902413d68edf3)
2016-04-29 15:30:06 -07:00
Jian He
5ba79d77fb YARN-5009. NMLeveldbStateStoreService database can grow substantially leading to longer recovery times. Contributed by Jason Lowe
(cherry picked from commit 4a8508501bc753858693dacdafba61d604702f71)
2016-04-28 21:54:53 -07:00
Jason Lowe
9b5c5bd42f YARN-4924. NM recovery race can lead to container not cleaned up. Contributed by sandflee
(cherry picked from commit 3150ae8108a1fc40a67926be6254824c1e37cb38)

Conflicts:

	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMLeveldbStateStoreService.java
2016-04-14 19:40:10 +00:00
Vinod Kumar Vavilapalli
3589b9e10e YARN-4168. Fixed a failing test TestLogAggregationService.testLocalFileDeletionOnDiskFull. Contributed by Takashi Ohnishi.
(cherry picked from commit 44bbc50d919388e4ad08be2e9ba80ac7502d2579)
2016-04-11 12:13:37 -07:00
Akira Ajisaka
13be7a849d YARN-4630. Remove useless boxing/unboxing code. Contributed by Kousuke Saruta.
(cherry picked from commit 1ff27f9d12e8124c1b9a722708264c5b07fd0fde)
(cherry picked from commit 8cf6630fc6cedbd86eff9da6f35ce1da4ed7ed2f)
2016-04-11 14:55:54 +09:00
Karthik Kambatla
ddb1407980 YARN-4756. Unnecessary wait in Node Status Updater during reboot. (Eric Badger via kasha)
(cherry picked from commit e82f961a3925aadf9e53a009820a48ba9e4f78b6)
(cherry picked from commit 2b97a50eec8e9f7167a44b8ca0391fce0aae571c)
2016-04-07 17:35:06 -07:00
naganarasimha
9bd089ac64 YARN-4746. yarn web services should convert parse failures of appId, appAttemptId and containerId to 400. Contributed by Bibin A Chundatt
(cherry picked from commit 5092c94195a63bd2c3e36d5a74b4c061cea1b847)
2016-04-04 18:16:23 +05:30
Jason Lowe
35f9cfda61 YARN-4773. Log aggregation performs extraneous filesystem operations when rolling log aggregation is disabled. Contributed by Jun Gong
(cherry picked from commit 948b75807068c304ffe789e32f2b850c0d653e0a)
2016-03-28 23:03:16 +00:00
Jian He
7c81e374da YARN-4117. End to end unit test with mini YARN cluster for AMRMProxy Service. Contributed by Giovanni Matteo Fumarola
2016-03-28 13:23:53 -07:00
Eric Payne
878e1cfc77 YARN-4686. MiniYARNCluster.start() returns before cluster is completely started. Contributed by Eric Badger.
(cherry picked from commit 92b7e0d41302b6b110927f99de5c2b4a4a93c5fd)
2016-03-18 17:19:06 +00:00
Jason Lowe
53ec7c9243 YARN-4744. Too many signal to container failure in case of LCE. Contributed by Sidharta Seethana
(cherry picked from commit 059caf99891943d9587cac19b48e82efbed06b2d)
2016-03-07 15:48:06 +00:00
Haohui Mai
69b195d619 HADOOP-12813. Migrate TestRPC and related codes to rebase on ProtobufRpcEngine. Contributed by Kai Zheng.
2016-02-29 14:15:25 -08:00