1449 Commits

Author SHA1 Message Date
Xuan
d59bf81e08 YARN-3999. RM hangs on draining events. Contributed by Jian He
(cherry picked from commit 3ae716fa696b87e849dae40225dc59fb5ed114cb)
(cherry picked from commit 2ebdf5bfcee9ede80681a5266df225885d830883)
2015-09-08 22:57:35 -07:00
Jonathan Eagles
6ed2486c7e YARN-3978. Configurably turn off the saving of container info in Generic AHS (Eric Payne via jeagles)
(cherry picked from commit 3cd02b95224e9d43fd63a4ef9ac5c44f113f710d)
(cherry picked from commit 899df5bce03ea4f94487e48c1d38bd30ae10c26f)
2015-09-08 22:57:34 -07:00
Jian He
92742b4402 YARN-2301. Improved yarn container command. Contributed by Naganarasimha G R
(cherry picked from commit 258623ff8bb1a1057ae3501d4f20982d5a59ea34)

(cherry picked from commit 1d1e7682c9cad6a2f819b390ca3368dfa29c7097)
2015-09-08 22:57:28 -07:00
Jian He
2336264900 YARN-2918. RM should not fail on startup if queue's configured labels do not exist in cluster-node-labels. Contributed by Wangda Tan
(cherry picked from commit f489a4ec969f3727d03c8e85d51af1018fc0b2a1)

(cherry picked from commit d817fbb34d6e34991c6e512c20d71387750a98f4)
2015-09-06 14:15:33 -07:00
Jian He
ee2b6bc248 YARN-3124. Fixed CS LeafQueue/ParentQueue to use QueueCapacities to track capacities-by-label. Contributed by Wangda Tan
(cherry picked from commit 18a594257e052e8f10a03e5594e6cc6901dc56be)

(cherry picked from commit 1be2d64ddddaa6322909073cfaf7f2f2eb46e18d)
2015-09-06 11:54:40 -07:00
Jian He
637e7f9e39 YARN-2694. Ensure only single node label specified in ResourceRequest. Contributed by Wangda Tan
(cherry picked from commit c1957fef29b07fea70938e971b30532a1e131fd0)

(cherry picked from commit 3ddafaa7c854dcf21ecc790c276927e7c869e62c)
2015-09-05 21:07:51 -07:00
Jian He
4c94f07140 YARN-3098. Created common QueueCapacities class in Capacity Scheduler to track capacities-by-labels of queues. Contributed by Wangda Tan
(cherry picked from commit 21d80b3dd90a8e33e51701887c8d9369ed4ab17d)

(cherry picked from commit c0b1311a93614becc4a255af48fb7b697d491b80)
2015-09-05 20:54:20 -07:00
Jian He
d9281fbbab YARN-3099. Capacity Scheduler LeafQueue/ParentQueue should use ResourceUsage to track used-resources-by-label. Contributed by Wangda Tan
(cherry picked from commit 86358221fc85a7743052a0b4c1647353508bf308)

(cherry picked from commit cabf97ae4f2dad53c7b9e3d10a67876b16d94074)
2015-09-05 20:54:20 -07:00
Jian He
b0ad553841 YARN-3092. Created a common ResourceUsage class to track labeled resource usages in Capacity Scheduler. Contributed by Wangda Tan
(cherry picked from commit 6f9fe76918bbc79109653edc6cde85df05148ba3)

(cherry picked from commit 61b4116b4b3c0eec8f514f079debd88bc757b28e)
2015-09-05 20:54:19 -07:00
Jian He
419e18cb37 YARN-2978. Fixed potential NPE while getting queue info. Contributed by Varun Saxena
(cherry picked from commit dd57c2047bfd21910acc38c98153eedf1db75169)

(cherry picked from commit c61e8a7bfa7236e354f859a889083fab3d7ca9eb)
2015-09-05 20:54:19 -07:00
Jian He
88f022da24 YARN-2920. Changed CapacityScheduler to kill containers on nodes where node labels are changed. Contributed by Wangda Tan
(cherry picked from commit fdf042dfffa4d2474e3cac86cfb8fe9ee4648beb)

(cherry picked from commit 411836b74c6c02c0b5aebbbce29c209d93db1de2)
2015-09-05 20:54:18 -07:00
Wangda Tan
2073fc0f84 Add missing test file of YARN-3733
(cherry picked from commit 405bbcf68c32d8fd8a83e46e686eacd14e5a533c)
(cherry picked from commit 344b7509153cdd993218cd5104c7e5c07cd35d3c)
2015-09-03 17:43:03 -07:00
Wangda Tan
85d92721a4 YARN-3733. Fix DominantRC#compare() does not work as expected if cluster resource is empty. (Rohith Sharmaks via wangda)
(cherry picked from commit ebd797c48fe236b404cf3a125ac9d1f7714e291e)
(cherry picked from commit 78d626fa892415023827e35ad549636e2a83275d)
2015-09-03 17:43:01 -07:00
Jian He
f1b35ffd4c YARN-2637. Fixed max-am-resource-percent calculation in CapacityScheduler when activating applications. Contributed by Craig Welch
(cherry picked from commit c53420f58364b11fbda1dace7679d45534533382)

(cherry picked from commit 4931600030e13d9332d9a0e588487cb8684c667d)
2015-09-03 17:40:24 -07:00
Jason Lowe
ca7fe71000 YARN-3990. AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected. Contributed by Bibin A Chundatt
(cherry picked from commit 32e490b6c035487e99df30ce80366446fe09bd6c)

(cherry picked from commit c31e3ba92132f232bd56b257f3854ffe430fbab9)
(cherry picked from commit 07d31d4c0808a169f4770187d655f38aa105255c)
2015-09-03 14:40:20 -07:00
Jason Lowe
fe5877a49e YARN-3850. NM fails to read files from full disks which can lead to container logs being lost and other issues. Contributed by Varun Saxena
(cherry picked from commit 40b256949ad6f6e0dbdd248f2d257b05899f4332)

(cherry picked from commit 0221d19f4e398c386f4ca3990b0893562aa8dacf)
(cherry picked from commit 87d2204f28f192a964c04a5fa1e2e31644d74b59)
2015-09-03 14:35:01 -07:00
Jason Lowe
f21fb808f1 YARN-3832. Resource Localization fails on a cluster due to existing cache directories. Contributed by Brahma Reddy Battula
(cherry picked from commit 8d58512d6e6d9fe93784a9de2af0056bcc316d96)

(cherry picked from commit 15b1800b1289d239cbebc5cfd66cfe156d45a2d3)
(cherry picked from commit 38400507e3352d83c2a1f364de137366249b7983)
2015-09-03 14:26:43 -07:00
Jason Lowe
193d8d3667 YARN-3585. NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled. Contributed by Rohith Sharmaks
(cherry picked from commit e13b671aa510f553f4a6a232b4694b6a4cce88ae)

(cherry picked from commit 752caa95a40d899e1bf98bc907e91aec2bb57073)
(cherry picked from commit 13c4db632b0e7f19dcfa883c2492431c2c7d0799)
2015-09-03 14:09:16 -07:00
Wangda Tan
ae0fac3efa YARN-3725. App submission via REST API is broken in secure mode due to Timeline DT service address is empty. (Zhijie Shen via wangda)
(cherry picked from commit 5cc3fced957a8471733e0e9490878bd68429fe24)
(cherry picked from commit a3734f67d35e714690ecdf21d80bce8a355381e3)
(cherry picked from commit 9ccc22e2ac89990f3e7997f1d89594523c66e76a)
2015-09-03 13:50:14 -07:00
Xuan
1c6a287bf5 YARN-2900. Application (Attempt and Container) Not Found in AHS results
in Internal Server Error (500). Contributed by Zhijie Shen and Mit Desai

(cherry picked from commit 06f8e9cabaf3c05cd7d16215cff47265ea773f39)
(cherry picked from commit 4fee8b320276bac86278e1ae0a3397592a78aa18)
(cherry picked from commit 6c7b625138ce3b262a8c8aa28077074b553638ed)
2015-09-03 13:45:06 -07:00
Zhijie Shen
42ce052585 YARN-3700. Made generic history service load a number of latest applications according to the parameter or the configuration. Contributed by Xuan Gong.
(cherry picked from commit 54504133f41e36eaea6bb06c7b9ddb249468ecd7)
(cherry picked from commit 839f81a6326b2f8b3d5183178382c1551b0bc259)
(cherry picked from commit 058380d9ef35f35e8c624fb8783eac0904c4d1f5)
2015-09-03 12:59:33 -07:00
Zhijie Shen
0f33fcd507 YARN-2766. Made ApplicationHistoryManager return a sorted list of apps, attempts and containers. Contributed by Robert Kanter.
(cherry picked from commit 3648cb57c9f018a3a339c26f5a0ca2779485521a)
(cherry picked from commit 53d6c91df925c58e311ec7cd6a3adb952acc02f4)
2015-09-03 12:58:42 -07:00
Xuan
7b1a71a7ad YARN-3526. ApplicationMaster tracking URL is incorrectly redirected on a QJM cluster. Contributed by Weiwei Yang
(cherry picked from commit b0ad644083a0dfae3a39159ac88b6fc09d846371)
(cherry picked from commit 802676e1be350785d8c0ad35f6676eeb85b2467b)
(cherry picked from commit 2cadeb9e017c6a75db16e1f23b2accda04f12298)
2015-09-03 11:54:23 -07:00
Jason Lowe
778da79e6f YARN-3641. NodeManager: stopRecoveryStore() shouldn't be skipped when exceptions happen in stopping NM's sub-services. Contributed by Junping Du
(cherry picked from commit 711d77cc54a64b2c3db70bdacc6bf2245c896a4b)

(cherry picked from commit a81ad814610936a02e55964fbe08f7b33fe29b23)
(cherry picked from commit aa82b0684554be8d09f6fcd88826f167922280cc)
2015-09-03 11:50:30 -07:00
Karthik Kambatla
6ade6b5051 YARN-3464. Race condition in LocalizerRunner kills localizer before localizing all resources. (Zhihai Xu via kasha)
(cherry picked from commit 47279c3228185548ed09c36579b420225e4894f5)
(cherry picked from commit 4045c41afe440b773d006e962bf8a5eae3fdc284)
(cherry picked from commit 6f2cc0dfa8f21984ecdab59dc087ccf525934930)
2015-09-02 15:03:51 -07:00
Xuan
9af5b1dcd0 YARN-3024. LocalizerRunner should give DIE action when all resources are
localized. Contributed by Chengbing Liu

(cherry picked from commit 0d6bd62102f94c55d59f7a0a86a684e99d746127)
(cherry picked from commit a7696b3fbfacd98a892bbb3678663658c7b9d2bd)
(cherry picked from commit 9e30232004ab7c3c3bfde3b8b27c37fa7065f6be)
2015-09-02 14:52:06 -07:00
Wangda Tan
e081593042 YARN-3487. CapacityScheduler scheduler lock obtained unnecessarily when calling getQueue (Jason Lowe via wangda)
(cherry picked from commit f47a5763acd55cb0b3f16152c7f8df06ec0e09a9)
(cherry picked from commit 3316cd4357ff6ccc4c76584813092adb1c2b4d43)
(cherry picked from commit 24d45ee9544abcfcf9e611ab835ec2f824333670)
2015-09-02 11:28:22 -07:00
Wangda Tan
61f2ddb125 YARN-3493. RM fails to come up with error "Failed to load/recover state" when mem settings are changed. (Jian He via wangda)
(cherry picked from commit f65eeb412d140a3808bcf99344a9f3a965918f70)
(cherry picked from commit e7cbecddc3e7ca5386c71aa4deb67f133611415c)
(cherry picked from commit 9d47d5aa5bffe427c4a77260f7ccc039d446e1fd)
2015-09-02 11:14:35 -07:00
Vinod Kumar Vavilapalli
752e3da738 YARN-3055. Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp.
(cherry picked from commit 9c5911294e0ba71aefe4763731b0e780cde9d0ca)
(cherry picked from commit 1ff3fd33ed6f2ac09c774cc42b0107c5dbd9c19d)
(cherry picked from commit 82c722aae86669325672dd10840447434f15e7fd)
2015-09-01 21:31:00 -07:00
Xuan
914cc8f4a4 YARN-3393. Getting application(s) goes wrong when app finishes before
starting the attempt. Contributed by Zhijie Shen

(cherry picked from commit 9fae455e26e0230107e1c6db58a49a5b6b296cf4)
(cherry picked from commit cbdcdfad6de81e17fb586bc2a53b37da43defd79)
(cherry picked from commit 61aafdcfa589cbae8363976c745ea528b03f152d)
2015-09-01 18:14:51 -07:00
Wangda Tan
005d865494 YARN-3369. Missing NullPointer check in AppSchedulingInfo causes RM to die. (Brahma Reddy Battula via wangda)
(cherry picked from commit 6bc7710ec7f2592c4c87dd940fbe5827ef81fe72)
(cherry picked from commit 8e142d27cbddfa1a1c83c5f8752bd14ac0a13612)
(cherry picked from commit 4d43be3c01b1bc0deb31a9081fca5395d0eb4e0d)
2015-09-01 17:10:42 -07:00
Jonathan Eagles
a9bb641d51 YARN-3267. Timelineserver applies the ACL rules after applying the limit on the number of records (Chang Li via jeagles)
(cherry picked from commit 8180e676abb2bb500a48b3a0c0809d2a807ab235)
(cherry picked from commit 44aedad5ddc8069a6dba3eaf66ed54d612b21208)
(cherry picked from commit f4bbf2c8f97d3601132504453f61e472950a433e)
2015-09-01 16:04:19 -07:00
Zhijie Shen
9005b141a5 YARN-3287. Made TimelineClient put methods do as the correct login context. Contributed by Daryn Sharp and Jonathan Eagles.
(cherry picked from commit d6e05c5ee26feefc17267b7c9db1e2a3dbdef117)
(cherry picked from commit a94d23762e2cf4211fe84661eb67504c7072db49)
(cherry picked from commit 68e07eb50b872ec8a78923df8f5f640f08a72aa2)
2015-09-01 15:24:36 -07:00
Xuan
a57ada6c1f YARN-3227. Timeline renew delegation token fails when RM user's TGT is
expired. Contributed by Zhijie Shen

(cherry picked from commit d1abc5d4fc00bb1b226066684556ba16ace71744)
(cherry picked from commit 56c2050ab7c04e9741bcba9504b71e5a54d09eea)
(cherry picked from commit 780a9b1a98827a692e0ea9fbc92f9d1ab979e3e0)
2015-09-01 15:21:46 -07:00
Jian He
7ffdf7d105 YARN-1809. Synchronize RM and TimeLineServer Web-UIs. Contributed by Zhijie Shen and Xuan Gong
(cherry picked from commit 95bfd087dc89e57a93340604cc8b96042fa1a05a)

(cherry picked from commit a5f3fb4dc14503bf7c454a48cf954fb0d6710de2)
(cherry picked from commit 27a2f0acb84202cc082090eef7eea57f6e42f9bb)
2015-09-01 15:12:53 -07:00
Tsuyoshi Ozawa
81417f7572 YARN-3249. Add a 'kill application' button to Resource Manager's Web UI. Contributed by Ryu Kobayashi.
(cherry picked from commit 1b672096121fef775572b517d4f5721997abbac6)
(cherry picked from commit 6660c2f83b855535217582326746dc76d53fdf61)
(cherry picked from commit 6ea859e435e7cd6bc342f67e1551ccb86fbd976f)
2015-09-01 14:37:21 -07:00
Wangda Tan
8b5bdac98e YARN-3230. Clarify application states on the web UI. (Jian He via wangda)
(cherry picked from commit ce5bf927c3d9f212798de1bf8706e5e9def235a1)
(cherry picked from commit a1963968d2a9589fcefaab0d63feeb68c07f4d06)
(cherry picked from commit 591e261ccf1fb5dd25e87665c8d5c0341ff6fb24)
2015-09-01 14:34:07 -07:00
Karthik Kambatla
5a6755cc0f YARN-3242. Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client. (Zhihai Xu via kasha)
(cherry picked from commit 8d88691d162f87f95c9ed7e0a569ef08e8385d4f)
(cherry picked from commit 0d62e948877e5d50f1b6fbe735a94ac6da5ff472)
(cherry picked from commit 4a5b0e708d42fbff571229a43d1762d1767e2db5)
2015-09-01 14:06:34 -07:00
Karthik Kambatla
dbc5bab9fd YARN-3231. FairScheduler: Changing queueMaxRunningApps interferes with pending jobs. (Siqi Li via kasha)
(cherry picked from commit 22426a1c9f4bd616558089b6862fd34ab42d19a7)
(cherry picked from commit 721d7b574126c4070322f70ec5b49a7b8558a4c7)
(cherry picked from commit 5dfa25f22a989222e8b3d1013117b3350a48b2c5)
2015-09-01 13:54:04 -07:00
Jian He
db92b09e03 YARN-3222. Fixed NPE on RMNodeImpl#ReconnectNodeTransition when a node is reconnected with a different port. Contributed by Rohith Sharmaks
(cherry picked from commit b2f1ec312ee431aef762cfb49cb29cd6f4661e86)

(cherry picked from commit 888a44563819ba910dc3cc10d10ee0fb8f05db61)
(cherry picked from commit b78f87825bd593e30b2f2ea76f37c7a4fd673ab2)
2015-09-01 13:39:35 -07:00
Jason Lowe
a4b8897b30 YARN-3239. WebAppProxy does not support a final tracking url which has query fragments and params. Contributed by Jian He
(cherry picked from commit 1a68fc43464d3948418f453bb2f80df7ce773097)

(cherry picked from commit 257087417e424e628f090b6b648ccb3b9c880250)
(cherry picked from commit 49468108c203bf093acdc93c1798d90c480c3a17)
2015-09-01 13:32:21 -07:00
Xuan
4fcf71c1e7 YARN-3238. Connection timeouts to nodemanagers are retried at multiple
levels. Contributed by Jason Lowe

(cherry picked from commit 92d67ace3248930c0c0335070cc71a480c566a36)
(cherry picked from commit fefeba4ac8bed44ce2dd0d3c4f0a99953ff8d4df)
(cherry picked from commit d8f02e1c5c3bcc230d942554b2f4cfbc3ed21526)
2015-09-01 11:19:37 -07:00
Xuan
95edb6e64f YARN-3207. Secondary filter matches entities which do not have the key
being filtered for. Contributed by Zhijie Shen

(cherry picked from commit 57db50cbe3ce42618ad6d6869ae337d15b261f4e)
(cherry picked from commit ba18adbb27c37a8fa92223a412ce65eaa462d18b)
(cherry picked from commit 9fd18e94849600ec66832df5ae424eeb0116330c)
2015-08-31 17:44:42 -07:00
Zhijie Shen
28160a0bd6 YARN-2246. Made the proxy tracking URL always be http(s)://proxy addr:port/proxy/<appId> to avoid duplicate sections. Contributed by Devaraj K.
(cherry picked from commit d5855c0e46404cfc1b5a63e59015e68ba668f0ea)
(cherry picked from commit fd75b8c9cadd069673afc80a0fc5661d779897bd)
(cherry picked from commit a62891971380e5f8e4a645ed36bd88aa6fe0e47a)
2015-08-31 17:38:51 -07:00
Jian He
a703952d39 YARN-3094. Reset timer for liveness monitors after RM recovery. Contributed by Jun Gong
(cherry picked from commit 0af6a99a3fcfa4b47d3bcba5e5cc5fe7b312a152)

(cherry picked from commit 61466809552f96a83aa19446d4d59cecd0d2cad5)
(cherry picked from commit ab654746fbad2da12b24b13425dc9bf17c46b50c)
2015-08-31 17:17:47 -07:00
Jian He
8658945b3a YARN-3103. AMRMClientImpl does not update AMRM token properly. Contributed by Jason Lowe
(cherry picked from commit 6d2bdbd7dab179dfb4f19bb41809e97f1db88c6b)

(cherry picked from commit 12522fd9cbd8da8c040a5b7bb71fcdaa256daf89)
(cherry picked from commit f50f5ad49d3b70448647384fc5f020214cb58f10)
2015-08-31 15:42:03 -07:00
Jian He
994c3d049a YARN-3011. Possible IllegalArgumentException in ResourceLocalizationService might lead NM to crash. Contributed by Varun Saxena
(cherry picked from commit 4e15fc08411318e11152fcd5a4648ed1d6fbb480)

(cherry picked from commit 8100c8a68c32978a177af9a3e6639f6de533886d)
(cherry picked from commit 10a6c4f349e6f32ed2a520bf669a0cbfff31c824)
2015-08-31 15:38:45 -07:00
Jian He
3f8da2a9eb YARN-2997. Fixed NodeStatusUpdater to not send already-sent completed container statuses on heartbeat. Contributed by Chengbing Liu
(cherry picked from commit cc2a745f7e82c9fa6de03242952347c54c52dccc)

(cherry picked from commit e7e6173049adca2a2ae0e1231adcaca8168bec27)
(cherry picked from commit 3c4ed2497b14140f09b3cae4959be6474c4cdc99)
2015-08-30 20:45:45 -07:00
Tsuyoshi Ozawa
03f9ac2de7 YARN-2922. ConcurrentModificationException in CapacityScheduler's LeafQueue. Contributed by Rohith Sharmaks.
(cherry picked from commit ddc5be48fc35868abf7f59088f747c636e76a42a)
(cherry picked from commit c116743bdda2b1792bf872020a5e2b14d772ac60)
(cherry picked from commit 3c9d26ae14625de3e9437c07eceda0d05f1985b2)
2015-08-30 20:43:14 -07:00
Jian He
e7fc071906 YARN-2992. ZKRMStateStore crashes due to session expiry. Contributed by Karthik Kambatla
(cherry picked from commit 1454efe5d4fe4214ec5ef9142d55dbeca7dab953)

(cherry picked from commit ca0349b87ab1b2d0d2b9dc93de7806d26713165c)
(cherry picked from commit 2f6be218fa41fd0f39633ec5ed0df6e0fa0f54b6)
2015-08-30 20:42:06 -07:00