1925 Commits

Author SHA1 Message Date
Junping Du 9dc3683d87 YARN-5029. RM needs to send update event with YarnApplicationState as Running to ATS/AHS. Contributed by Xuan Gong.
(cherry picked from commit 39f2bac38b)
2016-05-11 09:33:16 -07:00
Naganarasimha 3732a1e985 YARN-4926. Change nodelabel rest API invalid response status to 400. Contributed by Bibin A Chundatt
(cherry picked from commit 2750fb900f)
2016-05-08 23:02:07 +05:30
Yongjun Zhang 5172d0e7b1 YARN-5048. DelegationTokenRenewer#skipTokenRenewal may throw NPE (Jian He via Yongjun Zhang)
(cherry picked from commit 47c41e7ac7)
2016-05-06 22:38:56 -07:00
Jason Lowe 3895058a67 YARN-4747. AHS error 500 due to NPE when container start event is missing. Contributed by Varun Saxena
(cherry picked from commit b2ed6ae731)
2016-05-06 23:00:25 +00:00
Wangda Tan b68e6b1d6d getApplicationReport call may raise NPE for removed queues. (Jian He via wangda)
(cherry picked from commit 23248f63aa)
2016-05-06 15:32:15 -07:00
Jian He a6b24c62ab YARN-4390. Do surgical preemption based on reserved container in CapacityScheduler. Contributed by Wangda Tan
(cherry picked from commit bb62e05925)
2016-05-05 12:56:55 -07:00
Jason Lowe ee86cef2fe YARN-4311. Removing nodes from include and exclude lists will not remove them from decommissioned nodes list. Contributed by Kuhu Shukla
(cherry picked from commit d0da13229c)

Conflicts:

	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java
2016-05-05 14:33:01 +00:00
Varun Vasudev 38a3b86141 YARN-4595. Add support for configurable read-only mounts when launching Docker containers. Contributed by Billie Rinaldi.
(cherry picked from commit 72b047715c)
2016-05-05 13:02:38 +05:30
Wangda Tan 585299146a YARN-4984. LogAggregationService shouldn't swallow exception in handling createAppDir() which cause thread leak. (Junping Du via wangda)
(cherry picked from commit 7bd418e48c)
2016-05-04 11:39:25 -07:00
Junping Du 1ffb0c43d6 YARN-4920. ATS/NM should support a link to download/get the logs in text format. Contributed by Xuan Gong.
(cherry picked from commit 3cf223166d452a0f58f92676837a9edb8ddc1139)
2016-05-04 10:36:31 -07:00
Rohith Sharma K S 5aad4070b2 YARN-4947. Test timeout is happening for TestRMWebServicesNodes. Contributed by Bibin A Chundatt
(cherry picked from commit 75e0450593)
2016-05-04 10:26:25 +05:30
Jason Lowe baac4e7db1 YARN-5003. Add container resource to RM audit log. Contributed by Nathan Roberts
(cherry picked from commit ed54f5f1ff)
2016-05-03 22:16:17 +00:00
Junping Du 47f67ae447 YARN-4851. Metric improvements for ATS v1.5 storage components. Li Lu via junping_du.
(cherry picked from commit 06413da72e)
2016-05-03 04:18:01 -07:00
Robert Kanter ac8fb579c6 Remove parent's env vars from child processes
2016-04-29 09:26:09 -07:00
Varun Vasudev 6561e3b500 YARN-3998. Add support in the NodeManager to re-launch containers. Contributed by Jun Gong.
(cherry picked from commit 0f25a1bb52)

 Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ProtoUtils.java
2016-04-29 16:22:06 +05:30
Jian He 6ba39a1597 YARN-5009. NMLeveldbStateStoreService database can grow substantially leading to longer recovery times. Contributed by Jason Lowe
(cherry picked from commit 4a8508501b)
2016-04-28 21:54:30 -07:00
Jian He a9707dceaf YARN-5008. LeveldbRMStateStore database can grow substantially leading to long recovery times. Contributed by Jason Lowe
2016-04-28 21:28:03 -07:00
Li Lu 50b7a35d56 YARN-4956. findbug issue on LevelDBCacheTimelineStore. (Zhiyuan Yang via gtcarrera9)
(cherry picked from commit f16722d2ef)
2016-04-27 10:58:12 -07:00
Karthik Kambatla 864ecb4434 YARN-4807. MockAM#waitForState sleep duration is too long. (Yufei Gu via kasha)
(cherry picked from commit 185c3d4de1)
2016-04-27 09:43:42 -07:00
Jian He 9d3ddb0b4d YARN-4983. JVM and UGI metrics disappear after RM transitioned to standby mode
(cherry picked from commit 4beff01354)
2016-04-26 21:02:04 -07:00
Karthik Kambatla 52bfa90fed YARN-4795. ContainerMetrics drops records. (Daniel Templeton via kasha)
(cherry picked from commit 1a3f1482e2)
2016-04-26 06:18:27 -07:00
Karthik Kambatla a5edb45b18 YARN-1297. FairScheduler: Move some logs to debug and check if debug logging is enabled
(cherry picked from commit 4b1dcbbe0c)
2016-04-26 05:10:29 -07:00
Wangda Tan 45ff579bfa YARN-4846. Fix random failures for TestCapacitySchedulerPreemption#testPreemptionPolicyShouldRespectAlreadyMarkedKillableContainers. (Bibin A Chundatt via wangda)
(cherry picked from commit 7cb3a3da96)
2016-04-22 11:41:08 -07:00
Eric Payne a7f903b2ba YARN-4556. TestFifoScheduler.testResourceOverCommit fails. Contributed by Akihiro Suda
(cherry picked from commit 3dce486d88)
2016-04-21 21:27:10 +00:00
Li Lu 054fa104c5 YARN-4968. A couple of AM retry unit tests need to wait SchedulerApplicationAttempt stopped. (Wangda Tan via gtcarrera9)
(cherry picked from commit 7c6339f66a)
2016-04-21 13:27:47 -07:00
Karthik Kambatla 75cf238354 YARN-4784. Fairscheduler: defaultQueueSchedulingPolicy should not accept FIFO. (Yufei Gu via kasha)
(cherry picked from commit 170c4fd4cd)
2016-04-20 23:58:30 -07:00
Wangda Tan 83a5cdc400 YARN-4890. Unit test intermittent failure: TestNodeLabelContainerAllocation#testQueueUsedCapacitiesUpdate. (Sunil G via wangda)
(cherry picked from commit 33fd95a99c)
2016-04-20 17:38:22 -07:00
Wangda Tan 41cafeb5a1 YARN-4934. Reserved Resource for QueueMetrics needs to be handled correctly in few cases. (Sunil G via wangda)
(cherry picked from commit fdc46bfb37)
2016-04-16 22:50:00 -07:00
Jason Lowe cd148cb347 YARN-4940. yarn node -list -all failed if RM start with decommissioned node. Contributed by sandflee
(cherry picked from commit 69f3d428d5)
2016-04-15 20:38:04 +00:00
Jason Lowe ece01478c5 YARN-4924. NM recovery race can lead to container not cleaned up. Contributed by sandflee
(cherry picked from commit 3150ae8108)
2016-04-14 19:19:46 +00:00
Robert Kanter e79a47670b YARN-4541. Change log message in LocalizedResource#handle() to DEBUG (rchiang via rkanter)
(cherry picked from commit 0d9194df00)
2016-04-13 17:45:36 -07:00
Xuan 5bc64dafc3 YARN-4886. Add HDFS caller context for EntityGroupFSTimelineStore. Contributed by Li Lu
(cherry picked from commit e0cb426758)
2016-04-13 10:39:31 -07:00
Naganarasimha 53c24e00e8 YARN-4810. NM applicationpage cause internal error 500. Contributed by Bibin A Chundatt.
(cherry picked from commit 437e9d6475)
2016-04-12 18:25:11 +05:30
Vinod Kumar Vavilapalli f1dcd40294 YARN-4168. Fixed a failing test TestLogAggregationService.testLocalFileDeletionOnDiskFull. Contributed by Takashi Ohnishi.
(cherry picked from commit 44bbc50d91)
2016-04-11 12:12:30 -07:00
Jason Lowe f1a370ce8b Revert "YARN-4311. Removing nodes from include and exclude lists will not remove them from decommissioned nodes list. Contributed by Kuhu Shukla"
This reverts commit 814ceeb489.
2016-04-11 15:56:29 +00:00
Junping Du a3f8491410 YARN-4928. Some yarn.server.timeline.* tests fail on Windows attempting to use a test root path containing a colon. Contributed by Gergely Novák.
(cherry picked from commit 08ddb3ac6d)
2016-04-11 08:51:22 -07:00
Akira Ajisaka 8cf6630fc6 YARN-4630. Remove useless boxing/unboxing code. Contributed by Kousuke Saruta.
(cherry picked from commit 1ff27f9d12)
2016-04-11 14:55:37 +09:00
Akira Ajisaka 88556294e2 YARN-4938. MiniYarnCluster should not request transitionToActive to RM on non-HA environment. Contributed by Eric Badger.
(cherry picked from commit 1b78b2ba17)
2016-04-11 01:32:49 +09:00
Karthik Kambatla 94a88ae87b YARN-4927. TestRMHA#testTransitionedToActiveRefreshFail fails with FairScheduler. (Bibin A Chundatt via kasha)
(cherry picked from commit ff95fd547b)
2016-04-09 10:31:29 -07:00
Wangda Tan 12ccdd6540 YARN-3215. Respect labels in CapacityScheduler when computing headroom. (Naganarasimha G R via wangda)
(cherry picked from commit ec06957941)
2016-04-08 15:34:24 -07:00
Jian He 77a75de319 YARN-4740. AM may not receive the container complete msg when it restarts. Contributed by Jun Gong
2016-04-08 11:21:07 -07:00
Karthik Kambatla 2b97a50eec YARN-4756. Unnecessary wait in Node Status Updater during reboot. (Eric Badger via kasha)
(cherry picked from commit e82f961a39)
2016-04-07 17:30:54 -07:00
Jian He 42bc565630 YARN-4769. Add support for CSRF header in the dump capacity scheduler logs and kill app buttons in RM web UI. Contributed by Varun Vasudev
2016-04-06 16:14:13 -07:00
Varun Vasudev 8f9b97ccce YARN-4906. Capture container start/finish time in container metrics. Contributed by Jian He.
(cherry picked from commit b41e65e5bc)
2016-04-06 13:42:06 +05:30
Wangda Tan 11e796b5cd YARN-4699. Scheduler UI and REST o/p is not in sync when -replaceLabelsOnNode is used to change label of a node. (Sunil G via wangda)
(cherry picked from commit 21eb428448)
2016-04-05 16:25:55 -07:00
Junping Du 0907ce8c93 YARN-4916. TestNMProxy.tesNMProxyRPCRetry fails. Contributed by Tibor Kiss.
(cherry picked from commit 0005816743)
2016-04-05 09:02:50 -07:00
Junping Du eeff2e35f8 YARN-4893. Fix some intermittent test failures in TestRMAdminService. Contributed by Brahma Reddy Battula.
(cherry picked from commit 6be28bcc46)
2016-04-05 07:05:06 -07:00
Jason Lowe 814ceeb489 YARN-4311. Removing nodes from include and exclude lists will not remove them from decommissioned nodes list. Contributed by Kuhu Shukla
(cherry picked from commit 1cbcd4a491)
2016-04-05 13:41:18 +00:00
Rohith Sharma K S 13a4e25f26 YARN-4609. RM Nodes list page takes too much time to load. Contributed by Bibin A Chundatt
(cherry picked from commit 776b549e2a)
2016-04-05 14:53:24 +05:30
Rohith Sharma K S eec23580b4 YARN-4880. Running TestZKRMStateStorePerf with real zookeeper cluster throws NPE. Contributed by Sunil G
(cherry picked from commit 552237d4a3)
2016-04-05 14:37:31 +05:30
naganarasimha 3772602848 YARN-4746. yarn web services should convert parse failures of appId, appAttemptId and containerId to 400. Contributed by Bibin A Chundatt
(cherry picked from commit 5092c94195)
2016-04-04 18:08:18 +05:30
Rohith Sharma K S c8271cd117 YARN-4607. Pagination support for AppAttempt page TotalOutstandingResource Requests table. Contributed by Bibin A Chundatt
(cherry picked from commit 1e6f92977d)
2016-04-04 08:13:03 +05:30
Allen Wittenauer 92a3dbe44f YARN-4850. test-fair-scheduler.xml isn't valid xml (Yufei Gu via aw)
(cherry picked from commit b1394d6307)
2016-04-01 16:57:31 -07:00
Robert Kanter 633f612d67 YARN-4639. Remove dead code in TestDelegationTokenRenewer added in YARN-3055 (templedf via rkanter)
(cherry picked from commit 7a021471c3)
2016-03-31 15:47:44 -07:00
Wangda Tan d36d9d676d YARN-4634. Scheduler UI/Metrics need to consider cases like non-queue label mappings. (Sunil G via wangda)
(cherry picked from commit 12b11e2e68)
2016-03-31 14:35:59 -07:00
Jian He 3afc2caec8 YARN-4811. Generate histograms in ContainerMetrics for actual container resource usage
2016-03-31 14:31:38 -07:00
Jian He f1f441b80f YARN-4822. Refactor existing Preemption Policy of CS for easier adding new approach to select preemption candidates. Contributed by Wangda Tan
2016-03-30 12:46:36 -07:00
Wangda Tan 6856a7183a YARN-4865. Track Reserved resources in ResourceUsage and QueueCapacities. (Sunil G via wangda)
(cherry picked from commit fc055a3cbe)
2016-03-29 17:10:17 -07:00
Xuan ffe01e05cd YARN-4863. AHS Security login should be in serviceInit() instead of serviceStart(). Contributed by Junping Du
(cherry picked from commit 80182809ae)
2016-03-28 22:18:56 -07:00
Jason Lowe edf17fe8e5 YARN-4773. Log aggregation performs extraneous filesystem operations when rolling log aggregation is disabled. Contributed by Jun Gong
(cherry picked from commit 948b758070)
2016-03-28 23:02:15 +00:00
Jian He c7d843af3b YARN-998. Keep NM resource updated through dynamic resource config for RM/NM restart. Contributed by Junping Du
2016-03-28 11:13:02 -07:00
Jian He bdc648ebe7 YARN-4117. End to end unit test with mini YARN cluster for AMRMProxy Service. Contributed by Giovanni Matteo Fumarola
2016-03-27 20:22:49 -07:00
Karthik Kambatla 4212f2e2bf YARN-4805. Don't go through all schedulers in ParameterizedTestBase. (kasha)
(cherry picked from commit 49ff54c860)
2016-03-26 21:45:29 -07:00
Junping Du c722262c75 YARN-4820. ResourceManager web redirects in HA mode drops query parameters. Contributed by Varun Vasudev.
(cherry picked from commit 19b645c938)
2016-03-23 19:35:14 -07:00
Eric Payne dd1e4107e5 YARN-4686. MiniYARNCluster.start() returns before cluster is completely started. Contributed by Eric Badger.
(cherry picked from commit 92b7e0d413)
2016-03-18 17:05:53 +00:00
Junping Du 66257613b4 YARN-4785. inconsistent value type of the type field for LeafQueueInfo in response of RM REST API.
(cherry picked from commit ca8106d2dd)
2016-03-17 09:25:36 -07:00
Karthik Kambatla bbe9bb078c YARN-4812. TestFairScheduler#testContinuousScheduling fails intermittently. (kasha)
(cherry picked from commit f84af8bd58)
2016-03-17 05:54:40 -07:00
Wangda Tan 484976fa2b YARN-4108. CapacityScheduler: Improve preemption to only kill containers that would satisfy the incoming request. (Wangda Tan)
(cherry picked from commit 7e8c9beb41)
(cherry picked from commit ae14e5d07f)
2016-03-16 17:03:35 -07:00
Karthik Kambatla ab03266831 YARN-4560. Make scheduler error checking message more user friendly. (Ray Chiang via kasha)
(cherry picked from commit 3ef5500783)
2016-03-15 23:47:19 -07:00
Robert Kanter da24fde333 TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails Intermittently due to IllegalArgumentException from cleanup (templedf via rkanter)
(cherry picked from commit 22ca176dfe)
2016-03-15 10:05:28 -07:00
Karthik Kambatla b4c8693096 YARN-4719. Add a helper library to maintain node state and allows common queries. (kasha)
(cherry picked from commit 20d389ce61)
2016-03-14 14:22:21 -07:00
Junping Du 3d5ac829da YARN-4545. Allow YARN distributed shell to use ATS v1.5 APIs. Li Lu via junping_du
(cherry picked from commit f291d82cd4)
2016-03-14 08:30:07 -07:00
Li Lu 2b16a54fbe YARN-4696. Improving EntityGroupFSTimelineStore on exception handling, test setup, and concurrency.
This commit amends commit d49cfb3504 with a missed test file.

(cherry picked from commit 017d2c127b)
2016-03-10 13:04:57 -08:00
Li Lu 76ef097fd1 YARN-4696. Improving EntityGroupFSTimelineStore on exception handling, test setup, and concurrency. (Steve Loughran via gtcarrera9)
(cherry-picked from commit d49cfb3504)
2016-03-10 10:56:51 -08:00
Wangda Tan f7b38a7fb8 YARN-4465. SchedulerUtils#validateRequest for Label check should happen only when nodelabel enabled. (Bibin A Chundatt via wangda)
(cherry picked from commit 0233d4e0ee)
2016-03-08 14:28:26 -08:00
Jian He fb139b0c40 YARN-4764. Application submission fails when submitted queue is not available in scheduler xml. Contributed by Bibin A Chundatt
(cherry picked from commit 3c33158d1c)
2016-03-08 13:12:33 -08:00
Vinod Kumar Vavilapalli da9f39b107 YARN-4762. Fixed CgroupHandler's creation and usage to avoid NodeManagers crashing when LinuxContainerExecutor is enabled. (Sidharta Seethana via vinodkv)
(cherry picked from commit b2661765a5)
2016-03-07 11:11:29 -08:00
Jason Lowe adcdcfd5c1 YARN-4760. proxy redirect to history server uses wrong URL. Contributed by Eric Badger
(cherry picked from commit 4163e36c2b)
2016-03-07 15:57:44 +00:00
Jason Lowe 4eace7ab43 YARN-4744. Too many signal to container failure in case of LCE. Contributed by Sidharta Seethana
(cherry picked from commit 059caf9989)
2016-03-07 15:45:47 +00:00
Varun Vasudev 78919f8c34 YARN-4245. Generalize config file handling in container-executor. Contributed by Sidharta Seethana.
(cherry picked from commit 8ed2e060e8)
2016-03-07 16:19:27 +05:30
Varun Vasudev e9a0ffc7f1 YARN-4737. Add CSRF filter support in YARN. Contributed by Jonathan Maron.
(cherry picked from commit 43416187c07afb35e3267f94d0a41d8d3cfb5735)
2016-03-07 15:23:36 +05:30
Zhihai Xu 7ac7ca48b7 YARN-4761. NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations on fair scheduler. Contributed by Sangjin Lee
(cherry picked from commit e1ccc9622b)
2016-03-06 19:49:47 -08:00
Rohith Sharma K S 1415e6190a YARN-4763. RMApps Page crashes with NPE. (Bibin A Chundatt via rohithsharmaks)
(cherry picked from commit 77e2c3e8c7365b2aca00b6169829f87c63e4b460)
2016-03-05 13:07:20 +05:30
Jian He 023c2d2e56 YARN-4671. There is no need to acquire CS lock when completing a container. Contributed by Meng Ding
2016-03-01 13:14:51 -08:00
Jian He 589b537631 YARN-4748. ApplicationHistoryManagerOnTimelineStore should not swallow exceptions on generateApplicationReport. Contributed by Li Lu
(cherry picked from commit d93c22ec27)
2016-02-29 18:19:34 -08:00
Karthik Kambatla 84172b047b YARN-4704. TestResourceManager#testResourceAllocation() fails when using FairScheduler. (Yufei Gu via kasha)
(cherry picked from commit 9dafaaaf0d)
2016-02-29 16:10:26 -08:00
Haohui Mai c5db4ab0b4 HADOOP-12813. Migrate TestRPC and related codes to rebase on ProtobufRpcEngine. Contributed by Kai Zheng.
2016-02-29 14:10:18 -08:00
Jason Lowe bd0f5085e3 YARN-4731. container-executor should not follow symlinks in recursive_unlink_children. Contributed by Colin Patrick McCabe
(cherry picked from commit c58a6d53c5)
2016-02-29 15:26:26 +00:00
Rohith Sharma K S 2a1bb6cb67 YARN-4566. Fix test failure in TestMiniYarnClusterNodeUtilization. (Takashi Ohnishi via rohithsharmaks)
(cherry picked from commit e0b14f26f5)
2016-02-29 10:50:23 +08:00
Karthik Kambatla f3b37d8020 YARN-4718. Rename variables in SchedulerNode to reduce ambiguity post YARN-1011. (Inigo Goiri via kasha)
(cherry picked from commit f9692770a5)
2016-02-28 10:01:48 -08:00
Jason Lowe 0bd7ba4ea8 YARN-4723. NodesListManager$UnknownNodeId ClassCastException. Contributed by Kuhu Shukla
(cherry picked from commit 6b0f813e89)
2016-02-26 20:25:56 +00:00
Ming Ma 1656bcec5f YARN-4720. Skip unnecessary NN operations in log aggregation. (Jun Gong via mingma)
(cherry picked from commit 7f3139e54d)
2016-02-26 08:43:14 -08:00
Robert Kanter 872b8d90a6 YARN-4579. Allow DefaultContainerExecutor container log directory permissions to be configurable (rchiang via rkanter)
(cherry picked from commit d7fdec1e6b)
2016-02-25 16:40:05 -08:00
Karthik Kambatla 6a75c5af09 YARN-4729. SchedulerApplicationAttempt#getTotalRequiredResources can throw an NPE. (kasha)
(cherry picked from commit c684f2b007)
2016-02-24 18:34:21 -08:00
Robert Kanter c2098d2470 YARN-4697. NM aggregation thread pool is not bound by limits (haibochen via rkanter)
(cherry picked from commit 954dd57043)
2016-02-24 15:00:48 -08:00
Sangjin Lee 432a2367ce YARN-4722. AsyncDispatcher logs redundant event queue sizes (Jason Lowe via sjlee)
(cherry picked from commit 553b591ba0)
2016-02-24 09:30:37 -08:00
Jason Lowe acffe82353 YARN-2046. Out of band heartbeats are sent only on container kill and possibly too early. Contributed by Ming Ma
(cherry picked from commit d284e187b8)
2016-02-23 20:51:01 +00:00
Junping Du e3ce0ffdc3 YARN-3223. Resource update during NM graceful decommission. Contributed by Brook Zhou.
(cherry picked from commit 9ed17f181d)
2016-02-23 03:35:47 -08:00
Tsuyoshi Ozawa 4ee55d0322 YARN-4648. Move preemption related tests from TestFairScheduler to TestFairSchedulerPreemption. Contributed by Kai Sasaki.
(cherry picked from commit 0e12114c9c)
2016-02-23 19:50:40 +09:00
Varun Vasudev 2c218ca8a8 YARN-4709. NMWebServices produces incorrect JSON for containers. Contributed by Varun Saxena.
(cherry picked from commit 140cb5d745)
2016-02-23 12:32:16 +05:30