Commit Graph

5222 Commits

Author SHA1 Message Date
Szilard Nemeth 59b20a1ebf YARN-10295. CapacityScheduler NPE can cause apps to get stuck without resources. Contributed by Benjamin Teke 2020-06-10 18:15:08 +02:00
Szilard Nemeth 0c209f3f7d YARN-10296. Make ContainerPBImpl#getId/setId synchronized. Contributed by Benjamin Teke 2020-06-10 18:02:05 +02:00
Eric E Payne 2e4892061a YARN-10300: appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat. Contributed by Eric Badger (ebadger).
(cherry picked from commit 56247db302)
(cherry picked from commit 034d458511)
2020-06-09 21:34:20 +00:00
Jonathan Hung aa19cb20ea YARN-6492. Generate queue metrics for each partition. Contributed by Manikandan R
(cherry picked from commit c30c23cb66)
(cherry picked from commit 7a323a45aa)
(cherry picked from commit a80595a6deb3124a3d6d99057e9d5298cd7237d8)
2020-05-29 10:56:31 -07:00
Jonathan Hung b3e9aff5f7 YARN-10260. Allow transitioning queue from DRAINING to RUNNING state. Contributed by Bilwa S T
(cherry picked from commit fff1d2c122)
(cherry picked from commit 564d3211f2)
(cherry picked from commit a7ea55e015)
2020-05-12 10:53:23 -07:00
Szilard Nemeth 7e6c5e5ad2 YARN-9444. YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't recognize yarn.io/gpu as a valid resource. Contributed by Gergely Pollak
(cherry picked from commit 52e9ee39a1)
2020-05-07 18:37:02 +00:00
Ahmed Hussein 4e29738f4c YARN-8959. TestContainerResizing fails randomly (Ahmed Hussein via jeagles)
Signed-off-by: Jonathan Eagles <jeagles@gmail.com>
(cherry picked from commit 92e3ebb401)
2020-05-06 12:36:30 -05:00
Ahmed Hussein e25ac17e2b YARN-10256. Refactor TestContainerSchedulerQueuing.testContainerUpdateExecTypeGuaranteedToOpportunistic (Ahmed Hussein via jeagles)
Signed-off-by: Jonathan Eagles <jeagles@gmail.com>
(cherry picked from commit f5081a9a5d)
2020-05-04 11:11:14 -05:00
Gabor Bota ec6d2a8402 Preparing for 3.1.5 development
Change-Id: Iabc64aba7392e3b6f9e4e18109fcaa2cfc01d1f9
2020-04-29 11:18:18 +02:00
Jonathan Hung 6dc1f7e154 YARN-9954. Configurable max application tags and max tag length. Contributed by Bilwa S T
(cherry picked from commit 49ae9b2137)
(cherry picked from commit d1af4e0fae)
2020-04-17 10:36:26 -07:00
Jonathan Hung 6271a2852e YARN-10212. Create separate configuration for max global AM attempts. Contributed by Bilwa S T
(cherry picked from commit 57659422abbf6d9bf52e6e27fca775254bb77a56)
(cherry picked from commit e3a52804b03d646f15048c078f8c5292d5cbecfa)
(cherry picked from commit 54599b177c)
2020-04-09 10:44:39 -07:00
Akira Ajisaka d8033bfa96 HADOOP-14836. Upgrade maven-clean-plugin to 3.1.0 (#1933)
(cherry picked from commit e53d472bb0)
2020-04-09 01:50:01 +09:00
Jonathan Hung 9c6dd8c83a YARN-10200. Add number of containers to RMAppManager summary
(cherry picked from commit 2de0572cdc1c6fdbfaab108b169b2d5b0c077e86)
(cherry picked from commit 5d3fb0ebe9)
2020-03-25 10:33:18 -07:00
Eric E Payne 80e394e81b YARN-942. TestContainerSchedulerQueuing.testKillOnlyRequiredOpportunisticContainers fails sporadically Contributed by Ahmed Hussein (ahussein)
(cherry picked from commit ede05b19d1)
2020-03-10 15:25:10 +00:00
Wei-Chiu Chuang 58b025c8f4 HADOOP-16882. Update jackson-databind to 2.9.10.2 in branch-3.1, branch-2.10. Contributed by Lisheng Sun. 2020-02-28 16:24:38 -08:00
Inigo Goiri 4924622e6e YARN-10161. TestRouterWebServicesREST is corrupting STDOUT. Contributed by Jim Brennan.
(cherry picked from commit a43510e21d)
2020-02-27 13:20:17 -08:00
Elixir Kook 9ccefe2262 YARN-10156. Fix typo 'complaint' which means quite different in Federation.md (#1856)
(cherry picked from commit d608e94f92)
2020-02-26 17:32:47 +09:00
Sunil G a6124cd2b8 YARN-10139. ValidateAndGetSchedulerConfiguration API fails when cluster max allocation > default 8GB. Contributed by Prabhu Joseph.
(cherry picked from commit 6526f95bd2)
2020-02-19 11:18:53 +05:30
Prabhu Joseph 00fde836f7 YARN-10022. RM Rest API to validate the CapacityScheduler Configuration change.
Contributed by Kinga Marton.
(cherry-picked from commit 1ab9c692fa)
2020-02-11 22:23:07 +05:30
Jonathan Hung 5dfd1dcfe3 YARN-10116. Expose diagnostics in RMAppManager summary
(cherry picked from commit 314e2f9d2e)
(cherry picked from commit 147897da4b420b4749f3c7b410f4c329632c3352)
2020-02-05 11:17:24 -08:00
Eric Badger 4af7d14ce2 YARN-10084. Allow inheritance of max app lifetime / default app lifetime. Contributed by Eric Payne. 2020-01-29 22:45:02 +00:00
Abhishek Modi 868a0f5ef0 YARN-9790. Failed to set default-application-lifetime if maximum-application-lifetime is less than or equal to zero. Contributed by kyungwan nam.
(cherry picked from commit d2d963f3d4)
2020-01-23 15:50:28 +00:00
Szilard Nemeth 9638985428 YARN-7913. Improve error handling when application recovery fails with exception. Contributed by Wilfred Spiegelenburg 2020-01-22 16:30:59 +01:00
Szilard Nemeth 2cdaeca84b YARN-8148. Update decimal values for queue capacities shown on queue status CLI. Contributed by Prabhu Joseph 2020-01-20 09:30:48 +01:00
Sunil G af89b5b086 YARN-8373. RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH. Contributed by Wilfred Spiegelenburg. 2020-01-15 17:07:34 +05:30
Eric E Payne ec40c1f400 YARN-9018. Add functionality to AuxiliaryLocalPathHandler to return all locations to read for a given path. Contributed by Kuhu Shukla (kshukla)
(cherry picked from commit 93233a7d6e)
2020-01-09 17:41:10 +00:00
Eric Badger cba0bbe98c YARN-8672. TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out. Contributed by Chandni Singh and Jim Brennan. 2020-01-08 20:02:43 +00:00
Eric E Payne a110024fcc YARN-7387: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer fails intermittently. Contributed by Jim Brennan (Jim_Brennan)
(cherry picked from commit b1e07d27cc)
2020-01-08 19:41:53 +00:00
Eric E Payne 08a1464444 YARN-10072: TestCSAllocateCustomResource failures. Contributed by Jim Brennan (Jim_Brennan)
(cherry picked from commit 6899be5a17)
2020-01-08 17:50:43 +00:00
Prabhu Joseph 807d7aa3af YARN-10053. Use Shared Group Mapping Service in Placement Rules. Contributed by Wilfred Spiegelenburg.
(cherry picked from commit 217b56ffdd)
2020-01-02 14:41:47 +05:30
Akira Ajisaka 01edc654df YARN-10055. bower install fails. (#1778)
(cherry picked from commit 34ff7dbaf5)
2019-12-23 20:27:06 +09:00
Eric Badger 0f01206e15 YARN-10009. In Capacity Scheduler, DRC can treat minimum user limit percent as a max when custom resource is defined. Contributed by Eric Payne
(cherry picked from commit 355ec33416)
2019-12-20 19:34:29 +00:00
Jonathan Hung 750fb4c321 YARN-9894. CapacitySchedulerPerf test for measuring hundreds of apps in a large number of queues. Contributed by Eric Payne
(cherry picked from commit 7b93575b92)
(cherry picked from commit 0707d0a0ae)
2019-12-18 13:25:30 -08:00
Jonathan Hung 406c35dd12 YARN-10039. Allow disabling app submission from REST endpoints 2019-12-18 10:58:32 -08:00
Eric Badger 86120637a4 YARN-10033. TestProportionalCapacityPreemptionPolicy not initializing vcores for effective max resources. Contributed by Eric Payne.
(cherry picked from commit f47dcf2d4c)
2019-12-17 17:39:51 +00:00
Jonathan Hung 9b4368a62f YARN-10012. Guaranteed and max capacity queue metrics for custom resources. Contributed by Manikandan R
(cherry picked from commit 92bce918dc)
(cherry picked from commit 9228e3f0ad)
2019-12-08 16:38:21 -08:00
Sunil G 67cf1f94cd YARN-4901. QueueMetrics needs to be cleared before MockRM is initialized. Contributed by Peter Bacsko.
(cherry picked from commit 002dcc4ebf)
(cherry picked from commit 69dc329acc)
2019-12-08 14:43:20 -08:00
Sunil G 57c499ef19 YARN-9949. Add missing queue configs for root queue in RMWebService#CapacitySchedulerInfo.
Contributed by Prabhu Joseph.
2019-11-27 23:01:16 +05:30
Szilard Nemeth 1172ebd6e8 YARN-9937. Add missing queue configs in RMWebService#CapacitySchedulerQueueInfo.
Contributed by Prabhu Joseph.
2019-11-27 22:30:10 +05:30
Szilard Nemeth 80a97e6ac2 YARN-9011. Race condition during decommissioning. Contributed by Peter Bacsko 2019-11-26 14:27:55 +01:00
HUAN-PING SU fb2df5234d YARN-9966. Code duplication in UserGroupMappingPlacementRule (#1709)
(cherry picked from commit f8e36e03b4)
2019-11-25 15:58:09 +09:00
Szilard Nemeth df157211c9 YARN-9968. Public Localizer is exiting in NodeManager due to NullPointerException. Contributed by Tarun Parimi 2019-11-22 13:00:18 +01:00
Tao Yang c95e772d5d YARN-9838. Fix resource inconsistency for queues when moving app with reserved container to another queue. Contributed by jiulongzhu. 2019-11-22 16:10:07 +08:00
Eric Yang 1669a5c1cb YARN-9983. Fixed typo in YARN Service overview.
Contributed by Denes Gerencser
2019-11-19 14:18:38 -05:00
Sunil G 9a51380208 YARN-9873. Version Number for each Scheduler Config Change. Contributed by Prabhu Joseph.
(cherry picked from commit be901f4962)
2019-11-19 17:17:45 +05:30
Abhishek Modi 96e07b7e97 YARN-9791. Queue Mutation API does not allow to remove a config. Contributed by Prabhu Joseph.
(cherry picked from commit 751b5a1ac8)
2019-11-19 16:41:04 +05:30
Sunil G dab4cc6320 YARN-9909. Offline Format of YarnConfigurationStore. Contributed by Prabhu Joseph. 2019-11-19 16:19:19 +05:30
Sunil G 9b17fa7091 YARN-9950. Unset Ordering Policy of Leaf/Parent queue converted from Parent/Leaf queue respectively. Contributed by Prabhu Joseph.
(cherry picked from commit 51e7d1b37e)
2019-11-19 15:43:19 +05:30
Sunil G f24d1fac35 YARN-9984. FSPreemptionThread can cause NullPointerException while app is unregistered with containers running on a node. Contributed by Wilfred Spiegelenburg.
(cherry picked from commit 215f2052fc)
2019-11-19 14:05:51 +05:30
Prabhu Joseph d271d174ca YARN-9900. Revert Invalid Config and Refresh Support in SchedulerConfig Format.
Signed-off-by: prabhujoseph <pjoseph@cloudera.com>
2019-11-18 15:02:59 +05:30
Jonathan Hung 4d274f60bc Make upstream aware of 2.10.0 release
(cherry picked from commit 7663db59c097c82eeed2df7a91168a4d7123c96b)
(cherry picked from commit 5d2ffcc7aa)
2019-10-30 21:00:55 -07:00
Eric Badger f5a3a9e138 YARN-9914. Use separate configs for free disk space checking for full and not-full disks. Contributed by Jim Brennan
(cherry picked from commit eef34f2d87)
2019-10-25 17:17:14 +00:00
Zhankun Tang 7175fc91d0 YARN-9921. Issue in PlacementConstraint when YARN Service AM retries allocation on component failure. Contributed by Tarun Parimi 2019-10-24 10:07:40 +08:00
Eric E Payne d8b4348036 YARN-9915: Fix FindBug issue in QueueMetrics. Contributed by Prabhu Joseph.
(cherry picked from commit 83d148074f)
2019-10-21 21:14:23 +00:00
Eric E Payne 889255576b YARN-9773: Add QueueMetrics for Custom Resources. Contributed by Manikandan R.
(cherry picked from commit a5034c7988)
2019-10-16 21:23:38 +00:00
Haibo Chen a70c6e9665 YARN-8842. Expose metrics for custom resource types in QueueMetrics. (Contributed by Szilard Nemeth)
(cherry picked from commit 84e22a6af4)
2019-10-15 22:20:56 +00:00
Haibo Chen 4ec409fe09 YARN-8750. Refactor TestQueueMetrics. (Contributed by Szilard Nemeth)
(cherry picked from commit e60b797c88)
2019-10-15 15:41:20 +00:00
Szilard Nemeth c44902212a YARN-8453. Additional Unit tests to verify queue limit and max-limit with multiple resource types. Contributed by Adam Antal 2019-10-15 13:25:57 +02:00
Szilard Nemeth aabc18a777 Revert "YARN-9128. Use SerializationUtils from apache commons to serialize / deserialize ResourceMappings. Contributed by Zoltan Siegl"
This reverts commit 73bc8ef9b8.
2019-10-09 19:59:02 +02:00
Szilard Nemeth 57e88a63cf YARN-9552. FairScheduler: NODE_UPDATE can cause NoSuchElementException. Contributed by Peter Bacsko. 2019-10-09 14:19:56 +02:00
Szilard Nemeth 73bc8ef9b8 YARN-9128. Use SerializationUtils from apache commons to serialize / deserialize ResourceMappings. Contributed by Zoltan Siegl
(cherry picked from commit 6f1ab95168)
(cherry picked from commit 42177e8b78)
2019-10-09 13:28:44 +02:00
Szilard Nemeth d1415b66b7 YARN-9356. Add more tests to ratio method in TestResourceCalculator. Contributed by Zoltan Siegl
(cherry picked from commit 35f093f5b3)
(cherry picked from commit 0ddb48a303)
2019-10-09 13:13:51 +02:00
Jonathan Hung 0dc85bc182 YARN-9760. Support configuring application priorities on a workflow level. Contributed by Varun Saxena 2019-10-08 11:18:52 -07:00
Szilard Nemeth 2c8cfb4f64 YARN-6715. Fix documentation about NodeHealthScriptRunner. Contributed by Peter Bacsko 2019-10-08 17:34:44 +02:00
Sunil G 933d81f0ce YARN-9801. SchedConfCli does not work with https mode. Contributed by Prabhu Joseph
(cherry picked from commit 99cd7572f1)
2019-10-01 20:08:12 +05:30
Sunil G 71792f2122 YARN-9864. Format CS Configuration present in Configuration Store. Contributed by Prabhu Joseph 2019-10-01 20:03:35 +05:30
bibinchundatt 6529a30d9e YARN-9858. Optimize RMContext getExclusiveEnforcedPartitions. Contributed by Jonathan Hung. 2019-10-01 16:06:38 +05:30
Jonathan Hung 95ec7050b5 Addendum to YARN-9730. Support forcing configured partitions to be exclusive based on app node label
(cherry picked from commit d86a1acc866cbda845fb3896dc824baf12217383)
(cherry picked from commit f4f210d2e5)
2019-09-25 17:49:52 -07:00
Jonathan Hung 783cbced1d YARN-9730. Support forcing configured partitions to be exclusive based on app node label
(cherry picked from commit 73a044a63822303f792183244e25432528ecfb1e)
(cherry picked from commit dd094d79023f6598e47146166aa8c213e03d41b7)
2019-09-24 13:52:14 -07:00
Jonathan Hung 6a1d2d56bd YARN-9762. Add submission context label to audit logs. Contributed by Manoj Kumar
(cherry picked from commit 3d78b1223d)
(cherry picked from commit a1fa9a8a7f)
2019-09-23 13:13:09 -07:00
Sunil G 45cf3de2e9 YARN-9833. Race condition when DirectoryCollection.checkDirs() runs during container launch. Contributed by Peter Bacsko.
(cherry picked from commit c474e24c0b)
2019-09-18 09:23:46 +05:30
Weiwei Yang 1978317068 YARN-2255. YARN Audit logging not added to log4j.properties. Contributed by Aihua Xu.
(cherry picked from commit f8c14326ee)
2019-09-18 09:20:37 +08:00
Eric Yang dae22c962d YARN-9837. Fixed reading YARN Service JSON spec file larger than 128k.
Contributed by Tarun Parimi
2019-09-17 13:17:13 -04:00
Jonathan Hung d75693bd6e YARN-9824. Fall back to configured queue ordering policy class name
(cherry picked from commit f8f8598ea5)
(cherry picked from commit 1dbf87c9ff)
2019-09-10 15:31:58 -07:00
Jonathan Hung 80735a15a5 YARN-8541 (branch-3.1 addendum): RM startup failure on recovery after user deletion 2019-09-09 20:15:42 -07:00
bibinchundatt aee8fb567b YARN-8948. PlacementRule interface should be for all YarnSchedulers. Contributed by Bibin A Chundatt.
(cherry picked from commit a68d766e87)
(cherry picked from commit e10050678d)
2019-09-09 20:00:46 -07:00
Wangda Tan 9e8ff94d16 YARN-8361. Change App Name Placement Rule to use App Name instead of App Id for configuration. (Zian Chen via wangda)
Change-Id: I17e5021f8f611a9c5e3bd4b38f25e08585afc6b1
(cherry picked from commit a2e49f41a8)
2019-09-09 20:00:33 -07:00
Wangda Tan 81d63d5ea1 YARN-8016. Refine PlacementRule interface and add an app-name queue mapping rule as an example. (Zian Chen via wangda)
Change-Id: I35caf1480e0f76f5f3a53528af09312e39414bbb
(cherry picked from commit a90471b3e6)
2019-09-09 19:59:50 -07:00
Jonathan Hung 0e88bcd8e6 YARN-9820. RM logs InvalidStateTransitionException when app is submitted. Contributed by Prabhu Joseph 2019-09-09 00:27:33 -07:00
Jonathan Hung 080fc6d943 YARN-9764. Print application submission context label in application summary. Contributed by Manoj Kumar
(cherry picked from commit 43e389b980)
(cherry picked from commit 45220d1157)
2019-09-08 19:15:12 -07:00
Wangda Tan 0ee7d09138 YARN-9813. RM does not start on JDK11 when UIv2 is enabled. (Adam Antal/Eric Yang via wangda)
Change-Id: I18b8edc930b2efa0652f59c246931ad0d46827f3
(cherry picked from commit 34b82e6da0)
2019-09-06 19:19:59 -07:00
Tao Yang 36c1caced5 YARN-8995. Log events info in AsyncDispatcher when event queue size cumulatively reaches a certain number every time(addendum). Contributed by Jonathan Hung. 2019-09-07 08:19:02 +08:00
Tao Yang 1f6f4a2457 YARN-9795. ClusterMetrics to include AM allocation delay. Contributed by Fengnan Li. 2019-09-07 07:54:30 +08:00
Jonathan Hung 980a922481 YARN-9763. Print application tags in application summary. Contributed by Manoj Kumar 2019-09-06 10:52:37 -07:00
Jonathan Hung 11f6e3bc41 YARN-9761. Allow overriding application submissions based on server side configs. Contributed by Pralabh Kumar 2019-09-06 10:02:20 -07:00
Billie Rinaldi 619bd1e876 YARN-9718. Fixed yarn.service.am.java.opts shell injection. Contributed by Eric Yang
(cherry picked from commit 2e2e5401f2)
2019-09-05 14:19:05 -07:00
Jonathan Hung 37d1f8c81e YARN-9810. Add queue capacity/maxcapacity percentage metrics. Contributed by Shubham Gupta
(cherry picked from commit 0ccf4b0fe1)
(cherry picked from commit cb806988d72bde1f9837c9e0fb82a3a6c032542c)
2019-09-05 14:06:09 -07:00
Tao Yang d74c069427 YARN-8995. Log events info in AsyncDispatcher when event queue size cumulatively reaches a certain number every time. Contributed by zhuqi. 2019-09-05 16:54:55 +08:00
Zhankun Tang ef79d98788 Preparing for 3.1.4 development 2019-09-04 16:11:36 +08:00
Zhankun Tang fff4fbc957 YARN-9785. Fix DominantResourceCalculator when one resource is zero. Contributed by Bibin A Chundatt, Sunil Govindan, Bilwa S T. 2019-09-04 12:05:29 +08:00
bibinchundatt 3210d1e993 YARN-9797. LeafQueue#activateApplications should use resourceCalculator#fitsIn. Contributed by Bilwa S T.
(cherry picked from commit 03489124ea)
2019-09-03 11:56:19 +05:30
Akira Ajisaka 3c9d2f5317 YARN-9162. Fix TestRMAdminCLI#testHelp. Contributed by Ayush Saxena.
(cherry picked from commit 5db7c49062)
(cherry picked from commit a453f38015)
2019-08-30 17:55:35 -07:00
Eric E Payne 51896ff7e6 YARN-9756: Create metric that sums total memory/vcores preempted per round. Contributed by Manikandan R (manirajv06).
(cherry picked from commit d562050cec)
2019-08-28 21:05:23 +00:00
Jonathan Hung f73842780e YARN-9438. launchTime not written to state store for running applications
(cherry picked from commit 9568656cd21d9c02168e18ce35c6726077bbf3a1)
(cherry picked from commit 0c498de6e87c6bdc959afa31deb03d0907e0e1a1)
2019-08-27 15:45:42 -07:00
Jonathan Hung 6baa0d1e4d YARN-9775. RMWebServices /scheduler-conf GET returns all hadoop configurations for ZKConfigurationStore. Contributed by Prabhu Joseph
(cherry picked from commit 8660e48ca1)
(cherry picked from commit e4249c3202)
2019-08-26 15:55:11 -07:00
bibinchundatt eb618e4f22 YARN-9642. Fix Memory Leak in AbstractYarnScheduler caused by timer. Contributed by Bibin A Chundatt.
(cherry picked from commit d3ce53e507)
2019-08-26 23:25:16 +05:30
Szilard Nemeth fd2e353236 YARN-9100. Add tests for GpuResourceAllocator and do minor code cleanup. Contributed by Peter Bacsko 2019-08-16 15:27:10 +02:00
Szilard Nemeth 0a379e94ba YARN-8586. Extract log aggregation related fields and methods from RMAppImpl. Contributed by Peter Bacsko 2019-08-16 12:15:27 +02:00
Szilard Nemeth 94114378ce YARN-9488. Skip YARNFeatureNotEnabledException from ClientRMService. Contributed by Prabhu Joseph
(cherry picked from commit 1845a83cec)
2019-08-15 17:16:32 +02:00
Szilard Nemeth aa0631a042 YARN-9140. Code cleanup in ResourcePluginManager.initialize and in TestResourcePluginManager. Contributed by Peter Bacsko 2019-08-14 19:04:09 +02:00