5108 Commits

Author SHA1 Message Date
Haibo Chen f943bff254 YARN-9559. Create AbstractContainersLauncher for pluggable ContainersLauncher logic. (Contributed by Jonathan Hung)
(cherry picked from commit f51702d539)
(cherry picked from commit 8d357343c4)
2019-08-06 15:01:06 -07:00
Eric Badger 698e74d097 YARN-8045. Reduce log output from container status calls. Contributed by Craig Condit
(cherry picked from commit 144a55f0e3)
2019-08-02 20:41:26 +00:00
Eric E Payne 36af8845de YARN-9596: QueueMetrics has incorrect metrics when labelled partitions are involved. Contributed by Muhammad Samir Khan.
(cherry picked from commit 42683aef1a)
2019-07-30 19:45:00 +00:00
Jonathan Hung 3ff2148482 YARN-9668. UGI conf doesn't read user overridden configurations on RM and NM startup. (Contributed by Jonanthan Hung) 2019-07-22 10:54:08 -07:00
Weiwei Yang 48192531ad YARN-9682. Wrong log message when finalizing the upgrade. Contributed by kyungwan nam.
(cherry picked from commit 85d9111a88)
2019-07-17 11:08:21 +08:00
Szilard Nemeth 30c7b43227 YARN-9127. Create more tests to verify GpuDeviceInformationParser. Contributed by Peter Bacsko
(cherry picked from commit 18ee1092b4)
2019-07-15 12:15:36 +02:00
Szilard Nemeth bb37c6cb7f YARN-9337. Addendum to fix compilation error due to mockito spy call 2019-07-13 00:42:14 +02:00
Erik Krogen 07a6510e6a HDFS-13286. [SBN read] Add haadmin commands to transition between standby and observer. Contributed by Chao Sun. 2019-07-12 11:03:31 -07:00
Szilard Nemeth 773591ee42 YARN-9626. UI2 - Fair scheduler queue apps page issues. Contributed by Zoltan Siegl
(cherry picked from commit 557056e18e)
2019-07-12 17:41:23 +02:00
Szilard Nemeth 531e0c0bc1 YARN-9337. GPU auto-discovery script runs even when the resource is given by hand. Contributed by Adam Antal
(cherry picked from commit 61b0c2bb7c)
2019-07-12 17:30:50 +02:00
Szilard Nemeth 43c89d1e2b YARN-9235. If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown. Contributed by Antal Balint Steinbach, Adam Antal
(cherry picked from commit c416284bb7)
2019-07-12 17:07:25 +02:00
Szilard Nemeth 872a039bac YARN-9625. UI2 - No link to a queue on the Queues page for Fair Scheduler. Contributed by Zoltan Siegl
(cherry picked from commit 9cec023186)
2019-07-11 20:02:19 +02:00
Szilard Nemeth d590745046 YARN-9573. DistributedShell cannot specify LogAggregationContext. Contributed by Adam Antal. 2019-07-11 19:54:31 +02:00
bibinchundatt 5effeae1f3 YARN-9557. Application fails in diskchecker when ReadWriteDiskValidator is configured. Contributed by Bilwa S T.
(cherry picked from commit 5f8395f393)
2019-07-10 14:47:29 +05:30
Sunil G 9eb96b0fbf YARN-9644. First RMContext object is always leaked during switch over. Contributed by Bibin A Chundatt.
(cherry picked from commit d18986e4e8)
2019-07-04 11:06:41 +05:30
Szilard Nemeth 46177ade8b YARN-9629. Support configurable MIN_LOG_ROLLING_INTERVAL. Contributed by Adam Antal.
(cherry picked from commit a2a8be18cb)
2019-07-03 14:24:53 +02:00
Sunil G d2a5749482 YARN-9327. Improve synchronisation in ProtoUtils#convertToProtoFormat block. Contributed by Bibin A Chundatt.
(cherry picked from commit 0c8813f135)
2019-07-02 12:15:40 +05:30
Weiwei Yang 46b81a982b YARN-9655. AllocateResponse in FederationInterceptor lost applicationPriority. Contributed by hunshenshi.
(cherry picked from commit 570eee30e5)
2019-07-02 10:17:56 +08:00
bibinchundatt 4f622ecad8 YARN-9639. DecommissioningNodesWatcher cause memory leak. Contributed by Bilwa S T.
(cherry picked from commit be80334cdf)
2019-06-27 10:11:30 +05:30
Zhankun Tang 829202740a YARN-9584. Should put initializeProcessTrees method call before get pid. Contributed by Wanqiang Ji.
(cherry picked from commit 67414a1a80)
2019-06-18 13:20:07 +08:00
Weiwei Yang 56a4935048 YARN-9621. Fix TestDSWithMultipleNodeManager.testDistributedShellWithPlacementConstraint on branch-3.1. Contributed by Prabhu Joseph. 2019-06-17 17:22:49 +08:00
Sean Mackrory fee1e67453 HADOOP-16213. Update guava to 27.0-jre. Contributed by Gabor Bota. 2019-06-13 07:38:43 -06:00
Sunil G c343554e2a YARN-9543. [UI2] Handle ATSv2 server down or failures cases gracefully in YARN UI v2. Contributed by Zoltan Siegl and Akhil P B.
(cherry picked from commit 52128e352a)
2019-06-12 19:28:43 +05:30
Sunil G bc028d3ebb YARN-9545. Create healthcheck REST endpoint for ATSv2. Contributed by Zoltan Siegl.
(cherry picked from commit 72203f7a12)
2019-06-12 19:28:10 +05:30
Sunil G 1bb9e9a4f2 Revert "YARN-9545. Create healthcheck REST endpoint for ATSv2. Contributed by Zoltan Siegl."
This reverts commit d65371c4e8.
2019-06-12 19:27:21 +05:30
bibinchundatt f42e246f8a YARN-9547. ContainerStatusPBImpl default execution type is not returned. Contributed by Bilwa S T.
(cherry picked from commit 3303723f55)
2019-06-11 23:43:54 +05:30
bibinchundatt d386f595f9 YARN-9565. RMAppImpl#ranNodes not cleared on FinalTransition. Contributed by Bilwa S T.
(cherry picked from commit 60c95e9b6a)
2019-06-11 23:15:02 +05:30
bibinchundatt 4a39165b41 YARN-9594. Fix missing break statement in ContainerScheduler#handle. Contributed by lujie.
(cherry picked from commit 6d80b9bc3f)

 Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java
2019-06-11 23:05:06 +05:30
Sunil G d65371c4e8 YARN-9545. Create healthcheck REST endpoint for ATSv2. Contributed by Zoltan Siegl.
(cherry picked from commit f1d3a17d3e)
2019-06-06 06:25:02 +05:30
Sunil G 6be665cfc6 YARN-8906. [UI2] NM hostnames not displayed correctly in Node Heatmap Chart. Contributed by Akhil PB.
(cherry picked from commit 59719dc560)
2019-06-03 15:54:33 +05:30
Sunil G 894ef5c07b YARN-8947. [UI2] Active User info missing from UI2. Contributed by Akhil PB.
(cherry picked from commit 7f46dda513)
2019-06-03 12:25:40 +05:30
Weiwei Yang 23f9508a89 YARN-9507. Fix NPE in NodeManager#serviceStop on startup failure. Contributed by Bilwa S T.
(cherry picked from commit 4530f4500d)
2019-06-03 14:26:16 +08:00
Eric Yang 413a6b63bc YARN-9542. Fix LogsCLI guessAppOwner ignores custome file format suffix.
Contributed by Prabhu Joseph

(cherry picked from commit b2a39e8883)
2019-05-29 18:05:47 -04:00
Eric E Payne 9c3ab58aa7 YARN-8625. Aggregate Resource Allocation for each job is not present in ATS. Contributed by Prabhu Joseph.
(cherry picked from commit 3c63551101)
2019-05-29 19:08:27 +00:00
Ahmed Hussein f2202f7990 YARN-9563. Resource report REST API could return NaN or Inf (Ahmed Hussein via jeagles)
Signed-off-by: Jonathan Eagles <jeagles@gmail.com>
(cherry picked from commit abf76ac371)
2019-05-29 12:47:27 -05:00
Takanobu Asanuma 8098ddaf40 HADOOP-16331. Fix ASF License check in pom.xml. Contributed by Akira Ajisaka.
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2019-05-29 17:39:49 +09:00
Akira Ajisaka f8bd5deec1 HADOOP-16323. https everywhere in Maven settings. 2019-05-27 15:28:21 +09:00
Sunil G f09befd2ea YARN-9519. TFile log aggregation file format is not working for yarn.log-aggregation.TFile.remote-app-log-dir config. Contributed by Adam Antal.
(cherry picked from commit 7d831eca64)
2019-05-14 10:50:02 -07:00
Sunil G 8b306e34e0 YARN-9504. [UI2] Fair scheduler queue view page does not show actual capacity. Contributed by Zoltan Siegl.
(cherry picked from commit 64c7f36ab1)
2019-05-10 14:28:08 +05:30
Eric Yang bf013aa06e YARN-8622. Fixed container-executor compilation on MacOSX.
Contributed by Siyao Meng

(cherry picked from commit ef97a20831)
2019-05-09 14:55:38 -04:00
Haibo Chen ea1f0f282b YARN-9529. Log correct cpu controller path on error while initializing CGroups. (Contributed by Jonathan Hung)
(cherry picked from commit 597fa47ad1)
(cherry picked from commit c6573562cb)
2019-05-06 11:59:20 -07:00
Eric E Payne 41ffaea342 YARN-9285: RM UI progress column is of wrong type. Contributed by Ahmed Hussein.
(cherry picked from commit b094b94d43)
2019-05-02 19:57:44 +00:00
Weiwei Yang 94a895b94f YARN-9307. node_partitions constraint does not work. Contributed by kyungwan nam. 2019-04-26 13:16:43 +08:00
Weiwei Yang d242b166ed YARN-9325. TestQueueManagementDynamicEditPolicy fails intermittent. Contributed by Prabhu Joseph.
(cherry picked from commit 1c8046d67e)
2019-04-23 14:25:33 +08:00
Eric Yang 8b228a42e9 YARN-8587. Added retries for fetching docker exit code.
Contributed by Charo Zhang

(cherry picked from commit c16c49b8c3)
2019-04-19 15:40:56 -04:00
Eric Yang 68a98be8a2 YARN-6695. Fixed NPE in publishing appFinished events to ATSv2.
Contributed by Prabhu Joseph

(cherry picked from commit df76cdc895)
2019-04-18 12:31:34 -04:00
Weiwei Yang c37065eae9 YARN-9463. Add queueName info when failing with queue capacity sanity check. Contributed by Aihua Xu.
(cherry picked from commit 8c1bba375b)
2019-04-10 23:04:27 +08:00
Weiwei Yang bd0c9bc160 YARN-9413. Queue resource leak after app fail for CapacityScheduler. Contributed by Tao Yang.
(cherry picked from commit ec143cbf67)
2019-04-06 20:38:06 +08:00
Eric Yang dbc02bcda7 YARN-9391. Fixed node manager environment leaks into Docker containers.
Contributed by Jim Brennan

(cherry picked from commit 3c45762a0b)
2019-03-25 15:55:46 -04:00
Sunil G 6941033396 YARN-8803. [UI2] Show flow runs in the order of recently created time in graph widgets. Contributed by Akhil PB.
(cherry picked from commit c79f139519)
2019-03-06 16:50:19 +05:30
Sunil G 379a9bfd9a YARN-9138. Improve test coverage for nvidia-smi binary execution of GpuDiscoverer. Contributed by Szilard Nemeth.
(cherry picked from commit 46045c5cb3)
2019-03-06 16:02:39 +05:30
bibinchundatt e663a6af89 Revert "YARN-8132. Final Status of applications shown as UNDEFINED in ATS app queries. Contributed by Prabhu Joseph"
This reverts commit 7db50ffceb.
2019-03-04 17:03:45 +05:30
Sunil G 80d507d1a4 YARN-9139. Simplify initializer code of GpuDiscoverer. Contributed by Szilard Nemeth. 2019-03-01 19:28:33 +05:30
Eric Yang 72eacb3e10 YARN-9334. Allow YARN Service client to send SPNEGO challenge header when authentication type is not simple.
Contributed by Billie Rinaldi

(cherry picked from commit 04b228e43b)
2019-02-28 10:12:09 -08:00
Sunil G 817028364a YARN-9121. Replace GpuDiscoverer.getInstance() to a readable object for easy access control. Contributed by Szilard Nemeth. 2019-02-27 17:46:43 +05:30
Weiwei Yang 10d4a9a7fb YARN-9248. RMContainerImpl:Invalid event: ACQUIRED at KILLED. Contributed by lujie.
(cherry picked from commit 8c30114b00)
2019-02-27 17:39:37 +08:00
Sunil G 51b010b19f YARN-9087. Improve logging for initialization of Resource plugins. Contributed by Szilard Nemeth. 2019-02-27 11:57:32 +05:30
Sunil G 84928ba3d1 YARN-9168. DistributedShell client timeout should be -1 by default. Contributed by Zhankun Tang.
(cherry picked from commit 6cec90653d)
2019-02-25 15:29:53 +05:30
Sunil G cb0f45b75b YARN-9213. RM Web UI v1 does not show custom resource allocations for containers page. Contributed by Szilard Nemeth.
(cherry picked from commit f282f9c362)
2019-02-25 11:39:05 +05:30
Weiwei Yang dab22e74a4 YARN-9316. TestPlacementConstraintsUtil#testInterAppConstraintsByAppID fails intermittently. Contributed by Prabhu Joseph.
(cherry picked from commit 9cd5c5447f)
2019-02-24 22:57:40 +08:00
bibinchundatt 6d1cf7b395 YARN-9317. Avoid repeated YarnConfiguration#timelineServiceV2Enabled check. Contributed by Prabhu Joseph 2019-02-23 08:03:28 +05:30
bibinchundatt 7db50ffceb YARN-8132. Final Status of applications shown as UNDEFINED in ATS app queries. Contributed by Prabhu Joseph 2019-02-22 20:52:40 +05:30
Sunil G d6377c8b68 YARN-9118. Handle exceptions with parsing user defined GPU devices in GpuDiscoverer. Contributed by Szilard Nemeth.
(cherry picked from commit 95fbbfed75)
2019-02-22 20:23:51 +05:30
Weiwei Yang 040d475030 YARN-9238. Avoid allocating opportunistic containers to previous/removed/non-exist application attempt. Contributed by lujie.
(cherry picked from commit 9c88695bcd)
2019-02-22 21:43:53 +08:00
Weiwei Yang 1ffa7f8349 YARN-9315. TestCapacitySchedulerMetrics fails intermittently. Contributed by Prabhu Joseph. 2019-02-21 18:16:22 +08:00
bibinchundatt 77c7e8492e YARN-9286. [Timeline Server] Sorting based on FinalStatus shows pop-up message. Contributed by Bilwa S T.
(cherry picked from commit b8de78c570)
2019-02-20 01:21:12 +05:30
Sunil G 2576aea729 YARN-7824. [UI2] Yarn Component Instance page should include link to container logs. Contributed by Akhil PB.
(cherry picked from commit a060e8cb51)
2019-02-17 20:20:40 +05:30
Adam Antal 511ffb5f70 YARN-9283. Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong property name as reference
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 9385ec45d7)
2019-02-15 18:50:42 +09:00
Sunil G 8551831ac2 YARN-8295. [UI2] Improve Resource Usage tab error message when there are no data available. Contributed by Charan Hebri.
(cherry picked from commit 5b55f3538c)
2019-02-15 12:43:26 +05:30
Akira Ajisaka 3b10c07946 YARN-9284. Fix the unit of yarn.service.am-resource.memory in the document. Contributed by Masahiro Tanaka.
(cherry picked from commit 3a39d9a2d2)
2019-02-15 15:44:02 +09:00
bibinchundatt 8b074e6be8 YARN-9295. [UI2] Fix label typo in Cluster Overview page. Contributed by Charan Hebri.
(cherry picked from commit b66d5ae9e2)
2019-02-14 23:11:59 +05:30
Sunil G ec08eed542 YARN-7761. [UI2] Clicking 'master container log' or 'Link' next to 'log' under application's appAttempt goes to Old UI's Log link. Contributed by Akhil PB.
(cherry picked from commit d321d0e747)
2019-02-14 20:56:52 +05:30
Weiwei Yang f3c1e456b0 YARN-9253. Add UT to verify Placement Constraint in Distributed Shell. Contributed by Prabhu Joseph.
(cherry picked from commit 711d22f166)
2019-02-12 16:26:38 +08:00
Giovanni Matteo Fumarola c6582cc04c YARN-9191. Add cli option in DS to support enforceExecutionType in resource requests. Contributed by Abhishek Modi.
(cherry picked from commit f738b397ae)
2019-02-12 14:21:57 +08:00
Eric Yang 102db40870 YARN-8761. Service AM support for decommissioning component instances.
Contributed by Billie Rinaldi

(cherry picked from commit 4c465f5535)
2019-02-08 08:38:43 -08:00
Masatake Iwasaki 11ebdaab48 YARN-9282. Typo in javadoc of class LinuxContainerExecutor: hadoop.security.authetication should be 'authentication'. Contributed by Charan Hebri.
(cherry picked from commit e0ab1bdece)
2019-02-09 00:29:58 +09:00
Sunil G 73956d5de9 YARN-9257. Distributed Shell client throws a NPE for a non-existent queue. Contributed by Charan Hebri.
(cherry picked from commit fbc08145cf)
2019-02-08 11:23:21 +05:30
Eric E Payne 834a862bd0 YARN-7171: RM UI should sort memory / cores numerically. Contributed by Ahmed Hussein
(cherry picked from commit d1ca9432dd)
2019-02-07 20:36:54 +00:00
Sunil G 6ffe6ea899 YARN-9206. RMServerUtils does not count SHUTDOWN as an accepted state. Contributed by Kuhu Shukla. 2019-02-07 19:08:41 +05:30
Wangda Tan a1e09b4c0c Make upstream aware of 3.1.2 release
Change-Id: I397bc6ef75498726df4763bd07a8bf8fe1c38365
(cherry picked from commit 308f3168fa)
(cherry picked from commit 649da5af04)
2019-02-05 14:09:33 -08:00
Weiwei Yang 41bdcf4110 YARN-9262. TestRMAppAttemptTransitions is failing with an NPE. Contributed by lujie.
(cherry picked from commit 28ad20a711)
2019-02-04 14:00:54 +05:30
Sunil G 3b03ff6fdd YARN-9099. GpuResourceAllocator#getReleasingGpus calculates number of GPUs in a wrong way. Contributed by Szilard Nemeth.
(cherry picked from commit 71c49fa60f)
2019-01-31 09:26:38 +05:30
Eric E Payne 0cb05a9fe3 YARN-6616: YARN AHS shows submitTime for jobs same as startTime. Contributed by Prabhu Joseph
(cherry picked from commit 04105bbfdb)
2019-01-29 18:04:01 +00:00
Weiwei Yang 4257043232 YARN-9237. NM should ignore sending finished apps to RM during RM fail-over. Contributed by Jiandan Yang.
(cherry picked from commit 4f63ffe444)
2019-01-29 11:03:26 +08:00
Eric Yang 29ccb8689f YARN-8901. Fixed restart policy NEVER/ON_FAILURE with component dependency.
Contributed by Suma Shivaprasad

(cherry picked from commit f5a95f7998)
2019-01-28 18:12:39 -05:00
Rohith Sharma K S 6e059c7930 Revert "YARN-8270 Adding JMX Metrics for Timeline Collector and Reader. Contributed by Sushil Ks."
This reverts commit 5b72aa04e1.
2019-01-28 10:55:12 +05:30
Jonathan Hung 6092d913b1 YARN-9222. Print launchTime in ApplicationSummary
(cherry picked from commit 6cace58e21)
(cherry picked from commit bf760e7e81)
2019-01-25 13:50:44 -08:00
Haibo Chen 61a6cc8d23 YARN-7088. Add application launch time to Resource Manager REST API. (Kanwaljeet Sachdev via Haibo Chen)
(cherry picked from commit bb92bfb4ef)
2019-01-24 15:58:37 -08:00
Sunil G 45c4df152f YARN-8961. [UI2] Flow Run End Time shows 'Invalid date'. Contributed by Akhil PB
(cherry picked from commit c726445990)
2019-01-24 15:02:53 +05:30
Weiwei Yang 2471d8a6e7 YARN-9205. When using custom resource type, application will fail to run due to the CapacityScheduler throws InvalidResourceRequestException(GREATER_THEN_MAX_ALLOCATION). Contributed by Zhankun Tang.
(cherry picked from commit bc6374f282)
2019-01-23 18:18:38 +08:00
Weiwei Yang b61754b1bd YARN-9210. RM nodes web page can not display node info. Contributed by Jiandan Yang.
(cherry picked from commit d43df31751)
2019-01-22 11:01:00 +08:00
Weiwei Yang 4edd883d48 YARN-9204. RM fails to start if absolute resource is specified for partition capacity in CS queues. Contributed by Jiandan Yang.
(cherry picked from commit abde1e1f58)
2019-01-21 21:27:40 +08:00
Wangda Tan a685ffe9a9 YARN-9194. Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM. (lujie via wangda)
Change-Id: I4359f59a73a278a941f4bb9d106dd38c9cb471fe
(cherry picked from commit 6d7eedfd28)
(cherry picked from commit fe7cb2d84a)
2019-01-17 15:17:34 -08:00
rahul3 8dea94d071 YARN-9203. Fix typos in yarn-default.xml.
This closes #437

Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 0a46baecd3)
2019-01-17 16:20:15 +09:00
Akira Ajisaka d07d275f94 YARN-8747. [UI2] YARN UI2 page loading failed due to js error under some time zone configuration. Contributed by collinma.
(cherry picked from commit 104ef5df36)
2019-01-16 14:38:29 +09:00
Weiwei Yang 91e9c9f96e YARN-9173. FairShare calculation broken for large values after YARN-8833. Contributed Wilfred Spiegelenburg. 2019-01-08 13:56:21 +08:00
Wangda Tan 3570713a69 YARN-8822. Nvidia-docker v2 support for YARN GPU feature. (Charo Zhang via wangda)
Change-Id: I416268888a7b6f097d218d84e8497dd70b4b6d8f
2019-01-07 12:30:30 -08:00
Eric Yang a446949f56 HADOOP-16031. Fixed TestSecureLogins unit test. Contributed by Akira Ajisaka
(cherry picked from commit bba76b6f31)
2019-01-07 13:24:50 -05:00
Wangda Tan 31ea2f7806 Preparing for 3.1.3 development
Change-Id: I3c3d3ee47dc4fef239127b4452ff14676fa26e3d
2019-01-07 10:04:58 -08:00
Weiwei Yang d6464629ca YARN-9164. Shutdown NM may cause NPE when opportunistic container scheduling is enabled. Contributed by lujie.
(cherry picked from commit cfe89e6f96)
2019-01-04 01:37:47 +08:00