5122 Commits

Author SHA1 Message Date
Eric Badger a995e6352f YARN-9442. container working directory has group read permissions. Contributed by Jim Brennan.
(cherry picked from commit 2ac029b949)

Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c

(cherry picked from commit cec71691be)
2019-08-13 17:16:57 +00:00
Szilard Nemeth cb91ab73b0 YARN-9135. NM State store ResourceMappings serialization are tested with Strings instead of real Device objects. Contributed by Peter Bacsko
(cherry picked from commit 8b3c6791b1)
2019-08-13 15:47:57 +02:00
Szilard Nemeth a762a6be29 Revert "YARN-9135. NM State store ResourceMappings serialization are tested with Strings instead of real Device objects. Contributed by Peter Bacsko"
This reverts commit b20fd9e212.
Commit is reverted since unnecessary files were added, accidentally.
2019-08-13 15:47:57 +02:00
Szilard Nemeth 9da9b6d58e YARN-9723. ApplicationPlacementContext is not required for terminated jobs during recovery. Contributed by Prabhu Joseph
(cherry picked from commit e4b538bbda)
2019-08-12 15:16:49 +02:00
Szilard Nemeth 148121d889 YARN-9451. AggregatedLogsBlock shows wrong NM http port. Contributed by Prabhu Joseph
(cherry picked from commit b91099efd6)
2019-08-12 15:06:48 +02:00
Szilard Nemeth 6b4ded7647 YARN-9135. NM State store ResourceMappings serialization are tested with Strings instead of real Device objects. Contributed by Peter Bacsko 2019-08-12 14:03:50 +02:00
Sunil G 58ad5ad493 YARN-9729. [UI2] Fix error message for logs when ATSv2 is offline. Contributed by Zoltan Siegl.
(cherry picked from commit 1c5b28659f)
2019-08-11 11:49:57 +05:30
Szilard Nemeth be9ac8adf9 Logging fileSize of log files under NM Local Dir. Contributed by Prabhu Joseph
(cherry picked from commit 54ac80176e)
2019-08-09 13:23:49 +02:00
Sunil G 1c4364cb9a YARN-9715. [UI2] yarn-container-log URI need to be encoded to avoid potential misuses. Contributed by Akhil PB.
(cherry picked from commit acffec7a92)
2019-08-09 16:06:07 +05:30
Adam Antal 600a61f410 YARN-9124. Resolve contradiction in ResourceUtils: addMandatoryResources / checkMandatoryResources work differently (#1121)
(cherry picked from commit cbcada804d)
2019-08-09 11:44:22 +02:00
Szilard Nemeth 410f7a3069 YARN-9092. Create an object for cgroups mount enable and cgroups mount path as they belong together. Contributed by Gergely Pollak
(cherry picked from commit e0c21c6da9)
2019-08-09 10:25:12 +02:00
Szilard Nemeth b2f39f81fe YARN-9096: Some GpuResourcePlugin and ResourcePluginManager methods are synchronized unnecessarily. Contributed by Gergely Pollak
(cherry picked from commit 742e30b473)
2019-08-09 10:05:40 +02:00
Szilard Nemeth 943dfc78d1 YARN-9094: Remove unused interface method: NodeResourceUpdaterPlugin#handleUpdatedResourceFromRM. Contributed by Gergely Pollak
(cherry picked from commit 72d7e570a7)
2019-08-09 09:53:14 +02:00
Eric E Payne b131214685 YARN-9685: NPE when rendering the info table of leaf queue in non-accessible partitions. Contributed by Tao Yang.
(cherry picked from commit 3b38f2019e)
2019-08-08 13:08:05 +00:00
Haibo Chen f943bff254 YARN-9559. Create AbstractContainersLauncher for pluggable ContainersLauncher logic. (Contributed by Jonathan Hung)
(cherry picked from commit f51702d539)
(cherry picked from commit 8d357343c4)
2019-08-06 15:01:06 -07:00
Eric Badger 698e74d097 YARN-8045. Reduce log output from container status calls. Contributed by Craig Condit
(cherry picked from commit 144a55f0e3)
2019-08-02 20:41:26 +00:00
Eric E Payne 36af8845de YARN-9596: QueueMetrics has incorrect metrics when labelled partitions are involved. Contributed by Muhammad Samir Khan.
(cherry picked from commit 42683aef1a)
2019-07-30 19:45:00 +00:00
Jonathan Hung 3ff2148482 YARN-9668. UGI conf doesn't read user overridden configurations on RM and NM startup. (Contributed by Jonanthan Hung) 2019-07-22 10:54:08 -07:00
Weiwei Yang 48192531ad YARN-9682. Wrong log message when finalizing the upgrade. Contributed by kyungwan nam.
(cherry picked from commit 85d9111a88)
2019-07-17 11:08:21 +08:00
Szilard Nemeth 30c7b43227 YARN-9127. Create more tests to verify GpuDeviceInformationParser. Contributed by Peter Bacsko
(cherry picked from commit 18ee1092b4)
2019-07-15 12:15:36 +02:00
Szilard Nemeth bb37c6cb7f YARN-9337. Addendum to fix compilation error due to mockito spy call 2019-07-13 00:42:14 +02:00
Erik Krogen 07a6510e6a HDFS-13286. [SBN read] Add haadmin commands to transition between standby and observer. Contributed by Chao Sun. 2019-07-12 11:03:31 -07:00
Szilard Nemeth 773591ee42 YARN-9626. UI2 - Fair scheduler queue apps page issues. Contributed by Zoltan Siegl
(cherry picked from commit 557056e18e)
2019-07-12 17:41:23 +02:00
Szilard Nemeth 531e0c0bc1 YARN-9337. GPU auto-discovery script runs even when the resource is given by hand. Contributed by Adam Antal
(cherry picked from commit 61b0c2bb7c)
2019-07-12 17:30:50 +02:00
Szilard Nemeth 43c89d1e2b YARN-9235. If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown. Contributed by Antal Balint Steinbach, Adam Antal
(cherry picked from commit c416284bb7)
2019-07-12 17:07:25 +02:00
Szilard Nemeth 872a039bac YARN-9625. UI2 - No link to a queue on the Queues page for Fair Scheduler. Contributed by Zoltan Siegl
(cherry picked from commit 9cec023186)
2019-07-11 20:02:19 +02:00
Szilard Nemeth d590745046 YARN-9573. DistributedShell cannot specify LogAggregationContext. Contributed by Adam Antal. 2019-07-11 19:54:31 +02:00
bibinchundatt 5effeae1f3 YARN-9557. Application fails in diskchecker when ReadWriteDiskValidator is configured. Contributed by Bilwa S T.
(cherry picked from commit 5f8395f393)
2019-07-10 14:47:29 +05:30
Sunil G 9eb96b0fbf YARN-9644. First RMContext object is always leaked during switch over. Contributed by Bibin A Chundatt.
(cherry picked from commit d18986e4e8)
2019-07-04 11:06:41 +05:30
Szilard Nemeth 46177ade8b YARN-9629. Support configurable MIN_LOG_ROLLING_INTERVAL. Contributed by Adam Antal.
(cherry picked from commit a2a8be18cb)
2019-07-03 14:24:53 +02:00
Sunil G d2a5749482 YARN-9327. Improve synchronisation in ProtoUtils#convertToProtoFormat block. Contributed by Bibin A Chundatt.
(cherry picked from commit 0c8813f135)
2019-07-02 12:15:40 +05:30
Weiwei Yang 46b81a982b YARN-9655. AllocateResponse in FederationInterceptor lost applicationPriority. Contributed by hunshenshi.
(cherry picked from commit 570eee30e5)
2019-07-02 10:17:56 +08:00
bibinchundatt 4f622ecad8 YARN-9639. DecommissioningNodesWatcher cause memory leak. Contributed by Bilwa S T.
(cherry picked from commit be80334cdf)
2019-06-27 10:11:30 +05:30
Zhankun Tang 829202740a YARN-9584. Should put initializeProcessTrees method call before get pid. Contributed by Wanqiang Ji.
(cherry picked from commit 67414a1a80)
2019-06-18 13:20:07 +08:00
Weiwei Yang 56a4935048 YARN-9621. Fix TestDSWithMultipleNodeManager.testDistributedShellWithPlacementConstraint on branch-3.1. Contributed by Prabhu Joseph. 2019-06-17 17:22:49 +08:00
Sean Mackrory fee1e67453 HADOOP-16213. Update guava to 27.0-jre. Contributed by Gabor Bota. 2019-06-13 07:38:43 -06:00
Sunil G c343554e2a YARN-9543. [UI2] Handle ATSv2 server down or failures cases gracefully in YARN UI v2. Contributed by Zoltan Siegl and Akhil P B.
(cherry picked from commit 52128e352a)
2019-06-12 19:28:43 +05:30
Sunil G bc028d3ebb YARN-9545. Create healthcheck REST endpoint for ATSv2. Contributed by Zoltan Siegl.
(cherry picked from commit 72203f7a12)
2019-06-12 19:28:10 +05:30
Sunil G 1bb9e9a4f2 Revert "YARN-9545. Create healthcheck REST endpoint for ATSv2. Contributed by Zoltan Siegl."
This reverts commit d65371c4e8.
2019-06-12 19:27:21 +05:30
bibinchundatt f42e246f8a YARN-9547. ContainerStatusPBImpl default execution type is not returned. Contributed by Bilwa S T.
(cherry picked from commit 3303723f55)
2019-06-11 23:43:54 +05:30
bibinchundatt d386f595f9 YARN-9565. RMAppImpl#ranNodes not cleared on FinalTransition. Contributed by Bilwa S T.
(cherry picked from commit 60c95e9b6a)
2019-06-11 23:15:02 +05:30
bibinchundatt 4a39165b41 YARN-9594. Fix missing break statement in ContainerScheduler#handle. Contributed by lujie.
(cherry picked from commit 6d80b9bc3f)

 Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java
2019-06-11 23:05:06 +05:30
Sunil G d65371c4e8 YARN-9545. Create healthcheck REST endpoint for ATSv2. Contributed by Zoltan Siegl.
(cherry picked from commit f1d3a17d3e)
2019-06-06 06:25:02 +05:30
Sunil G 6be665cfc6 YARN-8906. [UI2] NM hostnames not displayed correctly in Node Heatmap Chart. Contributed by Akhil PB.
(cherry picked from commit 59719dc560)
2019-06-03 15:54:33 +05:30
Sunil G 894ef5c07b YARN-8947. [UI2] Active User info missing from UI2. Contributed by Akhil PB.
(cherry picked from commit 7f46dda513)
2019-06-03 12:25:40 +05:30
Weiwei Yang 23f9508a89 YARN-9507. Fix NPE in NodeManager#serviceStop on startup failure. Contributed by Bilwa S T.
(cherry picked from commit 4530f4500d)
2019-06-03 14:26:16 +08:00
Eric Yang 413a6b63bc YARN-9542. Fix LogsCLI guessAppOwner ignores custome file format suffix.
Contributed by Prabhu Joseph

(cherry picked from commit b2a39e8883)
2019-05-29 18:05:47 -04:00
Eric E Payne 9c3ab58aa7 YARN-8625. Aggregate Resource Allocation for each job is not present in ATS. Contributed by Prabhu Joseph.
(cherry picked from commit 3c63551101)
2019-05-29 19:08:27 +00:00
Ahmed Hussein f2202f7990 YARN-9563. Resource report REST API could return NaN or Inf (Ahmed Hussein via jeagles)
Signed-off-by: Jonathan Eagles <jeagles@gmail.com>
(cherry picked from commit abf76ac371)
2019-05-29 12:47:27 -05:00
Takanobu Asanuma 8098ddaf40 HADOOP-16331. Fix ASF License check in pom.xml. Contributed by Akira Ajisaka.
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2019-05-29 17:39:49 +09:00
Akira Ajisaka f8bd5deec1 HADOOP-16323. https everywhere in Maven settings. 2019-05-27 15:28:21 +09:00
Sunil G f09befd2ea YARN-9519. TFile log aggregation file format is not working for yarn.log-aggregation.TFile.remote-app-log-dir config. Contributed by Adam Antal.
(cherry picked from commit 7d831eca64)
2019-05-14 10:50:02 -07:00
Sunil G 8b306e34e0 YARN-9504. [UI2] Fair scheduler queue view page does not show actual capacity. Contributed by Zoltan Siegl.
(cherry picked from commit 64c7f36ab1)
2019-05-10 14:28:08 +05:30
Eric Yang bf013aa06e YARN-8622. Fixed container-executor compilation on MacOSX.
Contributed by Siyao Meng

(cherry picked from commit ef97a20831)
2019-05-09 14:55:38 -04:00
Haibo Chen ea1f0f282b YARN-9529. Log correct cpu controller path on error while initializing CGroups. (Contributed by Jonathan Hung)
(cherry picked from commit 597fa47ad1)
(cherry picked from commit c6573562cb)
2019-05-06 11:59:20 -07:00
Eric E Payne 41ffaea342 YARN-9285: RM UI progress column is of wrong type. Contributed by Ahmed Hussein.
(cherry picked from commit b094b94d43)
2019-05-02 19:57:44 +00:00
Weiwei Yang 94a895b94f YARN-9307. node_partitions constraint does not work. Contributed by kyungwan nam. 2019-04-26 13:16:43 +08:00
Weiwei Yang d242b166ed YARN-9325. TestQueueManagementDynamicEditPolicy fails intermittent. Contributed by Prabhu Joseph.
(cherry picked from commit 1c8046d67e)
2019-04-23 14:25:33 +08:00
Eric Yang 8b228a42e9 YARN-8587. Added retries for fetching docker exit code.
Contributed by Charo Zhang

(cherry picked from commit c16c49b8c3)
2019-04-19 15:40:56 -04:00
Eric Yang 68a98be8a2 YARN-6695. Fixed NPE in publishing appFinished events to ATSv2.
Contributed by Prabhu Joseph

(cherry picked from commit df76cdc895)
2019-04-18 12:31:34 -04:00
Weiwei Yang c37065eae9 YARN-9463. Add queueName info when failing with queue capacity sanity check. Contributed by Aihua Xu.
(cherry picked from commit 8c1bba375b)
2019-04-10 23:04:27 +08:00
Weiwei Yang bd0c9bc160 YARN-9413. Queue resource leak after app fail for CapacityScheduler. Contributed by Tao Yang.
(cherry picked from commit ec143cbf67)
2019-04-06 20:38:06 +08:00
Eric Yang dbc02bcda7 YARN-9391. Fixed node manager environment leaks into Docker containers.
Contributed by Jim Brennan

(cherry picked from commit 3c45762a0b)
2019-03-25 15:55:46 -04:00
Sunil G 6941033396 YARN-8803. [UI2] Show flow runs in the order of recently created time in graph widgets. Contributed by Akhil PB.
(cherry picked from commit c79f139519)
2019-03-06 16:50:19 +05:30
Sunil G 379a9bfd9a YARN-9138. Improve test coverage for nvidia-smi binary execution of GpuDiscoverer. Contributed by Szilard Nemeth.
(cherry picked from commit 46045c5cb3)
2019-03-06 16:02:39 +05:30
bibinchundatt e663a6af89 Revert "YARN-8132. Final Status of applications shown as UNDEFINED in ATS app queries. Contributed by Prabhu Joseph"
This reverts commit 7db50ffceb.
2019-03-04 17:03:45 +05:30
Sunil G 80d507d1a4 YARN-9139. Simplify initializer code of GpuDiscoverer. Contributed by Szilard Nemeth. 2019-03-01 19:28:33 +05:30
Eric Yang 72eacb3e10 YARN-9334. Allow YARN Service client to send SPNEGO challenge header when authentication type is not simple.
Contributed by Billie Rinaldi

(cherry picked from commit 04b228e43b)
2019-02-28 10:12:09 -08:00
Sunil G 817028364a YARN-9121. Replace GpuDiscoverer.getInstance() to a readable object for easy access control. Contributed by Szilard Nemeth. 2019-02-27 17:46:43 +05:30
Weiwei Yang 10d4a9a7fb YARN-9248. RMContainerImpl:Invalid event: ACQUIRED at KILLED. Contributed by lujie.
(cherry picked from commit 8c30114b00)
2019-02-27 17:39:37 +08:00
Sunil G 51b010b19f YARN-9087. Improve logging for initialization of Resource plugins. Contributed by Szilard Nemeth. 2019-02-27 11:57:32 +05:30
Sunil G 84928ba3d1 YARN-9168. DistributedShell client timeout should be -1 by default. Contributed by Zhankun Tang.
(cherry picked from commit 6cec90653d)
2019-02-25 15:29:53 +05:30
Sunil G cb0f45b75b YARN-9213. RM Web UI v1 does not show custom resource allocations for containers page. Contributed by Szilard Nemeth.
(cherry picked from commit f282f9c362)
2019-02-25 11:39:05 +05:30
Weiwei Yang dab22e74a4 YARN-9316. TestPlacementConstraintsUtil#testInterAppConstraintsByAppID fails intermittently. Contributed by Prabhu Joseph.
(cherry picked from commit 9cd5c5447f)
2019-02-24 22:57:40 +08:00
bibinchundatt 6d1cf7b395 YARN-9317. Avoid repeated YarnConfiguration#timelineServiceV2Enabled check. Contributed by Prabhu Joseph 2019-02-23 08:03:28 +05:30
bibinchundatt 7db50ffceb YARN-8132. Final Status of applications shown as UNDEFINED in ATS app queries. Contributed by Prabhu Joseph 2019-02-22 20:52:40 +05:30
Sunil G d6377c8b68 YARN-9118. Handle exceptions with parsing user defined GPU devices in GpuDiscoverer. Contributed by Szilard Nemeth.
(cherry picked from commit 95fbbfed75)
2019-02-22 20:23:51 +05:30
Weiwei Yang 040d475030 YARN-9238. Avoid allocating opportunistic containers to previous/removed/non-exist application attempt. Contributed by lujie.
(cherry picked from commit 9c88695bcd)
2019-02-22 21:43:53 +08:00
Weiwei Yang 1ffa7f8349 YARN-9315. TestCapacitySchedulerMetrics fails intermittently. Contributed by Prabhu Joseph. 2019-02-21 18:16:22 +08:00
bibinchundatt 77c7e8492e YARN-9286. [Timeline Server] Sorting based on FinalStatus shows pop-up message. Contributed by Bilwa S T.
(cherry picked from commit b8de78c570)
2019-02-20 01:21:12 +05:30
Sunil G 2576aea729 YARN-7824. [UI2] Yarn Component Instance page should include link to container logs. Contributed by Akhil PB.
(cherry picked from commit a060e8cb51)
2019-02-17 20:20:40 +05:30
Adam Antal 511ffb5f70 YARN-9283. Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong property name as reference
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 9385ec45d7)
2019-02-15 18:50:42 +09:00
Sunil G 8551831ac2 YARN-8295. [UI2] Improve Resource Usage tab error message when there are no data available. Contributed by Charan Hebri.
(cherry picked from commit 5b55f3538c)
2019-02-15 12:43:26 +05:30
Akira Ajisaka 3b10c07946 YARN-9284. Fix the unit of yarn.service.am-resource.memory in the document. Contributed by Masahiro Tanaka.
(cherry picked from commit 3a39d9a2d2)
2019-02-15 15:44:02 +09:00
bibinchundatt 8b074e6be8 YARN-9295. [UI2] Fix label typo in Cluster Overview page. Contributed by Charan Hebri.
(cherry picked from commit b66d5ae9e2)
2019-02-14 23:11:59 +05:30
Sunil G ec08eed542 YARN-7761. [UI2] Clicking 'master container log' or 'Link' next to 'log' under application's appAttempt goes to Old UI's Log link. Contributed by Akhil PB.
(cherry picked from commit d321d0e747)
2019-02-14 20:56:52 +05:30
Weiwei Yang f3c1e456b0 YARN-9253. Add UT to verify Placement Constraint in Distributed Shell. Contributed by Prabhu Joseph.
(cherry picked from commit 711d22f166)
2019-02-12 16:26:38 +08:00
Giovanni Matteo Fumarola c6582cc04c YARN-9191. Add cli option in DS to support enforceExecutionType in resource requests. Contributed by Abhishek Modi.
(cherry picked from commit f738b397ae)
2019-02-12 14:21:57 +08:00
Eric Yang 102db40870 YARN-8761. Service AM support for decommissioning component instances.
Contributed by Billie Rinaldi

(cherry picked from commit 4c465f5535)
2019-02-08 08:38:43 -08:00
Masatake Iwasaki 11ebdaab48 YARN-9282. Typo in javadoc of class LinuxContainerExecutor: hadoop.security.authetication should be 'authentication'. Contributed by Charan Hebri.
(cherry picked from commit e0ab1bdece)
2019-02-09 00:29:58 +09:00
Sunil G 73956d5de9 YARN-9257. Distributed Shell client throws a NPE for a non-existent queue. Contributed by Charan Hebri.
(cherry picked from commit fbc08145cf)
2019-02-08 11:23:21 +05:30
Eric E Payne 834a862bd0 YARN-7171: RM UI should sort memory / cores numerically. Contributed by Ahmed Hussein
(cherry picked from commit d1ca9432dd)
2019-02-07 20:36:54 +00:00
Sunil G 6ffe6ea899 YARN-9206. RMServerUtils does not count SHUTDOWN as an accepted state. Contributed by Kuhu Shukla. 2019-02-07 19:08:41 +05:30
Wangda Tan a1e09b4c0c Make upstream aware of 3.1.2 release
Change-Id: I397bc6ef75498726df4763bd07a8bf8fe1c38365
(cherry picked from commit 308f3168fa)
(cherry picked from commit 649da5af04)
2019-02-05 14:09:33 -08:00
Weiwei Yang 41bdcf4110 YARN-9262. TestRMAppAttemptTransitions is failing with an NPE. Contributed by lujie.
(cherry picked from commit 28ad20a711)
2019-02-04 14:00:54 +05:30
Sunil G 3b03ff6fdd YARN-9099. GpuResourceAllocator#getReleasingGpus calculates number of GPUs in a wrong way. Contributed by Szilard Nemeth.
(cherry picked from commit 71c49fa60f)
2019-01-31 09:26:38 +05:30
Eric E Payne 0cb05a9fe3 YARN-6616: YARN AHS shows submitTime for jobs same as startTime. Contributed by Prabhu Joseph
(cherry picked from commit 04105bbfdb)
2019-01-29 18:04:01 +00:00
Weiwei Yang 4257043232 YARN-9237. NM should ignore sending finished apps to RM during RM fail-over. Contributed by Jiandan Yang.
(cherry picked from commit 4f63ffe444)
2019-01-29 11:03:26 +08:00
Eric Yang 29ccb8689f YARN-8901. Fixed restart policy NEVER/ON_FAILURE with component dependency.
Contributed by Suma Shivaprasad

(cherry picked from commit f5a95f7998)
2019-01-28 18:12:39 -05:00
Rohith Sharma K S 6e059c7930 Revert "YARN-8270 Adding JMX Metrics for Timeline Collector and Reader. Contributed by Sushil Ks."
This reverts commit 5b72aa04e1.
2019-01-28 10:55:12 +05:30