5357 Commits

Author SHA1 Message Date
Rohith Sharma K S 4d9c5300e2 YARN-8567. Fetching yarn logs fails for long running application if it is not present in timeline store. Contributed by Tarun Parimi. 2019-09-05 18:16:35 +05:30
Eric Yang b87a727ff4 YARN-9374. Improve Timeline service resilience when HBase is unavailable.
Contributed by Prabhu Joseph and Szilard Nemeth
2019-09-05 16:32:18 +05:30
Eric Yang 02779cdc3a YARN-8499 ATSv2 Generalize TimelineStorageMonitor.
Contributed by Prabhu Joseph
2019-09-05 16:31:35 +05:30
Eric Yang 6110af2d1d YARN-7537. Add ability to load hbase config from distributed file system.
Contributed by Prabhu Joseph
2019-09-05 16:28:04 +05:30
Vrushali C bcacb57114 YARN-9335 [atsv2] Restrict the number of elements held in timeline collector when backend is unreachable for async calls. Contributed by Abhishesk Modi. 2019-09-05 16:27:39 +05:30
Vrushali C 6acc1a2bd0 YARN-9382 Publish container killed, paused and resumed events to ATSv2. Contributed by Abhishesk Modi. 2019-09-05 15:39:38 +05:30
Vrushali C f52a88fdc8 YARN-9303 Username splits won't help timelineservice.app_flow table. Contributed by Prabhu Joseph. 2019-09-05 15:39:38 +05:30
Giovanni Matteo Fumarola 998aa3de2c YARN-9418. ATSV2 /apps//entities/YARN_CONTAINER rest api does not show metrics. Contributed by Prabhu Joseph. 2019-09-05 15:39:38 +05:30
Rohith Sharma K S 8de93fca3c YARN-9389. FlowActivity and FlowRun table prefix is wrong. Contributed by Prabhu Joseph. 2019-09-05 15:39:38 +05:30
Rohith Sharma K S 0ccc5a2695 YARN-9387. Update document for ATS HBase Custom tablenames (-entityTableName). Contributed by Prabhu Joseph. 2019-09-05 15:39:38 +05:30
Vrushali C d451ff7534 YARN-3841 [atsv2 Storage implementation] Adding retry semantics to HDFS backing storage. Contributed by Abhishek Modi. 2019-09-05 15:39:38 +05:30
Vrushali C 66e1599761 YARN-3879 [Storage implementation] Create HDFS backing storage implementation for ATS reads. Contributed by Abhishek Modi. 2019-09-05 15:39:38 +05:30
Tao Yang 6f9764076a YARN-8995. Log events info in AsyncDispatcher when event queue size cumulatively reaches a certain number every time. Contributed by zhuqi. 2019-09-05 16:53:16 +08:00
Vrushali C 84a9c3f999 YARN-5336 Limit the flow name size & consider cleanup for hex chars. Contributed by Sushil Ks 2019-09-05 12:43:02 +05:30
Rohith Sharma K S 108c569e3b YARN-6735. Have a way to turn off container metrics from NMs. Contributed by Abhishek Modi. 2019-09-05 12:42:06 +05:30
Rohith Sharma K S 5345508fa3 YARN-6149. Allow port range to be specified while starting NM Timeline collector manager. Contributed by Abhishek Modi. 2019-09-05 12:38:37 +05:30
Suma Shivaprasad 0a6f90d4fc YARN-9034. ApplicationCLI should have option to take clusterId. Contributed by Rohith Sharma K S. 2019-09-05 12:38:07 +05:30
Rohith Sharma K S 4a4a892d32 YARN-7754. [Atsv2] Update document for running v1 and v2 TS. Contributed by Suma Shivaprasad. 2019-09-05 12:31:37 +05:30
Rohith Sharma K S a3496a368b YARN-8871. Document ATSv2 integrated LogWebService. Contributed by Suma Shivaprasad. 2019-09-05 12:30:50 +05:30
Rohith Sharma K S 252afdc8e6 YARN-9804. Update ATSv2 document for latest feature supports. 2019-09-05 09:00:22 +05:30
Zhankun Tang 269aa7ebfe YARN-9785. Fix DominantResourceCalculator when one resource is zero. Contributed by Bibin A Chundatt, Sunil Govindan, Bilwa S T.
(cherry picked from commit bb26514ba9)
2019-09-03 15:02:15 +08:00
bibinchundatt 1e6095f16b YARN-9797. LeafQueue#activateApplications should use resourceCalculator#fitsIn. Contributed by Bilwa S T.
(cherry picked from commit 03489124ea)
2019-09-03 11:55:13 +05:30
Akira Ajisaka a453f38015 YARN-9162. Fix TestRMAdminCLI#testHelp. Contributed by Ayush Saxena.
(cherry picked from commit 5db7c49062)
2019-08-30 17:52:07 -07:00
Rohith Sharma K S 2fc4123fe0 YARN-9714. ZooKeeper connection in ZKRMStateStore leaks after RM transitioned to standby. Contributed by Tao Yang. 2019-08-30 10:36:23 +05:30
Rohith Sharma K S 7616495fb7 YARN-9796. Fix ASF license issue in branch-3.2. Contributed by Prabhu Joseph. 2019-08-29 12:01:38 +05:30
Rohith Sharma K S 81c0809463 YARN-9640. Slow event processing could cause too many attempt unregister events. Contributed by Bibin A Chundatt. 2019-08-29 09:30:20 +05:30
Eric E Payne d562050cec YARN-9756: Create metric that sums total memory/vcores preempted per round. Contributed by Manikandan R (manirajv06). 2019-08-28 20:53:43 +00:00
Jonathan Hung f36ccf0ac1 YARN-9438. launchTime not written to state store for running applications
(cherry picked from commit 9568656cd21d9c02168e18ce35c6726077bbf3a1)
2019-08-27 15:54:22 -07:00
Akira Ajisaka 2d8799f4bc HADOOP-15832. Upgrade BouncyCastle to 1.60. Contributed by Robert Kanter. 2019-08-27 19:08:39 +00:00
Jonathan Hung e4249c3202 YARN-9775. RMWebServices /scheduler-conf GET returns all hadoop configurations for ZKConfigurationStore. Contributed by Prabhu Joseph
(cherry picked from commit 8660e48ca1)
2019-08-26 15:51:38 -07:00
bibinchundatt 7f20c31e31 YARN-9642. Fix Memory Leak in AbstractYarnScheduler caused by timer. Contributed by Bibin A Chundatt.
(cherry picked from commit d3ce53e507)
2019-08-26 23:23:49 +05:30
Rohith Sharma K S ab98f91638 YARN-8917. Absolute (maximum) capacity of level3+ queues is wrongly calculated for absolute resource. Contributed by Tao Yang. 2019-08-26 21:13:02 +05:30
Szilard Nemeth 6980f1740f YARN-9217. Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing. Contributed by Peter Bacsko 2019-08-21 16:49:34 +02:00
Szilard Nemeth a83718f130 YARN-9100. Add tests for GpuResourceAllocator and do minor code cleanup. Contributed by Peter Bacsko 2019-08-16 15:24:44 +02:00
Szilard Nemeth df616370f0 YARN-8586. Extract log aggregation related fields and methods from RMAppImpl. Contributed by Peter Bacsko 2019-08-16 11:52:51 +02:00
Szilard Nemeth 8fee3808c5 YARN-9749. TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk. Contributed by Adam Antal
(cherry picked from commit 2a05e0ff3b)
2019-08-16 08:52:34 +02:00
Szilard Nemeth e616037d1f YARN-9488. Skip YARNFeatureNotEnabledException from ClientRMService. Contributed by Prabhu Joseph
(cherry picked from commit 1845a83cec)
2019-08-15 17:16:06 +02:00
Adam Antal d5446b3a23 YARN-9676. Add DEBUG and TRACE level messages to AppLogAggregatorImpl… (#1261)
* YARN-9676. Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected classes

* Using {} placeholder, and increasing loglevel if log aggregation failed.

(cherry picked from commit c89bdfacc8)
2019-08-14 17:36:41 +02:00
Szilard Nemeth 4bb238c480 YARN-9133. Make tests more easy to comprehend in TestGpuResourceHandler. Contributed by Peter Bacsko 2019-08-14 17:16:54 +02:00
Szilard Nemeth 4dc477b606 YARN-9140. Code cleanup in ResourcePluginManager.initialize and in TestResourcePluginManager. Contributed by Peter Bacsko 2019-08-14 17:01:41 +02:00
Szilard Nemeth 9a87e74e54 YARN-9134. No test coverage for redefining FPGA / GPU resource types in TestResourceUtils. Contributed by Peter Bacsko 2019-08-14 16:46:34 +02:00
Eric Badger cec71691be YARN-9442. container working directory has group read permissions. Contributed by Jim Brennan.
(cherry picked from commit 2ac029b949)

Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c
2019-08-13 16:34:29 +00:00
Szilard Nemeth c5aea8ca56 YARN-9723. ApplicationPlacementContext is not required for terminated jobs during recovery. Contributed by Prabhu Joseph
(cherry picked from commit e4b538bbda)
2019-08-12 15:16:18 +02:00
Szilard Nemeth 844259203f YARN-9451. AggregatedLogsBlock shows wrong NM http port. Contributed by Prabhu Joseph
(cherry picked from commit b91099efd6)
2019-08-12 15:06:16 +02:00
Szilard Nemeth b20fd9e212 YARN-9135. NM State store ResourceMappings serialization are tested with Strings instead of real Device objects. Contributed by Peter Bacsko 2019-08-12 14:02:17 +02:00
Sunil G 02b4635ff0 YARN-9729. [UI2] Fix error message for logs when ATSv2 is offline. Contributed by Zoltan Siegl.
(cherry picked from commit 1c5b28659f)
2019-08-11 11:49:25 +05:30
Szilard Nemeth 2e6beb1550 Logging fileSize of log files under NM Local Dir. Contributed by Prabhu Joseph
(cherry picked from commit 54ac80176e)
2019-08-09 13:20:10 +02:00
Sunil G 9fb6c6e2a1 YARN-9715. [UI2] yarn-container-log URI need to be encoded to avoid potential misuses. Contributed by Akhil PB.
(cherry picked from commit acffec7a92)
2019-08-09 16:07:04 +05:30
Szilard Nemeth 3e9071207a SUBMARINE-57. Add more elaborate message if submarine command is not recognized. Contributed by Adam Antal
(cherry picked from commit e5f4cd0fda)
2019-08-09 12:14:49 +02:00
Adam Antal 4c4f7d9c80 YARN-9124. Resolve contradiction in ResourceUtils: addMandatoryResources / checkMandatoryResources work differently (#1121)
(cherry picked from commit cbcada804d)
2019-08-09 11:43:30 +02:00
Szilard Nemeth 02d0e54596 YARN-9092. Create an object for cgroups mount enable and cgroups mount path as they belong together. Contributed by Gergely Pollak
(cherry picked from commit e0c21c6da9)
2019-08-09 10:23:10 +02:00
Szilard Nemeth f0dfb8b832 YARN-9096: Some GpuResourcePlugin and ResourcePluginManager methods are synchronized unnecessarily. Contributed by Gergely Pollak
(cherry picked from commit 742e30b473)
2019-08-09 10:02:35 +02:00
Szilard Nemeth 3bcf44f070 YARN-9094: Remove unused interface method: NodeResourceUpdaterPlugin#handleUpdatedResourceFromRM. Contributed by Gergely Pollak
(cherry picked from commit 72d7e570a7)
2019-08-09 09:50:32 +02:00
Eric E Payne e47c483d9f YARN-9685: NPE when rendering the info table of leaf queue in non-accessible partitions. Contributed by Tao Yang.
(cherry picked from commit 3b38f2019e)
2019-08-08 12:54:31 +00:00
Haibo Chen 8d357343c4 YARN-9559. Create AbstractContainersLauncher for pluggable ContainersLauncher logic. (Contributed by Jonathan Hung)
(cherry picked from commit f51702d539)
2019-08-06 14:59:49 -07:00
Eric E Payne 168dc3f258 YARN-9596: QueueMetrics has incorrect metrics when labelled partitions are involved. Contributed by Muhammad Samir Khan.
(cherry picked from commit 42683aef1a)
2019-07-30 19:19:33 +00:00
Jonathan Hung 15344006bc YARN-9668. UGI conf doesn't read user overridden configurations on RM and NM startup. (Contributed by Jonanthan Hung) 2019-07-22 10:46:45 -07:00
Weiwei Yang bf3d9f6282 YARN-9682. Wrong log message when finalizing the upgrade. Contributed by kyungwan nam.
(cherry picked from commit 85d9111a88)
2019-07-17 10:47:25 +08:00
bibinchundatt 4866735cde YARN-9645. Fix Invalid event FINISHED_CONTAINERS_PULLED_BY_AM at NEW on NM restart. Contributed by Bilwa S T.
(cherry picked from commit 7a93be0f60)
2019-07-16 14:06:36 +05:30
Szilard Nemeth 7c9cfc0996 YARN-9326. Fair Scheduler configuration defaults are not documented in case of min and maxResources. Contributed by Adam Antal
(cherry picked from commit 5446308360)
2019-07-15 13:30:58 +02:00
Szilard Nemeth 28d6a453a9 YARN-9127. Create more tests to verify GpuDeviceInformationParser. Contributed by Peter Bacsko
(cherry picked from commit 18ee1092b4)
2019-07-15 12:02:39 +02:00
Szilard Nemeth 2fcbdf4131 YARN-9337. Addendum to fix compilation error due to mockito spy call
(cherry picked from commit bb37c6cb7f)
2019-07-13 00:45:38 +02:00
Szilard Nemeth 4fa0de9f04 YARN-9626. UI2 - Fair scheduler queue apps page issues. Contributed by Zoltan Siegl
(cherry picked from commit 557056e18e)
2019-07-12 17:40:57 +02:00
Szilard Nemeth 0ede873090 YARN-9337. GPU auto-discovery script runs even when the resource is given by hand. Contributed by Adam Antal
(cherry picked from commit 61b0c2bb7c)
2019-07-12 17:29:47 +02:00
Szilard Nemeth c61c969668 YARN-9235. If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown. Contributed by Antal Balint Steinbach, Adam Antal
(cherry picked from commit c416284bb7)
2019-07-12 16:53:26 +02:00
Szilard Nemeth 3e3bbb7f5e YARN-9625. UI2 - No link to a queue on the Queues page for Fair Scheduler. Contributed by Zoltan Siegl
(cherry picked from commit 9cec023186)
2019-07-11 20:01:52 +02:00
Szilard Nemeth 4216090f19 YARN-9573. DistributedShell cannot specify LogAggregationContext. Contributed by Adam Antal. 2019-07-11 19:24:11 +02:00
bibinchundatt 5f8395f393 YARN-9557. Application fails in diskchecker when ReadWriteDiskValidator is configured. Contributed by Bilwa S T. 2019-07-10 10:34:39 +05:30
Szilard Nemeth 4638fa00fc YARN-9629. Support configurable MIN_LOG_ROLLING_INTERVAL. Contributed by Adam Antal.
(cherry picked from commit a2a8be18cb)
2019-07-04 10:26:29 +02:00
Sunil G d18986e4e8 YARN-9644. First RMContext object is always leaked during switch over. Contributed by Bibin A Chundatt. 2019-07-04 11:05:54 +05:30
Sunil G bea79e7645 YARN-9327. Improve synchronisation in ProtoUtils#convertToProtoFormat block. Contributed by Bibin A Chundatt.
(cherry picked from commit 0c8813f135)
2019-07-02 12:15:05 +05:30
Weiwei Yang c9bccaf148 YARN-9655. AllocateResponse in FederationInterceptor lost applicationPriority. Contributed by hunshenshi.
(cherry picked from commit 570eee30e5)
2019-07-02 10:05:22 +08:00
Erik Krogen 49d7bb6a92 HDFS-13286. [SBN read] Add haadmin commands to transition between standby and observer. Contributed by Chao Sun. 2019-06-28 14:20:01 -07:00
Eric Yang 860606fc67 YARN-9581. Add support for get multiple RM webapp URLs.
Contributed by Prabhu Joseph

(cherry picked from commit f02b0e1994)
2019-06-28 14:57:50 -04:00
bibinchundatt a2f4e4698b YARN-9639. DecommissioningNodesWatcher cause memory leak. Contributed by Bilwa S T.
(cherry picked from commit be80334cdf)
2019-06-27 10:04:40 +05:30
Weiwei Yang 1944a7d844 YARN-9209. When nodePartition is not set in Placement Constraints, containers are allocated only in default partition. Contributed by Tarun Parimi.
(cherry picked from commit 83dcb9d87e)
2019-06-21 17:52:22 +08:00
Wanqiang Ji f148b29508 YARN-9630. [UI2] Add a link in docs's top page
Signed-off-by: Masatake Iwasaki <iwasakims@apache.org>
(cherry picked from commit eb6be4643f)
2019-06-18 14:57:01 +09:00
Zhankun Tang 1e7201f9aa YARN-9584. Should put initializeProcessTrees method call before get pid. Contributed by Wanqiang Ji.
(cherry picked from commit 67414a1a80)
2019-06-18 13:18:27 +08:00
Inigo Goiri 65f7ec2f39 YARN-8856. TestTimelineReaderWebServicesHBaseStorage tests failing with NoClassDefFoundError. Contributed by Sushil Ks.
(cherry picked from commit eeaf8edaa7)
2019-06-13 14:22:16 -07:00
Sean Mackrory e0b3cbd221 HADOOP-16213. Update guava to 27.0-jre. Contributed by Gabor Bota. 2019-06-13 07:53:40 -06:00
Sunil G 253dcde517 YARN-9543. [UI2] Handle ATSv2 server down or failures cases gracefully in YARN UI v2. Contributed by Zoltan Siegl and Akhil P B.
(cherry picked from commit 52128e352a)
2019-06-12 19:25:02 +05:30
Sunil G 72203f7a12 YARN-9545. Create healthcheck REST endpoint for ATSv2. Contributed by Zoltan Siegl. 2019-06-12 19:23:40 +05:30
Sunil G f1ead03672 Revert "YARN-9545. Create healthcheck REST endpoint for ATSv2. Contributed by Zoltan Siegl."
This reverts commit f1d3a17d3e.
2019-06-12 19:10:23 +05:30
bibinchundatt 3303723f55 YARN-9547. ContainerStatusPBImpl default execution type is not returned. Contributed by Bilwa S T. 2019-06-11 23:42:29 +05:30
bibinchundatt d9284d4a57 YARN-9565. RMAppImpl#ranNodes not cleared on FinalTransition. Contributed by Bilwa S T.
(cherry picked from commit 60c95e9b6a)
2019-06-11 23:13:18 +05:30
bibinchundatt a37011bd5e YARN-9594. Fix missing break statement in ContainerScheduler#handle. Contributed by lujie.
(cherry picked from commit 6d80b9bc3f)

Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java
2019-06-11 23:01:03 +05:30
Eric Yang 68aec0a98d YARN-9581. Fixed yarn logs cli to access RM2.
Contributed by Prabhu Joseph

(cherry picked from commit cb9bc6e64c)
2019-06-06 16:43:25 -04:00
Sunil G f1d3a17d3e YARN-9545. Create healthcheck REST endpoint for ATSv2. Contributed by Zoltan Siegl. 2019-06-06 06:24:01 +05:30
Weiwei Yang 6e2b091515 YARN-9580. Fulfilled reservation information in assignment is lost when transferring in ParentQueue#assignContainers. Contributed by Tao Yang. 2019-06-04 15:24:37 +08:00
Sunil G 2f012044ff YARN-8906. [UI2] NM hostnames not displayed correctly in Node Heatmap Chart. Contributed by Akhil PB.
(cherry picked from commit 59719dc560)
2019-06-03 15:54:07 +05:30
Sunil G 58042dadc3 YARN-8947. [UI2] Active User info missing from UI2. Contributed by Akhil PB.
(cherry picked from commit 7f46dda513)
2019-06-03 12:25:16 +05:30
Weiwei Yang e027c87da2 YARN-9507. Fix NPE in NodeManager#serviceStop on startup failure. Contributed by Bilwa S T.
(cherry picked from commit 4530f4500d)
2019-06-03 14:15:20 +08:00
Eric Yang b2a39e8883 YARN-9542. Fix LogsCLI guessAppOwner ignores custome file format suffix.
Contributed by Prabhu Joseph
2019-05-29 18:04:13 -04:00
Eric E Payne 2e561cef47 YARN-8625. Aggregate Resource Allocation for each job is not present in ATS. Contributed by Prabhu Joseph.
(cherry picked from commit 3c63551101)
2019-05-29 18:43:13 +00:00
Ahmed Hussein 777f7345ef YARN-9563. Resource report REST API could return NaN or Inf (Ahmed Hussein via jeagles)
Signed-off-by: Jonathan Eagles <jeagles@gmail.com>
(cherry picked from commit abf76ac371)
2019-05-29 12:14:01 -05:00
Takanobu Asanuma a9a3450560 HADOOP-16331. Fix ASF License check in pom.xml. Contributed by Akira Ajisaka.
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2019-05-29 17:34:16 +09:00
Akira Ajisaka c917ba782e YARN-9500. Fix typos in ResourceModel.md. Contributed by leiqiang.
(cherry picked from commit 4a692bc3be)
2019-05-28 16:54:43 +09:00
Akira Ajisaka 855dc997d6 HADOOP-16323. https everywhere in Maven settings. 2019-05-27 15:27:33 +09:00
bibinchundatt 71f5bfb822 YARN-9508. YarnConfiguration areNodeLabel enabled is costly in allocation flow. Contributed by Bilwa S T.
(cherry picked from commit 570fa2da20)
2019-05-15 13:31:07 +05:30
Sunil G f4ee38df29 YARN-9519. TFile log aggregation file format is not working for yarn.log-aggregation.TFile.remote-app-log-dir config. Contributed by Adam Antal.
(cherry picked from commit 7d831eca64)
2019-05-14 10:49:09 -07:00