3552 Commits

Author SHA1 Message Date
Sunil G d65371c4e8 YARN-9545. Create healthcheck REST endpoint for ATSv2. Contributed by Zoltan Siegl.
(cherry picked from commit f1d3a17d3e)
2019-06-06 06:25:02 +05:30
Weiwei Yang 23f9508a89 YARN-9507. Fix NPE in NodeManager#serviceStop on startup failure. Contributed by Bilwa S T.
(cherry picked from commit 4530f4500d)
2019-06-03 14:26:16 +08:00
Eric Yang 413a6b63bc YARN-9542. Fix LogsCLI guessAppOwner ignores custome file format suffix.
Contributed by Prabhu Joseph

(cherry picked from commit b2a39e8883)
2019-05-29 18:05:47 -04:00
Eric E Payne 9c3ab58aa7 YARN-8625. Aggregate Resource Allocation for each job is not present in ATS. Contributed by Prabhu Joseph.
(cherry picked from commit 3c63551101)
2019-05-29 19:08:27 +00:00
Ahmed Hussein f2202f7990 YARN-9563. Resource report REST API could return NaN or Inf (Ahmed Hussein via jeagles)
Signed-off-by: Jonathan Eagles <jeagles@gmail.com>
(cherry picked from commit abf76ac371)
2019-05-29 12:47:27 -05:00
Takanobu Asanuma 8098ddaf40 HADOOP-16331. Fix ASF License check in pom.xml. Contributed by Akira Ajisaka.
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2019-05-29 17:39:49 +09:00
Akira Ajisaka f8bd5deec1 HADOOP-16323. https everywhere in Maven settings. 2019-05-27 15:28:21 +09:00
Eric Yang bf013aa06e YARN-8622. Fixed container-executor compilation on MacOSX.
Contributed by Siyao Meng

(cherry picked from commit ef97a20831)
2019-05-09 14:55:38 -04:00
Haibo Chen ea1f0f282b YARN-9529. Log correct cpu controller path on error while initializing CGroups. (Contributed by Jonathan Hung)
(cherry picked from commit 597fa47ad1)
(cherry picked from commit c6573562cb)
2019-05-06 11:59:20 -07:00
Eric E Payne 41ffaea342 YARN-9285: RM UI progress column is of wrong type. Contributed by Ahmed Hussein.
(cherry picked from commit b094b94d43)
2019-05-02 19:57:44 +00:00
Weiwei Yang 94a895b94f YARN-9307. node_partitions constraint does not work. Contributed by kyungwan nam. 2019-04-26 13:16:43 +08:00
Weiwei Yang d242b166ed YARN-9325. TestQueueManagementDynamicEditPolicy fails intermittent. Contributed by Prabhu Joseph.
(cherry picked from commit 1c8046d67e)
2019-04-23 14:25:33 +08:00
Eric Yang 8b228a42e9 YARN-8587. Added retries for fetching docker exit code.
Contributed by Charo Zhang

(cherry picked from commit c16c49b8c3)
2019-04-19 15:40:56 -04:00
Eric Yang 68a98be8a2 YARN-6695. Fixed NPE in publishing appFinished events to ATSv2.
Contributed by Prabhu Joseph

(cherry picked from commit df76cdc895)
2019-04-18 12:31:34 -04:00
Weiwei Yang c37065eae9 YARN-9463. Add queueName info when failing with queue capacity sanity check. Contributed by Aihua Xu.
(cherry picked from commit 8c1bba375b)
2019-04-10 23:04:27 +08:00
Weiwei Yang bd0c9bc160 YARN-9413. Queue resource leak after app fail for CapacityScheduler. Contributed by Tao Yang.
(cherry picked from commit ec143cbf67)
2019-04-06 20:38:06 +08:00
Eric Yang dbc02bcda7 YARN-9391. Fixed node manager environment leaks into Docker containers.
Contributed by Jim Brennan

(cherry picked from commit 3c45762a0b)
2019-03-25 15:55:46 -04:00
Sunil G 379a9bfd9a YARN-9138. Improve test coverage for nvidia-smi binary execution of GpuDiscoverer. Contributed by Szilard Nemeth.
(cherry picked from commit 46045c5cb3)
2019-03-06 16:02:39 +05:30
bibinchundatt e663a6af89 Revert "YARN-8132. Final Status of applications shown as UNDEFINED in ATS app queries. Contributed by Prabhu Joseph"
This reverts commit 7db50ffceb.
2019-03-04 17:03:45 +05:30
Sunil G 80d507d1a4 YARN-9139. Simplify initializer code of GpuDiscoverer. Contributed by Szilard Nemeth. 2019-03-01 19:28:33 +05:30
Sunil G 817028364a YARN-9121. Replace GpuDiscoverer.getInstance() to a readable object for easy access control. Contributed by Szilard Nemeth. 2019-02-27 17:46:43 +05:30
Weiwei Yang 10d4a9a7fb YARN-9248. RMContainerImpl:Invalid event: ACQUIRED at KILLED. Contributed by lujie.
(cherry picked from commit 8c30114b00)
2019-02-27 17:39:37 +08:00
Sunil G 51b010b19f YARN-9087. Improve logging for initialization of Resource plugins. Contributed by Szilard Nemeth. 2019-02-27 11:57:32 +05:30
Sunil G cb0f45b75b YARN-9213. RM Web UI v1 does not show custom resource allocations for containers page. Contributed by Szilard Nemeth.
(cherry picked from commit f282f9c362)
2019-02-25 11:39:05 +05:30
Weiwei Yang dab22e74a4 YARN-9316. TestPlacementConstraintsUtil#testInterAppConstraintsByAppID fails intermittently. Contributed by Prabhu Joseph.
(cherry picked from commit 9cd5c5447f)
2019-02-24 22:57:40 +08:00
bibinchundatt 6d1cf7b395 YARN-9317. Avoid repeated YarnConfiguration#timelineServiceV2Enabled check. Contributed by Prabhu Joseph 2019-02-23 08:03:28 +05:30
bibinchundatt 7db50ffceb YARN-8132. Final Status of applications shown as UNDEFINED in ATS app queries. Contributed by Prabhu Joseph 2019-02-22 20:52:40 +05:30
Sunil G d6377c8b68 YARN-9118. Handle exceptions with parsing user defined GPU devices in GpuDiscoverer. Contributed by Szilard Nemeth.
(cherry picked from commit 95fbbfed75)
2019-02-22 20:23:51 +05:30
Weiwei Yang 040d475030 YARN-9238. Avoid allocating opportunistic containers to previous/removed/non-exist application attempt. Contributed by lujie.
(cherry picked from commit 9c88695bcd)
2019-02-22 21:43:53 +08:00
Weiwei Yang 1ffa7f8349 YARN-9315. TestCapacitySchedulerMetrics fails intermittently. Contributed by Prabhu Joseph. 2019-02-21 18:16:22 +08:00
bibinchundatt 77c7e8492e YARN-9286. [Timeline Server] Sorting based on FinalStatus shows pop-up message. Contributed by Bilwa S T.
(cherry picked from commit b8de78c570)
2019-02-20 01:21:12 +05:30
Adam Antal 511ffb5f70 YARN-9283. Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong property name as reference
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 9385ec45d7)
2019-02-15 18:50:42 +09:00
Masatake Iwasaki 11ebdaab48 YARN-9282. Typo in javadoc of class LinuxContainerExecutor: hadoop.security.authetication should be 'authentication'. Contributed by Charan Hebri.
(cherry picked from commit e0ab1bdece)
2019-02-09 00:29:58 +09:00
Eric E Payne 834a862bd0 YARN-7171: RM UI should sort memory / cores numerically. Contributed by Ahmed Hussein
(cherry picked from commit d1ca9432dd)
2019-02-07 20:36:54 +00:00
Sunil G 6ffe6ea899 YARN-9206. RMServerUtils does not count SHUTDOWN as an accepted state. Contributed by Kuhu Shukla. 2019-02-07 19:08:41 +05:30
Weiwei Yang 41bdcf4110 YARN-9262. TestRMAppAttemptTransitions is failing with an NPE. Contributed by lujie.
(cherry picked from commit 28ad20a711)
2019-02-04 14:00:54 +05:30
Sunil G 3b03ff6fdd YARN-9099. GpuResourceAllocator#getReleasingGpus calculates number of GPUs in a wrong way. Contributed by Szilard Nemeth.
(cherry picked from commit 71c49fa60f)
2019-01-31 09:26:38 +05:30
Eric E Payne 0cb05a9fe3 YARN-6616: YARN AHS shows submitTime for jobs same as startTime. Contributed by Prabhu Joseph
(cherry picked from commit 04105bbfdb)
2019-01-29 18:04:01 +00:00
Weiwei Yang 4257043232 YARN-9237. NM should ignore sending finished apps to RM during RM fail-over. Contributed by Jiandan Yang.
(cherry picked from commit 4f63ffe444)
2019-01-29 11:03:26 +08:00
Rohith Sharma K S 6e059c7930 Revert "YARN-8270 Adding JMX Metrics for Timeline Collector and Reader. Contributed by Sushil Ks."
This reverts commit 5b72aa04e1.
2019-01-28 10:55:12 +05:30
Jonathan Hung 6092d913b1 YARN-9222. Print launchTime in ApplicationSummary
(cherry picked from commit 6cace58e21)
(cherry picked from commit bf760e7e81)
2019-01-25 13:50:44 -08:00
Haibo Chen 61a6cc8d23 YARN-7088. Add application launch time to Resource Manager REST API. (Kanwaljeet Sachdev via Haibo Chen)
(cherry picked from commit bb92bfb4ef)
2019-01-24 15:58:37 -08:00
Weiwei Yang 2471d8a6e7 YARN-9205. When using custom resource type, application will fail to run due to the CapacityScheduler throws InvalidResourceRequestException(GREATER_THEN_MAX_ALLOCATION). Contributed by Zhankun Tang.
(cherry picked from commit bc6374f282)
2019-01-23 18:18:38 +08:00
Weiwei Yang b61754b1bd YARN-9210. RM nodes web page can not display node info. Contributed by Jiandan Yang.
(cherry picked from commit d43df31751)
2019-01-22 11:01:00 +08:00
Weiwei Yang 4edd883d48 YARN-9204. RM fails to start if absolute resource is specified for partition capacity in CS queues. Contributed by Jiandan Yang.
(cherry picked from commit abde1e1f58)
2019-01-21 21:27:40 +08:00
Wangda Tan a685ffe9a9 YARN-9194. Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM. (lujie via wangda)
Change-Id: I4359f59a73a278a941f4bb9d106dd38c9cb471fe
(cherry picked from commit 6d7eedfd28)
(cherry picked from commit fe7cb2d84a)
2019-01-17 15:17:34 -08:00
Weiwei Yang 91e9c9f96e YARN-9173. FairShare calculation broken for large values after YARN-8833. Contributed Wilfred Spiegelenburg. 2019-01-08 13:56:21 +08:00
Wangda Tan 3570713a69 YARN-8822. Nvidia-docker v2 support for YARN GPU feature. (Charo Zhang via wangda)
Change-Id: I416268888a7b6f097d218d84e8497dd70b4b6d8f
2019-01-07 12:30:30 -08:00
Wangda Tan 31ea2f7806 Preparing for 3.1.3 development
Change-Id: I3c3d3ee47dc4fef239127b4452ff14676fa26e3d
2019-01-07 10:04:58 -08:00
Weiwei Yang d6464629ca YARN-9164. Shutdown NM may cause NPE when opportunistic container scheduling is enabled. Contributed by lujie.
(cherry picked from commit cfe89e6f96)
2019-01-04 01:37:47 +08:00
Eric Yang b4fa1830a8 YARN-9040. Fixed memory leak in LevelDBCacheTimelineStore and DBIterator.
Contributed by Tarun Parimi

(cherry picked from commit 71e0b0d800)
2018-12-17 12:08:35 -05:00
Eric Yang 690d760174 YARN-9125. Fixed Carriage Return detection in Docker container launch command.
Contributed by Billie Rinaldi

(cherry picked from commit b2d7204ed0)
2018-12-14 17:55:38 -05:00
Weiwei Yang 14ecdb62b6 YARN-9009. Fix flaky test TestEntityGroupFSTimelineStore.testCleanLogs. Contributed by OrDTesters.
(cherry picked from commit 1c09a10e96)
2018-12-10 12:17:00 +08:00
Jonathan Hung 7b523e6a77 YARN-9085. Add Guaranteed and MaxCapacity to CSQueueMetrics
(cherry picked from commit 978ab3e958227220cb6f1a08ae6e7cdb8a46628b)
(cherry picked from commit dca69d178dba21c41fd1293187f29143f7e81e19)
2018-12-07 10:45:57 -08:00
Eric Yang 7ef4ff1905 YARN-9071. Improved status update for reinitialized containers.
Contributed by Chandni Singh

(cherry picked from commit 1b790f4dd1)
2018-12-05 19:05:26 -05:00
Jonathan Hung 2cb9479bfc YARN-9036. Escape newlines in health report in YARN UI. Contributed by Keqiu Hu 2018-11-30 10:16:39 -08:00
bibinchundatt 8be2d16b94 YARN-9069. Fix SchedulerInfo#getSchedulerType for custom schedulers. Contributed by Bilwa S T.
(cherry picked from commit 07142f54a8)
2018-11-29 22:08:35 +05:30
Jason Lowe d9457df989 YARN-8812. Containers fail during creating a symlink which started with hyphen for a resource file. Contributed by Oleksandr Shevchenko
(cherry picked from commit 3ce99e32f7)
2018-11-28 08:54:04 -06:00
Eric Yang bec5036397 YARN-8665. Added Yarn service cancel upgrade option.
Contributed by Chandni Singh
2018-11-27 16:29:08 -05:00
Eric Yang 463de48f04 YARN-8986. Added port publish for Docker container running with bridge.
Contributed by Charo Zhang
2018-11-27 14:28:37 -05:00
Weiwei Yang 17a41f5d86 YARN-8833. Avoid potential integer overflow when computing fair shares. Contributed by liyakun.
(cherry picked from commit d027a24f03)
2018-11-18 23:24:37 +08:00
Eric Yang fce0350289 YARN-8160. Support upgrade of service that use docker containers.
Contributed by Chandni Singh
2018-11-16 16:01:25 -05:00
Rohith Sharma K S 095635d984 YARN-8303. YarnClient should contact TimelineReader for application/attempt/container report.
(cherry picked from commit ee3355be3c)
2018-11-16 18:38:11 +05:30
Akira Ajisaka daad077121 YARN-8233. NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null. Contributed by Tao Yang. 2018-11-10 14:37:35 +09:00
Weiwei Yang a3b61baf94 YARN-8977. Remove unnecessary type casting when calling AbstractYarnScheduler#getSchedulerNode. Contributed by Wanqiang Ji.
(cherry picked from commit c96cbe8659)
2018-11-07 22:50:05 +08:00
Akira Ajisaka 52af95fdce Revert "YARN-8233. NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null. Contributed by Tao Yang."
This reverts commit dd8479e80d.
2018-11-07 11:33:31 +09:00
Akira Ajisaka dd8479e80d YARN-8233. NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null. Contributed by Tao Yang.
(cherry picked from commit 951c98f890)
2018-11-07 11:19:11 +09:00
Jason Lowe 7335d940de YARN-8865. RMStateStore contains large number of expired RMDelegationToken. Contributed by Wilfred Spiegelenburg
(cherry picked from commit ab6aa4c726)
2018-11-06 08:52:29 -06:00
Weiwei Yang 631b31110c YARN-8970. Improve the debug message in CS#allocateContainerOnSingleNode. Contributed by Zhankun Tang.
(cherry picked from commit 5d6554c722)
2018-11-06 14:53:28 +08:00
Weiwei Yang 71999f4464 YARN-8969. AbstractYarnScheduler#getNodeTracker should return generic type to avoid type casting. Contributed by Wanqiang Ji.
(cherry picked from commit c7fcca0d7e)
2018-11-06 13:23:42 +08:00
Jonathan Hung 221494a75c YARN-7225. Add queue and partition info to RM audit log. Contributed by Eric Payne
(cherry picked from commit 2ab611d48b)
2018-11-01 14:31:22 -07:00
Rohith Sharma K S d9a494b1e0 YARN-8950. Fix compilation issue due to dependency convergence error for hbase.profile=2.0.
(cherry picked from commit 4ec4ec6971)
2018-10-30 11:50:56 +05:30
Weiwei Yang 70efe253f3 YARN-8944. TestContainerAllocation.testUserLimitAllocationMultipleContainers failure after YARN-8896. Contributed by Wilfred Spiegelenburg.
(cherry picked from commit 1d90a0dd23)
2018-10-29 11:56:31 +08:00
Jason Lowe 3be72b7aa2 YARN-8904. TestRMDelegationTokens can fail in testRMDTMasterKeyStateOnRollingMasterKey. Contributed by Wilfred Spiegelenburg
(cherry picked from commit 93fb3b4b9c)
2018-10-23 12:55:48 -05:00
Rohith Sharma K S 3e3b088856 YARN-8826. Fix lingering timeline collector after serviceStop in TimelineCollectorManager. Contributed by Prabha Manepalli.
(cherry picked from commit 0b62983c5a)
2018-10-23 14:08:06 +05:30
Sunil G 0cb184d6e9 YARN-8868. Set HTTPOnly attribute to Cookie. Contributed by Chandni Singh.
(cherry picked from commit 2202e00ba8)
2018-10-23 09:58:05 +05:30
Eric Yang e86efa8712 YARN-8910. Fixed misleading log statement when container max retries is infinite.
Contributed by Chandni Singh

(cherry picked from commit 47ad98b2e1)
2018-10-19 13:50:32 -04:00
Weiwei Yang beca90ece8 YARN-8907. Fix incorrect logging message in TestCapacityScheduler. Contributed by Zhankun Tang.
(cherry picked from commit 13cc0f50ea)
2018-10-19 10:02:46 +08:00
Wangda Tan 46baafedf1 YARN-8896. Limit the maximum number of container assignments per heartbeat. (Zhankun Tang via wangda)
Change-Id: I6e72f8362bd7f5c2a844cb9e3c4732492314e9f1
(cherry picked from commit 780be14f07)
2018-10-18 12:29:19 -07:00
Weiwei Yang a0060cf8ee Revert "YARN-8468. Enable the use of queue based maximum container allocation limit and implement it in FairScheduler. Contributed by Antal Bálint Steinbach."
This reverts commit ce4a0898df.
2018-10-10 21:41:00 +08:00
Weiwei Yang 3968ce1073 YARN-8858. CapacityScheduler should respect maximum node resource when per-queue maximum-allocation is being used. Contributed by Wangda Tan.
(cherry picked from commit edce866489)
2018-10-10 09:48:56 +08:00
Weiwei Yang ce4a0898df YARN-8468. Enable the use of queue based maximum container allocation limit and implement it in FairScheduler. Contributed by Antal Bálint Steinbach. 2018-10-09 22:30:42 +08:00
Wangda Tan 86a1ad4428 YARN-8844. TestNMProxy unit test is failing. (Eric Yang via wangda)
Change-Id: I241fa8701b6f1dbcad87fd2e9a429e32e7aa40f5
(cherry picked from commit b3ac886933)
2018-10-04 10:49:29 -07:00
Shane Kumpf adbc010d0f YARN-8785. Improve the error message when a bind mount is not whitelisted. Contributed by Simon Prewo
(cherry picked from commit 5edb9d3b97)
2018-10-02 07:26:45 -06:00
Eric E Payne c306da08ec YARN-8774. Memory leak when CapacityScheduler allocates from reserved container with non-default label. Contributed by Tao Yang.
(cherry picked from commit 8598b498bc)
2018-09-28 15:34:23 +00:00
Vrushali C 5b72aa04e1 YARN-8270 Adding JMX Metrics for Timeline Collector and Reader. Contributed by Sushil Ks.
(cherry picked from commit 90e2e493b3)
2018-09-28 10:31:38 +05:30
Jason Lowe a56a345e07 YARN-8804. resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues. Contributed by Tao Yang
(cherry picked from commit 6b988d821e)
2018-09-26 16:15:48 -07:00
Rohith Sharma K S d473152e6a YARN-8824. App Nodelabel missed after RM restart for finished apps. Contributed by Bibin A Chundatt. 2018-09-26 20:36:34 +05:30
Rohith Sharma K S 47306cc2db YARN-8815. RM fails to recover finished unmanaged AM. Contributed by Bibin A Chundatt.
(cherry picked from commit 50bc7746d7)
2018-09-25 11:40:09 +05:30
Eric Yang e9315f6688 YARN-8801. Fixed header comments for docker utility functions.
Contributed by Zian Chen
2018-09-20 13:12:29 -04:00
Jason Lowe 3fb6787295 YARN-8784. DockerLinuxContainerRuntime prevents access to distributed cache entries on a full disk. Contributed by Eric Badger
(cherry picked from commit 6b5838ed32)
2018-09-19 16:49:21 -05:00
Weiwei Yang aaf0b119e5 YARN-8771. CapacityScheduler fails to unreserve when cluster resource contains empty resource type. Contributed by Tao Yang.
(cherry picked from commit 0712537e79)
2018-09-19 19:38:09 +08:00
Jason Lowe 3d77094cf2 YARN-8648. Container cgroups are leaked when using docker. Contributed by Jim Brennan
(cherry picked from commit 2df0a8dcb3)
2018-09-18 15:43:10 -05:00
Weiwei Yang 00a469138d YARN-8720. CapacityScheduler does not enforce max resource allocation check at queue level. Contributed by Tarun Parimi.
(cherry picked from commit f1a893fdbc)
2018-09-14 16:40:35 +08:00
Jason Lowe 88687213cc YARN-8680. YARN NM: Implement Iterable Abstraction for LocalResourceTracker state. Contributed by Pradeep Ambati
(cherry picked from commit 250b50018e)
2018-09-13 14:12:20 -05:00
Weiwei Yang a7b1d1e006 YARN-8729. Node status updater thread could be lost after it is restarted. Contributed by Tao Yang.
(cherry picked from commit 39c1ea1ed4)
2018-09-13 23:16:52 +08:00
Sunil G c879ca38de YARN-8630. ATSv2 REST APIs should honor filter-entity-list-by-user in non-secure cluster when ACls are enabled. Contributed by Rohith Sharma K S.
(cherry picked from commit f4bda5e8e9)
2018-09-13 17:48:01 +05:30
Eric E Payne b6bc0f409a YARN-8709: CS preemption monitor always fails since one under-served queue was deleted. Contributed by Tao Yang.
(cherry picked from commit 987d8191ad)
2018-09-10 20:02:39 +00:00
Eric Yang 0b97dc5869 YARN-8751. Reduce conditions that mark node manager as unhealthy.
Contributed by Craig Condit

(cherry picked from commit 7d62334387)
2018-09-07 20:32:11 -04:00
Shane Kumpf 2d68708a1d YARN-8638. Allow linux container runtimes to be pluggable. Contributed by Craig Condit
(cherry picked from commit dffb7bfe6c)
2018-09-05 06:55:25 -06:00
bibinchundatt e2e0fc26a2 YARN-8535. Fix DistributedShell unit tests. Contributed by Abhishek Modi.
(cherry picked from commit eed8415dc1)
2018-09-02 13:37:38 +05:30
Shane Kumpf b8618556ee YARN-8642. Add support for tmpfs mounts with the Docker runtime. Contributed by Craig Condit
(cherry picked from commit 73625168c0)
2018-08-29 07:11:38 -06:00
Weiwei Yang f164568b47 YARN-8723. Fix a typo in CS init error message when resource calculator is not correctly set. Contributed by Abhishek Modi.
(cherry picked from commit 3fa4639421)
2018-08-29 11:15:02 +08:00
Billie Rinaldi eefd780918 YARN-8675. Remove default hostname for docker containers when net=host. Contributed by Suma Shivaprasad
(cherry picked from commit 05b2bbeb35)
2018-08-27 11:42:09 -07:00
Haibo Chen e4282c077b YARN-8051. TestRMEmbeddedElector#testCallbackSynchronization is flaky. (Robert Kanter via Haibo Chen)
(cherry picked from commit 93d47a0ed5)

Conflicts:
	hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMEmbeddedElector.java
2018-08-24 13:24:08 -05:00
Jason Lowe 84654451fa YARN-8649. NPE in localizer hearbeat processing if a container is killed while localizing. Contributed by lujie
(cherry picked from commit 585ebd873a)
2018-08-23 09:37:43 -05:00
Rohith Sharma K S 63d5214332 YARN-8129. Improve error message for invalid value in fields attribute. Contributed by Abhishek Modi.
(cherry picked from commit d3fef7a5c5)
2018-08-21 12:11:12 +05:30
Wei-Chiu Chuang 0d155de159 HADOOP-14212. Addendum patch: Expose SecurityEnabled boolean field in JMX for other services besides NameNode. Contributed by Adam Antal. 2018-08-20 14:49:28 -07:00
Wei-Chiu Chuang 78fb14ba49 HADOOP-14212. Expose SecurityEnabled boolean field in JMX for other services besides NameNode. Contributed by Adam Antal. 2018-08-20 14:49:24 -07:00
Jason Lowe 44c4928b64 YARN-8242. YARN NM: OOM error while reading back the state store on recovery. Contributed by Pradeep Ambati and Kanwaljeet Sachdev
(cherry picked from commit 65e7469712)
2018-08-20 10:21:57 -05:00
Rohith Sharma K S a3d4a25bbf YARN-8679. [ATSv2] If HBase cluster is down for long time, high chances that NM ContainerManager dispatcher get blocked. Contributed by Wangda Tan.
(cherry picked from commit 4aacbfff60)
2018-08-18 11:04:09 +05:30
Eric Yang 5237bdfb5a YARN-8667. Cleanup symlinks when container restarted by NM.
Contributed by Chandni Singh

(cherry picked from commit d42806160e)
2018-08-16 18:44:47 -04:00
Jason Lowe 819a2a6f10 YARN-8656. container-executor should not write cgroup tasks files for docker containers. Contributed by Jim Brennan
(cherry picked from commit cb21eaa026)
2018-08-16 10:09:56 -05:00
Jason Lowe 95cd6de5c6 YARN-8640. Restore previous state in container-executor after failure. Contributed by Jim Brennan
(cherry picked from commit d1d129aa9d)
2018-08-14 10:26:21 -05:00
Weiwei Yang 734bc42289 YARN-8575. Avoid committing allocation proposal to unavailable nodes in async scheduling. Contributed by Tao Yang.
(cherry picked from commit 0a71bf1452)
2018-08-10 15:10:27 +08:00
Weiwei Yang 991514f7c3 YARN-8521. NPE in AllocationTagsManager when a container is removed more than once. Contributed by Weiwei Yang.
(cherry picked from commit 08d5060605)
2018-08-10 08:44:53 +08:00
Wangda Tan 68279fcd65 YARN-8588. Logging improvements for better debuggability. (Suma Shivaprasad via wangda)
Change-Id: I66aa4b0ec031ae5ce0fae558e2f8cbcbbfebc442
(cherry picked from commit 344c335a92)
2018-08-09 12:04:25 -07:00
Weiwei Yang 0ee7e80047 YARN-8559. Expose mutable-conf scheduler's configuration in RM /scheduler-conf endpoint. Contributed by Weiwei Yang.
(cherry picked from commit d352f167eb)
2018-08-10 00:43:53 +08:00
Jason Lowe 3dd299a770 YARN-8331. Race condition in NM container launched after done. Contributed by Pradeep Ambati
(cherry picked from commit cd04e954d2)
2018-08-09 10:23:02 -05:00
Wangda Tan 450c791ecf YARN-8629. Container cleanup fails while trying to delete Cgroups. (Suma Shivaprasad via wangda)
Change-Id: I392ef4f8baa84d5d7b1f2e438c560b5426b6d4f2
(cherry picked from commit d4258fcad7)
2018-08-07 12:41:55 -07:00
Jason Lowe 619019ccca YARN-8263. DockerClient still touches hadoop.tmp.dir. Contributed by Craig Condit
(cherry picked from commit 7526815e32)
2018-08-02 10:45:52 -05:00
Sunil G 1f77b20f08 YARN-8593. Add RM web service endpoint to get user information. Contributed by Akhil PB.
(cherry picked from commit 735b492556)
2018-08-02 08:35:54 +05:30
Billie Rinaldi 2a94823f32 YARN-8403. Change the log level for fail to download resource from INFO to ERROR. Contributed by Eric Yang
(cherry picked from commit 67c65da261)
2018-08-01 08:58:15 -07:00
Sunil G ff35f0c308 YARN-8606. Opportunistic scheduling does not work post RM failover. Contributed by Bibin A Chundatt.
(cherry picked from commit a48a0cc7fd)
2018-08-01 12:17:53 +05:30
Sunil G cbfd7358d2 YARN-8397. Potential thread leak in ActivitiesManager. Contributed by Rohith Sharma K S.
(cherry picked from commit 6310c0d17d)
2018-08-01 08:34:09 +05:30
Eric Yang 7640d62716 YARN-8579. Recover NMToken of previous attempted component data.
Contributed by Gour Saha
2018-07-31 18:35:31 -04:00
Wangda Tan 5583711419 Preparing for 3.1.2 release
Change-Id: If2793e2ed2b5b349a9e1f98f78df43f309dcfcbd
2018-07-31 13:08:55 -07:00
Wangda Tan 7b552c9d72 YARN-8418. App local logs could leaked if log aggregation fails to initialize for the app. (Bibin A Chundatt via wangda)
Change-Id: I29a23ca4b219b48c92e7975cd44cddb8b0e04104
(cherry picked from commit 4b540bbfcf)
2018-07-31 12:13:36 -07:00
Jonathan Hung b91cf90e1c YARN-7974. Allow updating application tracking url after registration. Contributed by Jonathan Hung 2018-07-30 17:57:25 -07:00
bibinchundatt 8cd2a73777 YARN-8584. Several typos in Log Aggregation related classes. Contributed by Szilard Nemeth.
(cherry picked from commit 2b39ad2698)
2018-07-31 00:07:08 +05:30
Sunil G f1eb5777a0 YARN-8591. [ATSv2] NPE while checking for entity acl in non-secure cluster. Contributed by Rohith Sharma K S.
(cherry picked from commit 63e08ec071)
2018-07-30 14:49:03 +05:30
bibinchundatt 2e7876a725 YARN-8558. NM recovery level db not cleaned up properly on container finish. Contributed by Bibin A Chundatt.
(cherry picked from commit 3d586841ab)
2018-07-28 20:56:35 +05:30
Eric Yang c2c3eee69c YARN-8508. Release GPU resource for killed container.
Contributed by Chandni Singh

(cherry picked from commit ed9d60e888)
2018-07-27 19:36:21 -04:00
Eric Yang 8e3807afe0 YARN-8330. Improved publishing ALLOCATED events to ATS.
Contributed by Suma Shivaprasad

(cherry picked from commit f93ecf5c1e)
2018-07-25 18:51:42 -04:00
Eric E Payne 830ef12af8 YARN-4606. CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps. Contributed by Manikandan R
(cherry picked from commit 9485c9aee6)
2018-07-25 16:30:30 +00:00
bibinchundatt 8e65057eb1 YARN-8541. RM startup failure on recovery after user deletion. Contributed by Bibin A Chundatt. 2018-07-25 15:54:32 +05:30
Weiwei Yang b89624a943 YARN-8546. Resource leak caused by a reserved container being released more than once under async scheduling. Contributed by Tao Yang.
(Cherry-picked from commit 5be9f4a5d0)
2018-07-25 17:53:40 +08:00
Haibo Chen 7e7792dd7b YARN-6966. NodeManager metrics may return wrong negative values when NM restart. (Szilard Nemeth via Haibo Chen)
(cherry picked from commit 9d3c39e9dd)
2018-07-24 12:50:43 -07:00
Sunil G 4488fd8295 YARN-7748. TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted fails due to multiple container fail events. Contributed by Weiwei Yang.
(cherry picked from commit 35ce6eb1f5)
2018-07-24 22:21:15 +05:30
bibinchundatt a684a2efb8 YARN-8548. AllocationRespose proto setNMToken initBuilder not done. Contributed by Bilwa S T.
(cherry picked from commit ff7c2eda34)
2018-07-24 16:30:31 +05:30
bibinchundatt 0710107f8d YARN-8544. [DS] AM registration fails when hadoop authorization is enabled. Contributed by Bibin A Chundatt.
(cherry picked from commit 8461278833)
2018-07-24 13:11:31 +05:30
Eric Yang 23b8546a80 YARN-8380. Support bind propagation options for mounts in docker runtime.
Contributed by Billie Rinaldi

(cherry picked from commit 8688a0c7f8)
2018-07-23 20:13:41 -04:00
Weiwei Yang 004e1f248e YARN-8528. Final states in ContainerAllocation might be modified externally causing unexpected allocation results. Contributed by Xintong Song. 2018-07-20 22:43:47 +08:00
Eric Yang 76b8beb289 YARN-8501. Reduce complexity of RMWebServices getApps method.
Contributed by Szilard Nemeth

(cherry picked from commit 5836e0a46b)
2018-07-19 12:32:55 -04:00
Robert Kanter dfa71428ea YARN-8518. test-container-executor test_is_empty() is broken (Jim_Brennan via rkanter)
(cherry picked from commit 1bc106a738)
2018-07-18 16:07:48 -07:00
Robert Kanter 1c7d916347 Only mount non-empty directories for cgroups (miklos.szegedi@cloudera.com via rkanter)
(cherry picked from commit 0838fe8337)
2018-07-18 16:07:48 -07:00
Robert Kanter 27e2b4b364 Disable mounting cgroups by default (miklos.szegedi@cloudera.com via rkanter)
(cherry picked from commit 351cf87c92)
2018-07-18 16:07:48 -07:00
Eric Yang d82edec3c0 YARN-8538. Fixed memory leaks in container-executor and test cases.
Contributed by Billie Rinaldi
2018-07-18 13:44:49 -04:00
Wangda Tan 44beab0b63 YARN-8511. When AM releases a container, RM removes allocation tags before it is released by NM. (Weiwei Yang via wangda)
Change-Id: I6f9f409f2ef685b405cbff547dea9623bf3322d9
(cherry picked from commit 752dcce5f4)
2018-07-16 11:04:08 -07:00
Eric E Payne 9a79e893f7 YARN-8421: when moving app, activeUsers is increased, even though app does not have outstanding request. Contributed by Kyungwan Nam
(cherry picked from commit 937ef39b3f)
2018-07-16 16:32:05 +00:00