diff --git a/hadoop-common-project/hadoop-common/CHANGES.txt b/hadoop-common-project/hadoop-common/CHANGES.txt
index 7a7106197d9..4e6dc469ebe 100644
--- a/hadoop-common-project/hadoop-common/CHANGES.txt
+++ b/hadoop-common-project/hadoop-common/CHANGES.txt
@@ -312,6 +312,11 @@ Release 2.4.0 - UNRELEASED
HADOOP-10295. Allow distcp to automatically identify the checksum type of
source files and use it for the target. (jing9 and Laurent Goujon)
+ HADOOP-10333. Fix grammatical error in overview.html document.
+ (René Nyffenegger via suresh)
+
+ HADOOP-10343. Change info to debug log in LossyRetryInvocationHandler. (arpit)
+
OPTIMIZATIONS
BUG FIXES
@@ -328,15 +333,36 @@ Release 2.4.0 - UNRELEASED
HADOOP-10330. TestFrameDecoder fails if it cannot bind port 12345.
(Arpit Agarwal)
-Release 2.3.0 - UNRELEASED
+ HADOOP-10326. M/R jobs can not access S3 if Kerberos is enabled. (bc Wong
+ via atm)
+
+ HADOOP-10338. Cannot get the FileStatus of the root inode from the new
+ Globber (cmccabe)
+
+ HADOOP-10249. LdapGroupsMapping should trim ldap password read from file.
+ (Dilli Armugam via suresh)
+
+Release 2.3.1 - UNRELEASED
INCOMPATIBLE CHANGES
- HADOOP-8545. Filesystem Implementation for OpenStack Swift
- (Dmitry Mezhensky, David Dobbins, Stevel via stevel)
+ NEW FEATURES
+
+ IMPROVEMENTS
+
+ OPTIMIZATIONS
+
+ BUG FIXES
+
+Release 2.3.0 - 2014-02-18
+
+ INCOMPATIBLE CHANGES
NEW FEATURES
+ HADOOP-8545. Filesystem Implementation for OpenStack Swift
+ (Dmitry Mezhensky, David Dobbins, Stevel via stevel)
+
IMPROVEMENTS
HADOOP-10046. Print a log message when SSL is enabled.
diff --git a/hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html b/hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html
index efbaeae4b14..d2b6156573d 100644
--- a/hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html
+++ b/hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html
@@ -1,3 +1,2953 @@
+
+
Hadoop 2.3.0 Release Notes
+
+
+
+Hadoop 2.3.0 Release Notes
+These release notes include new developer and user-facing incompatibilities, features, and major improvements.
+
+Changes since Hadoop 2.2.0
+
+- YARN-1642.
+ Blocker sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla (resourcemanager)
+ RMDTRenewer#getRMClient should use ClientRMProxy
+ RMDTRenewer#getRMClient gets a proxy to the RM in the conf directly instead of going through ClientRMProxy.
+
+{code}
+ final YarnRPC rpc = YarnRPC.create(conf);
+ return (ApplicationClientProtocol)rpc.getProxy(ApplicationClientProtocol.class, addr, conf);
+{code}
+- YARN-1630.
+ Major bug reported by Aditya Acharya and fixed by Aditya Acharya (client)
+ Introduce timeout for async polling operations in YarnClientImpl
+ I ran an MR2 application that would have been long running, and killed it programmatically using a YarnClient. The app was killed, but the client hung forever. The message that I saw, which spammed the logs, was "Watiting for application application_1389036507624_0018 to be killed."
+
+The RM log indicated that the app had indeed transitioned from RUNNING to KILLED, but for some reason future responses to the RPC to kill the application did not indicate that the app had been terminated.
+
+I tracked this down to YarnClientImpl.java, and though I was unable to reproduce the bug, I wrote a patch to introduce a bound on the number of times that YarnClientImpl retries the RPC before giving up.
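The bounded wait described in YARN-1630 can be sketched as follows. This is a minimal illustration that assumes a time-based deadline rather than a retry count; `BoundedPoll` and `await` are hypothetical names and are not taken from YarnClientImpl.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not the YarnClientImpl implementation.
final class BoundedPoll {
    private BoundedPoll() {}

    // Polls isDone every intervalMs until it returns true or timeoutMs elapses.
    // Returns true if the condition was met, false on timeout or interruption.
    static boolean await(BooleanSupplier isDone, long timeoutMs, long intervalMs) {
        final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!isDone.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up instead of hanging forever
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }
}
```

A caller would pass a check such as "has the application reached KILLED" as `isDone` and treat a `false` return as a timeout to report, instead of looping indefinitely.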
+- YARN-1629.
+ Major bug reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)
+ IndexOutOfBoundsException in Fair Scheduler MaxRunningAppsEnforcer
+ This can occur when the second-to-last app in a queue's pending app list is made runnable. The app is pulled out from under the iterator.
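As a general illustration of the failure mode in YARN-1629 (not the MaxRunningAppsEnforcer code), the sketch below removes elements through the iterator itself, so the iterator's position stays consistent with the list. `PendingApps` and `makeRunnable` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; the real scheduler data structures differ.
final class PendingApps {
    private PendingApps() {}

    // Moves up to `slots` apps from the pending list to a new runnable list.
    // Removal goes through the iterator, never through pending.remove(...),
    // so no element is pulled out from under the iteration.
    static List<String> makeRunnable(List<String> pending, int slots) {
        List<String> runnable = new ArrayList<>();
        Iterator<String> it = pending.iterator();
        while (it.hasNext() && runnable.size() < slots) {
            String app = it.next();
            it.remove();
            runnable.add(app);
        }
        return runnable;
    }
}
```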
+- YARN-1628.
+ Major bug reported by Mit Desai and fixed by Vinod Kumar Vavilapalli
+ TestContainerManagerSecurity fails on trunk
+ The test fails with the following error:
+
+{noformat}
+java.lang.IllegalArgumentException: java.net.UnknownHostException: InvalidHost
+ at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
+ at org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.newInstance(BaseNMTokenSecretManager.java:145)
+ at org.apache.hadoop.yarn.server.security.BaseNMTokenSecretManager.createNMToken(BaseNMTokenSecretManager.java:136)
+ at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:253)
+ at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:144)
+{noformat}
+- YARN-1624.
+ Major bug reported by Aditya Acharya and fixed by Aditya Acharya (scheduler)
+ QueuePlacementPolicy format is not easily readable via a JAXB parser
+ The current format for specifying queue placement rules in the fair scheduler allocations file does not lend itself to easy parsing via a JAXB parser. In particular, relying on the tag name to encode information about which rule to use makes it very difficult for an xsd-based JAXB parser to preserve the order of the rules, which is essential.
+- YARN-1623.
+ Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)
+ Include queue name in RegisterApplicationMasterResponse
+ This provides the YARN change necessary to support MAPREDUCE-5732.
+- YARN-1618.
+ Blocker sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla (resourcemanager)
+ Fix invalid RMApp transition from NEW to FINAL_SAVING
+ YARN-891 augments the RMStateStore to store information on completed applications. In the process, it adds transitions from NEW to FINAL_SAVING. This leads to the RM trying to update entries in the state-store that do not exist. On ZKRMStateStore, this leads to the RM crashing.
+
+Previous description:
+ZKRMStateStore fails to handle updates to znodes that don't exist. For instance, this can happen when an app transitions from NEW to FINAL_SAVING. In these cases, the store should create the missing znode and handle the update.
+- YARN-1616.
+ Trivial improvement reported by Karthik Kambatla and fixed by Karthik Kambatla (resourcemanager)
+ RMFatalEventDispatcher should log the cause of the event
+ RMFatalEventDispatcher#handle() logs the receipt of an event and its type, but leaves out the cause. The cause captures why the event was raised and would help in debugging issues.
+- YARN-1608.
+ Trivial bug reported by Karthik Kambatla and fixed by Karthik Kambatla (nodemanager)
+ LinuxContainerExecutor has a few DEBUG messages at INFO level
+ LCE has a few INFO level log messages meant to be at debug level. In fact, they are logged both at INFO and DEBUG.
+- YARN-1607.
+ Major bug reported by Sandy Ryza and fixed by Sandy Ryza
+ TestRM expects the capacity scheduler
+ We should either explicitly set the Capacity Scheduler in the test or make the test scheduler-agnostic.
+- YARN-1603.
+ Trivial bug reported by Zhijie Shen and fixed by Zhijie Shen
+ Remove two *.orig files which were unexpectedly committed
+ FairScheduler.java.orig and TestFifoScheduler.java.orig
+- YARN-1601.
+ Major bug reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur
+ 3rd party JARs are missing from hadoop-dist output
+ With the build changes of YARN-888 we are leaving out all 3rd party JARs used directly by YARN under /share/hadoop/yarn/lib/.
+
+We did not notice this when running minicluster because they all happen to be in the classpath from hadoop-common and hadoop-yarn.
+
+As 3rd party JARs are not 'public' interfaces, we cannot rely on them being provided to YARN by common and hdfs (i.e., if common and hdfs stop using a 3rd party dependency that YARN uses, this would break YARN unless YARN pulls in that dependency explicitly).
+
+Also, this will break the Bigtop Hadoop build when it moves to branch-2, as it expects to find JARs in /share/hadoop/yarn/lib/.
+- YARN-1600.
+ Blocker bug reported by Jason Lowe and fixed by Haohui Mai (resourcemanager)
+ RM does not startup when security is enabled without spnego configured
+ We have a custom auth filter in front of our various UI pages that handles user authentication. However, the RM currently assumes that if security is enabled, the user must also have configured spnego for the RM web pages, which is not true in our case.
+- YARN-1598.
+ Critical sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla (client , resourcemanager)
+ HA-related rmadmin commands don't work on a secure cluster
+ The HA-related commands, such as -getServiceState and -checkHealth, don't work in a secure cluster.
+- YARN-1579.
+ Trivial sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla (resourcemanager)
+ ActiveRMInfoProto fields should be optional
+ Per discussion on YARN-1568, ActiveRMInfoProto should have optional fields instead of required fields.
+- YARN-1575.
+ Critical sub-task reported by Jason Lowe and fixed by Jason Lowe (nodemanager)
+ Public localizer crashes with "Localized unkown resource"
+ The public localizer can crash with the error:
+
+{noformat}
+2014-01-08 14:11:43,212 [Thread-467] ERROR org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localized unkonwn resource to java.util.concurrent.FutureTask@852e26
+2014-01-08 14:11:43,212 [Thread-467] INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Public cache exiting
+{noformat}
+- YARN-1574.
+ Blocker sub-task reported by Xuan Gong and fixed by Xuan Gong
+ RMDispatcher should be reset on transition to standby
+ Currently, we move rmDispatcher out of ActiveService, but we still register the event dispatchers, such as schedulerDispatcher and RMAppEventDispatcher, when we initialize the ActiveService.
+
+Almost every time we transition the RM from Active to Standby, we need to initialize the ActiveService again. That means we register the same event dispatchers again, which causes the same event to be handled several times.
+- YARN-1573.
+ Major sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla (resourcemanager)
+ ZK store should use a private password for root-node-acls
+ Currently, when HA is enabled, ZK store uses cluster-timestamp as the password for root node ACLs to give the Active RM exclusive access to the store. A more private value like a random number might be better.
+- YARN-1568.
+ Trivial task reported by Karthik Kambatla and fixed by Karthik Kambatla (resourcemanager)
+ Rename clusterid to clusterId in ActiveRMInfoProto
+ YARN-1029 introduces ActiveRMInfoProto - just realized it defines a field clusterid, which is inconsistent with other fields. Better to fix it immediately than leave the inconsistency.
+- YARN-1567.
+ Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)
+ In Fair Scheduler, allow empty queues to change between leaf and parent on allocation file reload
+
+- YARN-1560.
+ Major test reported by Ted Yu and fixed by Ted Yu
+ TestYarnClient#testAMMRTokens fails with null AMRM token
+ The following can be reproduced locally:
+{code}
+testAMMRTokens(org.apache.hadoop.yarn.client.api.impl.TestYarnClient) Time elapsed: 3.341 sec <<< FAILURE!
+junit.framework.AssertionFailedError: null
+ at junit.framework.Assert.fail(Assert.java:48)
+ at junit.framework.Assert.assertTrue(Assert.java:20)
+ at junit.framework.Assert.assertNotNull(Assert.java:218)
+ at junit.framework.Assert.assertNotNull(Assert.java:211)
+ at org.apache.hadoop.yarn.client.api.impl.TestYarnClient.testAMMRTokens(TestYarnClient.java:382)
+{code}
+This test didn't appear in https://builds.apache.org/job/Hadoop-Yarn-trunk/442/consoleFull
+- YARN-1559.
+ Blocker sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla (resourcemanager)
+ Race between ServerRMProxy and ClientRMProxy setting RMProxy#INSTANCE
+ RMProxy#INSTANCE is a non-final static field and both ServerRMProxy and ClientRMProxy set it. This leads to races, as witnessed on YARN-1482.
+
+Sample trace:
+{noformat}
+java.lang.IllegalArgumentException: RM does not support this client protocol
+ at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
+ at org.apache.hadoop.yarn.client.ClientRMProxy.checkAllowedProtocols(ClientRMProxy.java:119)
+ at org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.init(ConfiguredRMFailoverProxyProvider.java:58)
+ at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:158)
+ at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:88)
+ at org.apache.hadoop.yarn.server.api.ServerRMProxy.createRMProxy(ServerRMProxy.java:56)
+{noformat}
+- YARN-1549.
+ Major test reported by Ted Yu and fixed by haosdent
+ TestUnmanagedAMLauncher#testDSShell fails in trunk
+ The following error is reproducible:
+{code}
+testDSShell(org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher) Time elapsed: 14.911 sec <<< ERROR!
+java.lang.RuntimeException: Failed to receive final expected state in ApplicationReport, CurrentState=RUNNING, ExpectedStates=FINISHED,FAILED,KILLED
+ at org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.monitorApplication(UnmanagedAMLauncher.java:447)
+ at org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.run(UnmanagedAMLauncher.java:352)
+ at org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher.testDSShell(TestUnmanagedAMLauncher.java:147)
+{code}
+See https://builds.apache.org/job/Hadoop-Yarn-trunk/435
+- YARN-1541.
+ Major bug reported by Jian He and fixed by Jian He
+ Invalidate AM Host/Port when app attempt is done so that in the mean-while client doesn’t get wrong information.
+
+- YARN-1527.
+ Trivial bug reported by Jian He and fixed by Akira AJISAKA
+ yarn rmadmin command prints wrong usage info:
+ The usage should be: yarn rmadmin, instead of java RMAdmin, and the -refreshQueues should be in the second line.
+{code} Usage: java RMAdmin -refreshQueues
+ -refreshNodes
+ -refreshSuperUserGroupsConfiguration
+ -refreshUserToGroupsMappings
+ -refreshAdminAcls
+ -refreshServiceAcl
+ -getGroups [username]
+ -help [cmd]
+ -transitionToActive <serviceId>
+ -transitionToStandby <serviceId>
+ -failover [--forcefence] [--forceactive] <serviceId> <serviceId>
+ -getServiceState <serviceId>
+ -checkHealth <serviceId>
+{code}
+- YARN-1523.
+ Major sub-task reported by Bikas Saha and fixed by Karthik Kambatla
+ Use StandbyException instead of RMNotYetReadyException
+
+- YARN-1522.
+ Major bug reported by Liyin Liang and fixed by Liyin Liang
+ TestApplicationCleanup.testAppCleanup occasionally fails
+ TestApplicationCleanup is occasionally failing with the error:
+{code}
+-------------------------------------------------------------------------------
+Test set: org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup
+-------------------------------------------------------------------------------
+Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.215 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup
+testAppCleanup(org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup) Time elapsed: 5.555 sec <<< FAILURE!
+junit.framework.AssertionFailedError: expected:<1> but was:<0>
+at org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup.testAppCleanup(TestApplicationCleanup.java:119)
+{code}
+- YARN-1505.
+ Blocker bug reported by Xuan Gong and fixed by Xuan Gong
+ WebAppProxyServer should not set localhost as YarnConfiguration.PROXY_ADDRESS by itself
+ WebAppProxyServer::startServer() sets YarnConfiguration.PROXY_ADDRESS to localhost:9099 by itself. So, no matter what value we set for YarnConfiguration.PROXY_ADDRESS in the configuration, the proxy server binds to localhost:9099.
+- YARN-1491.
+ Trivial bug reported by Jonathan Eagles and fixed by Chen He
+ Upgrade JUnit3 TestCase to JUnit 4
+ There are still four references to test classes that extend from junit.framework.TestCase
+
+hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestYarnVersionInfo.java
+hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestWindowsResourceCalculatorPlugin.java
+hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestLinuxResourceCalculatorPlugin.java
+hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestWindowsBasedProcessTree.java
+
+- YARN-1485.
+ Major sub-task reported by Xuan Gong and fixed by Xuan Gong
+ Enabling HA should verify the RM service addresses configurations have been set for every RM Ids defined in RM_HA_IDs
+ After YARN-1325, YarnConfiguration.RM_HA_IDS will contain multiple RM_Ids. We need to verify that the RM service address configurations have been set for all of the RM_Ids.
+- YARN-1482.
+ Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Xuan Gong
+ WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM
+ This way, even if an RM goes to standby mode, we can effect a redirect to the active. And more importantly, users will not suddenly see all their links stop working.
+- YARN-1481.
+ Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli
+ Move internal services logic from AdminService to ResourceManager
+ This is something I found while reviewing YARN-1318, but I didn't halt that patch as many cycles had gone into it already. Some top-level issues:
+ - Not easy to follow RM's service life cycle
+ -- RM adds only AdminService as its service directly.
+ -- Other services are added to RM when AdminService's init calls RM.activeServices.init()
+ - Overall, AdminService shouldn't encompass all of RM's HA state management. It was originally supposed to be the implementation of just the RPC server.
+- YARN-1463.
+ Major test reported by Ted Yu and fixed by Vinod Kumar Vavilapalli
+ Tests should avoid starting http-server where possible or creates spnego keytab/principals
+ Here is the stack trace:
+{code}
+testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) Time elapsed: 1.756 sec <<< ERROR!
+org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: ResourceManager failed to start. Final state is STOPPED
+ at org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:253)
+ at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
+ at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
+ at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
+ at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:110)
+{code}
+- YARN-1454.
+ Critical bug reported by Jian He and fixed by Karthik Kambatla
+ TestRMRestart.testRMDelegationTokenRestoredOnRMRestart is failing intermittently
+
+- YARN-1451.
+ Minor bug reported by Sandy Ryza and fixed by Sandy Ryza
+ TestResourceManager relies on the scheduler assigning multiple containers in a single node update
+ TestResourceManager relies on the capacity scheduler.
+
+It relies on a scheduler that assigns multiple containers in a single heartbeat, which not all schedulers do by default. It also relies on schedulers that don't consider CPU capacities. It would be simple to change the test to use multiple heartbeats and increase the vcore capacities of the nodes in the test.
+- YARN-1450.
+ Major bug reported by Akira AJISAKA and fixed by Binglin Chang (applications/distributed-shell)
+ TestUnmanagedAMLauncher#testDSShell fails on trunk
+ TestUnmanagedAMLauncher fails on trunk. The console output is
+{code}
+Running org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher
+Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 35.937 sec <<< FAILURE! - in org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher
+testDSShell(org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher) Time elapsed: 14.558 sec <<< ERROR!
+java.lang.RuntimeException: Failed to receive final expected state in ApplicationReport, CurrentState=ACCEPTED, ExpectedStates=FINISHED,FAILED,KILLED
+ at org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.monitorApplication(UnmanagedAMLauncher.java:447)
+ at org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.run(UnmanagedAMLauncher.java:352)
+ at org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher.testDSShell(TestUnmanagedAMLauncher.java:145)
+{code}
+- YARN-1448.
+ Major sub-task reported by Wangda Tan and fixed by Wangda Tan (api , resourcemanager)
+ AM-RM protocol changes to support container resizing
+ As described in YARN-1197, we need to add an API in the RM to support:
+1) Add increase request in AllocateRequest
+2) Can get successfully increased/decreased containers from RM in AllocateResponse
+- YARN-1447.
+ Major sub-task reported by Wangda Tan and fixed by Wangda Tan (api)
+ Common PB type definitions for container resizing
+ As described in YARN-1197, we need to add some common PB types for container resource change, like ResourceChangeContext, etc. These types will be used by both the RM and NM protocols.
+- YARN-1446.
+ Major sub-task reported by Jian He and fixed by Jian He (resourcemanager)
+ Change killing application to wait until state store is done
+ When a user kills an application, the kill should wait until the state store has finished saving the killed status of the application. Otherwise, if the RM crashes between the user killing the application and the status being written to the store, the RM will relaunch this application after it restarts.
+- YARN-1435.
+ Major bug reported by Tassapol Athiapinya and fixed by Xuan Gong (applications/distributed-shell)
+ Distributed Shell should not run other commands except "sh", and run the custom script at the same time.
+ Currently, if we want to run a custom script in DS, we can do it like this:
+--shell_command sh --shell_script custom_script.sh
+But it may be better to separate running shell_command from running shell_script.
+- YARN-1425.
+ Major bug reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi
+ TestRMRestart fails because MockRM.waitForState(AttemptId) uses current attempt instead of the attempt passed as argument
+ TestRMRestart is failing on trunk. Fixing it.
+- YARN-1423.
+ Major improvement reported by Sandy Ryza and fixed by Ted Malaska (scheduler)
+ Support queue placement by secondary group in the Fair Scheduler
+
+- YARN-1419.
+ Minor bug reported by Jonathan Eagles and fixed by Jonathan Eagles (scheduler)
+ TestFifoScheduler.testAppAttemptMetrics fails intermittently under jdk7
+ QueueMetrics holds its data in a static variable, causing metrics to bleed over from test to test. clearQueueMetrics is to be called for tests that need to measure metrics correctly for a single test. JDK 7 comes into play since tests are run out of order, which in this case makes the metrics unreliable.
+- YARN-1416.
+ Major bug reported by Omkar Vinit Joshi and fixed by Jian He
+ InvalidStateTransitions getting reported in multiple test cases even though they pass
+ It might be worth checking why they are reporting this.
+Test cases: TestRMAppTransitions, TestRM
+There are a large number of such errors, for example:
+can't handle RMAppEventType.APP_UPDATE_SAVED at RMAppState.FAILED
+
+- YARN-1411.
+ Critical sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla
+ HA config shouldn't affect NodeManager RPC addresses
+ When HA is turned on, {{YarnConfiguration#getSocketAddr()}} fetches rpc-addresses corresponding to the specified rm-id. This should only be for RM rpc-addresses. Other confs, like NM rpc-addresses, shouldn't be affected by this.
+
+Currently, the NM address settings in yarn-site.xml aren't reflected in the actual ports.
+- YARN-1409.
+ Major bug reported by Tsuyoshi OZAWA and fixed by Tsuyoshi OZAWA
+ NonAggregatingLogHandler can throw RejectedExecutionException
+ This problem is caused by handling APPLICATION_FINISHED events after calling sched.shutdown() in NonAggregatingLogHandler#serviceStop(). org.apache.hadoop.mapred.TestJobCleanup can fail because of a RejectedExecutionException thrown by NonAggregatingLogHandler.
+
+{code}
+2013-11-13 10:53:06,970 FATAL [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(166)) - Error in dispatcher thread
+java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@d51df63 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@7a20e369[Shutting down, pool size = 4, active threads = 0, queued tasks = 7, completed tasks = 0]
+ at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
+ at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
+ at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:325)
+ at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:530)
+ at org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler.handle(NonAggregatingLogHandler.java:121)
+ at org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler.handle(NonAggregatingLogHandler.java:49)
+ at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:159)
+ at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:95)
+ at java.lang.Thread.run(Thread.java:724)
+{code}
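The failure mode in YARN-1409 can be reproduced in isolation: a `ScheduledThreadPoolExecutor` with the default rejection policy throws `RejectedExecutionException` when a task is scheduled after `shutdown()`. The sketch below shows one possible way to tolerate late events; `SafeScheduler` and `trySchedule` are hypothetical names, and this is not necessarily how NonAggregatingLogHandler was fixed.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper for illustration; not the NonAggregatingLogHandler code.
final class SafeScheduler {
    private final ScheduledThreadPoolExecutor sched = new ScheduledThreadPoolExecutor(1);

    // Returns true if the task was accepted, false if the executor has been shut down.
    boolean trySchedule(Runnable task, long delayMs) {
        try {
            sched.schedule(task, delayMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (RejectedExecutionException e) {
            // A late event after stop() is dropped instead of
            // propagating the exception into the dispatcher thread.
            return false;
        }
    }

    void stop() {
        sched.shutdown();
    }
}
```

Whether dropping a late task is acceptable depends on the task; for cleanup work that must still happen, running it inline or before shutdown may be the better choice.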
+- YARN-1407.
+ Major bug reported by Sandy Ryza and fixed by Sandy Ryza
+ RM Web UI and REST APIs should uniformly use YarnApplicationState
+ RMAppState isn't a public facing enum like YarnApplicationState, so we shouldn't return values or list filters that come from it. However, some Blocks and AppInfo are still using RMAppState.
+
+It is not 100% clear to me whether or not fixing this would be a backwards-incompatible change. The change would only reduce the set of possible strings that the API returns, so I think not. We have also been changing the contents of RMAppState since 2.2.0, e.g. in YARN-891. It would still be good to fix this ASAP (i.e. for 2.2.1).
+- YARN-1405.
+ Major sub-task reported by Yesha Vora and fixed by Jian He
+ RM hangs on shutdown if calling system.exit in serviceInit or serviceStart
+ Set yarn.resourcemanager.recovery.enabled=true and pass a local path, such as "file:///tmp/MYTMP", to yarn.resourcemanager.fs.state-store.uri.
+
+If the directory /tmp/MYTMP is not readable or writable, the RM should crash and print a "Permission denied" error.
+
+Currently, the RM throws a "java.io.FileNotFoundException: File file:/tmp/MYTMP/FSRMStateRoot/RMDTSecretManagerRoot does not exist" error. The RM logs that it is exiting with status 1, but the RM process does not shut down.
+
+Snapshot of Resource manager log:
+
+2013-09-27 18:31:36,621 INFO security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:rollMasterKey(97)) - Rolling master-key for nm-tokens
+2013-09-27 18:31:36,694 ERROR resourcemanager.ResourceManager (ResourceManager.java:serviceStart(640)) - Failed to load/recover state
+java.io.FileNotFoundException: File file:/tmp/MYTMP/FSRMStateRoot/RMDTSecretManagerRoot does not exist
+ at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:379)
+ at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1478)
+ at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1518)
+ at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:564)
+ at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadRMDTSecretManagerState(FileSystemRMStateStore.java:188)
+ at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:112)
+ at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:635)
+ at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
+ at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:855)
+2013-09-27 18:31:36,697 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
+- YARN-1403.
+ Major improvement reported by Sandy Ryza and fixed by Sandy Ryza
+ Separate out configuration loading from QueueManager in the Fair Scheduler
+
+- YARN-1401.
+ Major bug reported by Gera Shegalov and fixed by Gera Shegalov (nodemanager)
+ With zero sleep-delay-before-sigkill.ms, no signal is ever sent
+ If you set yarn.nodemanager.sleep-delay-before-sigkill.ms=0 in yarn-site.xml, an unresponsive child JVM is never killed. In MRv1, the TT used to SIGKILL immediately in this case.
+- YARN-1400.
+ Trivial bug reported by Raja Aluri and fixed by Raja Aluri (resourcemanager)
+ yarn.cmd uses HADOOP_RESOURCEMANAGER_OPTS. Should be YARN_RESOURCEMANAGER_OPTS.
+ yarn.cmd uses HADOOP_RESOURCEMANAGER_OPTS. Should be YARN_RESOURCEMANAGER_OPTS.
+- YARN-1395.
+ Major bug reported by Chris Nauroth and fixed by Chris Nauroth (applications/distributed-shell)
+ Distributed shell application master launched with debug flag can hang waiting for external ls process.
+ Distributed shell launched with the debug flag will run {{ApplicationMaster#dumpOutDebugInfo}}. This method launches an external process to run ls and print the contents of the current working directory. We've seen that this can cause the application master to hang on {{Process#waitFor}}.
+- YARN-1392.
+ Major new feature reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)
+ Allow sophisticated app-to-queue placement policies in the Fair Scheduler
+ Currently the Fair Scheduler supports app-to-queue placement by username. It would be beneficial to allow more sophisticated policies that rely on primary and secondary groups and fallbacks.
+- YARN-1388.
+ Trivial bug reported by Liyin Liang and fixed by Liyin Liang (resourcemanager)
+ Fair Scheduler page always displays blank fair share
+ YARN-1044 fixed the min/max/used resource display problem on the scheduler page, but "Fair Share" has the same problem and needs to be fixed.
+- YARN-1387.
+ Major improvement reported by Karthik Kambatla and fixed by Karthik Kambatla (api)
+ RMWebServices should use ClientRMService for filtering applications
+ YARN's REST API allows filtering applications. This should be moved to ClientRMService so that the Java API can support the same functionality.
+- YARN-1386.
+ Critical bug reported by Jason Lowe and fixed by Jason Lowe (nodemanager)
+ NodeManager mistakenly loses resources and relocalizes them
+ When a local resource that should already be present is requested again, the nodemanager checks to see if it is still present. However, it checks for presence by calling File.exists() as the user of the nodemanager process. If the resource was a private resource localized for another user, it will be localized to a location that is not accessible by the nodemanager user. Therefore File.exists() returns false, the nodemanager mistakenly believes the resource is no longer available, and it proceeds to localize it over and over.
+- YARN-1381.
+ Minor bug reported by Ted Yu and fixed by Ted Yu
+ Same relaxLocality appears twice in exception message of AMRMClientImpl#checkLocalityRelaxationConflict()
+ Here is related code:
+{code}
+ throw new InvalidContainerRequestException("Cannot submit a "
+ + "ContainerRequest asking for location " + location
+ + " with locality relaxation " + relaxLocality + " when it has "
+ + "already been requested with locality relaxation " + relaxLocality);
+{code}
+The last relaxLocality should be reqs.values().iterator().next().remoteRequest.getRelaxLocality()
+- YARN-1378.
+ Major sub-task reported by Jian He and fixed by Jian He (resourcemanager)
+ Implement a RMStateStore cleaner for deleting application/attempt info
+ Now that we are storing the final state of application/attempt instead of removing application/attempt info on application/attempt completion (YARN-891), we need a separate RMStateStore cleaner for cleaning the application/attempt state.
+- YARN-1374.
+ Blocker bug reported by Devaraj K and fixed by Karthik Kambatla (resourcemanager)
+ Resource Manager fails to start due to ConcurrentModificationException
+ Resource Manager is failing to start with the below ConcurrentModificationException.
+
+{code:xml}
+2013-10-30 20:22:42,371 INFO org.apache.hadoop.util.HostsFileReader: Refreshing hosts (include/exclude) list
+2013-10-30 20:22:42,376 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state INITED; cause: java.util.ConcurrentModificationException
+java.util.ConcurrentModificationException
+ at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
+ at java.util.AbstractList$Itr.next(AbstractList.java:343)
+ at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1010)
+ at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
+ at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:187)
+ at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
+ at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:944)
+2013-10-30 20:22:42,378 INFO org.apache.hadoop.yarn.server.resourcemanager.RMHAProtocolService: Transitioning to standby
+2013-10-30 20:22:42,378 INFO org.apache.hadoop.yarn.server.resourcemanager.RMHAProtocolService: Transitioned to standby
+2013-10-30 20:22:42,378 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
+java.util.ConcurrentModificationException
+ at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
+ at java.util.AbstractList$Itr.next(AbstractList.java:343)
+ at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1010)
+ at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
+ at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:187)
+ at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
+ at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:944)
+2013-10-30 20:22:42,379 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG:
+/************************************************************
+SHUTDOWN_MSG: Shutting down ResourceManager at HOST-10-18-40-24/10.18.40.24
+************************************************************/
+{code}
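The stack trace above shows an iterator over an unmodifiable view failing in CompositeService.serviceInit. A minimal sketch of that failure mode follows, assuming the backing list is modified while the view is being iterated; the class name and the snapshot approach are illustrative and are not necessarily the fix that was applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;

// Illustrative only; not the real CompositeService.
final class CompositeInitSketch {
    /** Iterates a read-only view while the backing list grows; returns true if a CME was thrown. */
    static boolean iterateViewWhileAdding() {
        List<String> services = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> view = Collections.unmodifiableList(services);
        try {
            for (String s : view) {
                // Simulates a service being registered while init is in progress.
                services.add(s + "-child");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Iterates a snapshot copy, so additions to the backing list do not disturb the loop. */
    static int iterateSnapshotWhileAdding() {
        List<String> services = new ArrayList<>(Arrays.asList("a", "b"));
        int visited = 0;
        for (String s : new ArrayList<>(services)) {
            services.add(s + "-child");
            visited++;
        }
        return visited;
    }
}
```

The unmodifiable wrapper only blocks writes through the view; it still shares the backing list's iterator, which is why the modification is detected.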
+- YARN-1358.
+ Minor test reported by Chuan Liu and fixed by Chuan Liu (client)
+ TestYarnCLI fails on Windows due to line endings
+ The unit test fails on Windows because incorrect line endings were used when comparing against the command-line output. Error messages are as follows.
+{noformat}
+junit.framework.ComparisonFailure: expected:<...argument for options[]
+usage: application
+...> but was:<...argument for options[
+]
+usage: application
+...>
+ at junit.framework.Assert.assertEquals(Assert.java:85)
+ at junit.framework.Assert.assertEquals(Assert.java:91)
+ at org.apache.hadoop.yarn.client.cli.TestYarnCLI.testMissingArguments(TestYarnCLI.java:878)
+{noformat}
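One way to make such comparisons independent of the platform is to normalize line endings on both sides before asserting. The helper below is a hypothetical sketch and is not taken from TestYarnCLI; the test may equally well build its expected string with the platform line separator.

```java
// Hypothetical test helper, not part of TestYarnCLI.
final class LineEndings {
    /** Replaces CRLF and lone CR with LF so comparisons ignore platform line endings. */
    static String normalize(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }
}
```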
+- YARN-1357.
+ Minor test reported by Chuan Liu and fixed by Chuan Liu (nodemanager)
+ TestContainerLaunch.testContainerEnvVariables fails on Windows
+ This test fails on Windows due to incorrect use of a batch script command. Error messages are as follows.
+{noformat}
+junit.framework.AssertionFailedError: expected:<java.nio.HeapByteBuffer[pos=0 lim=19 cap=19]> but was:<java.nio.HeapByteBuffer[pos=0 lim=19 cap=19]>
+ at junit.framework.Assert.fail(Assert.java:50)
+ at junit.framework.Assert.failNotEquals(Assert.java:287)
+ at junit.framework.Assert.assertEquals(Assert.java:67)
+ at junit.framework.Assert.assertEquals(Assert.java:74)
+ at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch.testContainerEnvVariables(TestContainerLaunch.java:508)
+{noformat}
+- YARN-1351.
+ Trivial bug reported by Konstantin Weitz and fixed by Konstantin Weitz (resourcemanager)
+ Invalid string format in Fair Scheduler log warn message
+ While trying to print a warning, two values of the wrong type (Resource instead of int) are passed into a String.format method call, leading to a runtime exception, in the file:
+
+_trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java_.
+
+The warning was intended to be printed whenever the resources don't fit into each other, either because the number of virtual cores or the memory is too small. I changed the %d's into %s; this way the warning will contain both the cores and the memory.
+
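The runtime exception described above comes from passing a non-integer object to a %d conversion. A minimal sketch, using a hypothetical FakeResource class in place of YARN's Resource type:

```java
import java.util.IllegalFormatConversionException;

final class FormatSketch {
    // Hypothetical stand-in for YARN's Resource; only toString() matters here.
    static final class FakeResource {
        final int memory;
        final int vcores;

        FakeResource(int memory, int vcores) {
            this.memory = memory;
            this.vcores = vcores;
        }

        @Override
        public String toString() {
            return "<memory:" + memory + ", vCores:" + vcores + ">";
        }
    }

    /** Returns true if formatting the value with %d throws. */
    static boolean percentDFails(Object value) {
        try {
            String.format("%d", value);
            return false;
        } catch (IllegalFormatConversionException e) {
            return true;
        }
    }

    /** %s accepts any object and uses its toString(). */
    static String percentS(Object value) {
        return String.format("%s", value);
    }
}
```

Because %s falls back to toString(), the formatted warning carries whatever the object chooses to print, which in this sketch is both memory and cores.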
+- YARN-1349.
+ Major bug reported by Chris Nauroth and fixed by Chris Nauroth (client)
+ yarn.cmd does not support passthrough to any arbitrary class.
+ The yarn shell script supports passthrough to calling any arbitrary class if the first argument is not one of the predefined sub-commands. The equivalent cmd script does not implement this and instead fails trying to do a labeled goto to the first argument.
+- YARN-1343.
+ Critical bug reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (resourcemanager)
+ NodeManagers additions/restarts are not reported as node updates in AllocateResponse responses to AMs
+ If a NodeManager joins the cluster or gets restarted, running AMs never receive the node update indicating the Node is running.
+- YARN-1335.
+ Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)
+ Move duplicate code from FSSchedulerApp and FiCaSchedulerApp into SchedulerApplication
+ FSSchedulerApp and FiCaSchedulerApp use duplicate code in a lot of places. They both extend SchedulerApplication. We can move a lot of this duplicate code into SchedulerApplication.
+- YARN-1333.
+ Major improvement reported by Sandy Ryza and fixed by Tsuyoshi OZAWA (scheduler)
+ Support blacklisting in the Fair Scheduler
+
+- YARN-1332.
+ Minor improvement reported by Sandy Ryza and fixed by Sebastian Wong
+ In TestAMRMClient, replace assertTrue with assertEquals where possible
+ TestAMRMClient uses a lot of "assertTrue(amClient.ask.size() == 0)" where "assertEquals(0, amClient.ask.size())" would make it easier to see why it's failing at a glance.
+- YARN-1331.
+ Trivial bug reported by Chris Nauroth and fixed by Chris Nauroth (client)
+ yarn.cmd exits with NoClassDefFoundError trying to run rmadmin or logs
+ The yarn shell script was updated so that the rmadmin and logs sub-commands launch {{org.apache.hadoop.yarn.client.cli.RMAdminCLI}} and {{org.apache.hadoop.yarn.client.cli.LogsCLI}}. The yarn.cmd script also needs to be updated so that the commands work on Windows.
+- YARN-1325.
+ Major sub-task reported by Tsuyoshi OZAWA and fixed by Xuan Gong (resourcemanager)
+ Enabling HA should check Configuration contains multiple RMs
+ Currently, RM HA can be enabled without configuring multiple RM ids (YarnConfiguration.RM_HA_IDS). This can lead to incorrect behaviour. ResourceManager should verify that more than one RM id is specified in RM-HA-IDs.
+
+One idea is to support a "strict mode" that enforces this check via configuration (e.g. yarn.resourcemanager.ha.strict-mode.enabled).
+- YARN-1323.
+ Major sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla
+ Set HTTPS webapp address along with other RPC addresses in HAUtil
+ YARN-1232 adds the ability to configure multiple RMs, but missed the HTTPS web app address. It needs to be added.
+- YARN-1321.
+ Blocker bug reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (client)
+ NMTokenCache is a singleton, prevents multiple AMs running in a single JVM to work correctly
+ NMTokenCache is a singleton. Because of this, when multiple AMs run in a single JVM, NMTokens for the same node from different AMs overwrite each other, and starting containers fails due to mismatched tokens.
+
+The error observed on the client side looks like this:
+
+{code}
+ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:llama (auth:PROXY) via llama (auth:SIMPLE) cause:org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
+NMToken for application attempt : appattempt_1382038445650_0002_000001 was used for starting container with container token issued for application attempt : appattempt_1382038445650_0001_000001
+{code}
+
+- YARN-1320.
+ Major bug reported by Tassapol Athiapinya and fixed by Xuan Gong (applications/distributed-shell)
+ Custom log4j properties in Distributed shell does not work properly.
+ Distributed shell cannot pick up custom log4j properties (specified with -log_properties). It always uses default log4j properties.
+- YARN-1318.
+ Blocker sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla (resourcemanager)
+ Promote AdminService to an Always-On service and merge in RMHAProtocolService
+ Per discussion in YARN-1068, we want AdminService to handle HA-admin operations in addition to the regular non-HA admin operations. To facilitate this, we need to make AdminService an Always-On service.
+- YARN-1315.
+ Major bug reported by Sandy Ryza and fixed by Sandy Ryza (resourcemanager , scheduler)
+ TestQueueACLs should also test FairScheduler
+
+- YARN-1314.
+ Major bug reported by Tassapol Athiapinya and fixed by Xuan Gong (applications/distributed-shell)
+ Cannot pass more than 1 argument to shell command
+ Distributed shell cannot accept more than one parameter in the argument part.
+
+All of these commands are treated as 1 parameter:
+
+/usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar <distributed shell jar> -shell_command echo -shell_args "'"My name" "is Teddy"'"
+/usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar <distributed shell jar> -shell_command echo -shell_args "''My name' 'is Teddy''"
+/usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar <distributed shell jar> -shell_command echo -shell_args "'My name' 'is Teddy'"
+- YARN-1311.
+ Trivial sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli
+ Fix app specific scheduler-events' names to be app-attempt based
+ Today, APP_ADDED and APP_REMOVED are sent to the scheduler. They are misnomers as schedulers only deal with AppAttempts today. This JIRA is for fixing their names so that we can add App-level events in the near future, notably for work-preserving RM-restart.
+- YARN-1307.
+ Major sub-task reported by Tsuyoshi OZAWA and fixed by Tsuyoshi OZAWA (resourcemanager)
+ Rethink znode structure for RM HA
+ Rethinking the znode structure for RM HA has been proposed in several JIRAs (YARN-659, YARN-1222). The motivation for this JIRA is quoted from Bikas' comment in YARN-1222:
+{quote}
+We should move to creating a node hierarchy for apps such that all znodes for an app are stored under an app znode instead of the app root znode. This will help in removeApplication and also in scaling better on ZK. The earlier code was written this way to ensure create/delete happens under a root znode for fencing. But given that we have moved to multi-operations globally, this isnt required anymore.
+{quote}
+- YARN-1306.
+ Major bug reported by Wei Yan and fixed by Wei Yan
+ Clean up hadoop-sls sample-conf according to YARN-1228
+ Move the fair scheduler allocations configuration to fair-scheduler.xml, and move all scheduler settings to yarn-site.xml.
+- YARN-1305.
+ Major sub-task reported by Tsuyoshi OZAWA and fixed by Tsuyoshi OZAWA (resourcemanager)
+ RMHAProtocolService#serviceInit should handle HAUtil's IllegalArgumentException
+ When yarn.resourcemanager.ha.enabled is true, RMHAProtocolService#serviceInit calls HAUtil.setAllRpcAddresses. If the configuration values are null, it just throws IllegalArgumentException.
+It's messy to analyse which keys are null, so we should handle it and log the name of keys which are null.
+
+A current log dump is as follows:
+{code}
+2013-10-15 06:24:53,431 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: registered UNIX signal handlers for [TERM, HUP, INT]
+2013-10-15 06:24:54,203 INFO org.apache.hadoop.service.AbstractService: Service RMHAProtocolService failed in state INITED; cause: java.lang.IllegalArgumentException: Property value must not be null
+java.lang.IllegalArgumentException: Property value must not be null
+ at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
+ at org.apache.hadoop.conf.Configuration.set(Configuration.java:816)
+ at org.apache.hadoop.conf.Configuration.set(Configuration.java:798)
+ at org.apache.hadoop.yarn.conf.HAUtil.setConfValue(HAUtil.java:100)
+ at org.apache.hadoop.yarn.conf.HAUtil.setAllRpcAddresses(HAUtil.java:105)
+ at org.apache.hadoop.yarn.server.resourcemanager.RMHAProtocolService.serviceInit(RMHAProtocolService.java:60)
+ at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
+ at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
+ at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:187)
+ at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
+ at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:940)
+{code}
+- YARN-1303.
+ Major improvement reported by Tassapol Athiapinya and fixed by Xuan Gong (applications/distributed-shell)
+ Allow multiple commands separating with ";" in distributed-shell
+ In shell, we can do "ls; ls" to run 2 commands at once.
+
+In distributed shell, this does not work, and we should allow it. There are practical use cases that I know of for running multiple commands or for setting environment variables before a command.
+- YARN-1300.
+ Major bug reported by Ted Yu and fixed by Ted Yu
+ SLS tests fail because conf puts yarn properties in fair-scheduler.xml
+ I was looking at https://builds.apache.org/job/PreCommit-YARN-Build/2165//testReport/org.apache.hadoop.yarn.sls/TestSLSRunner/testSimulatorRunning/
+I am able to reproduce the failure locally.
+
+I found that FairSchedulerConfiguration.getAllocationFile() doesn't read the yarn.scheduler.fair.allocation.file config entry from fair-scheduler.xml
+
+This leads to the following:
+{code}
+Caused by: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException: Bad fair scheduler config file: top-level element not <allocations>
+ at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.reloadAllocs(QueueManager.java:302)
+ at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.initialize(QueueManager.java:108)
+ at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.reinitialize(FairScheduler.java:1145)
+{code}
+- YARN-1295.
+ Major bug reported by Sandy Ryza and fixed by Sandy Ryza (nodemanager)
+ In UnixLocalWrapperScriptBuilder, using bash -c can cause "Text file busy" errors
+ I missed this when working on YARN-1271.
+- YARN-1293.
+ Major bug reported by Tsuyoshi OZAWA and fixed by Tsuyoshi OZAWA
+ TestContainerLaunch.testInvalidEnvSyntaxDiagnostics fails on trunk
+ {quote}
+-------------------------------------------------------------------------------
+Test set: org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch
+-------------------------------------------------------------------------------
+Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 12.655 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch
+testInvalidEnvSyntaxDiagnostics(org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch) Time elapsed: 0.114 sec <<< FAILURE!
+junit.framework.AssertionFailedError: null
+ at junit.framework.Assert.fail(Assert.java:48)
+ at junit.framework.Assert.assertTrue(Assert.java:20)
+ at junit.framework.Assert.assertTrue(Assert.java:27)
+ at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch.testInvalidEnvSyntaxDiagnostics(TestContainerLaunch.java:273)
+{quote}
+- YARN-1290.
+ Major improvement reported by Wei Yan and fixed by Wei Yan
+ Let continuous scheduling achieve more balanced task assignment
+ Currently, in continuous scheduling (YARN-1010), in each round, the thread iterates over pre-ordered nodes and assigns tasks. This mechanism may overload the first several nodes, while the latter nodes have no tasks.
+
+We should sort all nodes according to available resource. In each round, always assign tasks to nodes with larger capacity, which can balance the load distribution among all nodes.
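The proposed ordering can be sketched as a sort by available resource, largest first. The Node class and its single availableMb field are assumptions made for illustration; the real scheduler tracks richer resource objects.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class NodeOrdering {
    // Hypothetical node record; availableMb stands in for a node's free resources.
    static final class Node {
        final String name;
        final int availableMb;

        Node(String name, int availableMb) {
            this.name = name;
            this.availableMb = availableMb;
        }
    }

    /** Returns a new list with the nodes that have the most available memory first. */
    static List<Node> byAvailableDescending(List<Node> nodes) {
        List<Node> sorted = new ArrayList<>(nodes);
        sorted.sort(Comparator.comparingInt((Node n) -> n.availableMb).reversed());
        return sorted;
    }
}
```

Sorting a copy in each round leaves the original node list untouched, at the cost of one allocation per round.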
+- YARN-1288.
+ Major bug reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)
+ Make Fair Scheduler ACLs more user friendly
+ The Fair Scheduler currently defaults the root queue's acl to empty and all other queues' acl to "*". Now that YARN-1258 enables configuring the root queue, we should reverse this. This will also bring the Fair Scheduler in line with the Capacity Scheduler.
+
+We should also not trim the acl strings, which makes it impossible to only specify groups in an acl.
+- YARN-1284.
+ Blocker bug reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (nodemanager)
+ LCE: Race condition leaves dangling cgroups entries for killed containers
+ When LCE and cgroups are enabled and a container is killed (in this case by its owning AM, an MRAM), there seems to be a race condition at the OS level between sending SIGTERM/SIGKILL and the OS completing all necessary cleanup.
+
+LCE code, after sending the SIGTERM/SIGKILL and getting the exitcode, immediately attempts to clean up the cgroups entry for the container. But this is failing with an error like:
+
+{code}
+2013-10-07 15:21:24,359 WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exit code from container container_1381179532433_0016_01_000011 is : 143
+2013-10-07 15:21:24,359 DEBUG org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Processing container_1381179532433_0016_01_000011 of type UPDATE_DIAGNOSTICS_MSG
+2013-10-07 15:21:24,359 DEBUG org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler: deleteCgroup: /run/cgroups/cpu/hadoop-yarn/container_1381179532433_0016_01_000011
+2013-10-07 15:21:24,359 WARN org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler: Unable to delete cgroup at: /run/cgroups/cpu/hadoop-yarn/container_1381179532433_0016_01_000011
+{code}
+
+
+CgroupsLCEResourcesHandler.clearLimits() has logic to wait for 500 ms for AM containers to avoid this problem. It seems this should be done for all containers.
+
+Still, waiting an extra 500 ms seems too expensive.
+
+We should look for a more time-efficient way of doing this, perhaps spinning with a minimal sleep and a timeout while deleteCgroup() cannot complete.
+
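The spin-with-timeout idea can be sketched as follows. The class name, method name, and parameters are hypothetical, and java.io.File.delete() stands in for whatever deleteCgroup() does internally.

```java
import java.io.File;

// Illustrative sketch; not the CgroupsLCEResourcesHandler implementation.
final class RetryingDelete {
    /**
     * Retries dir.delete() with a short sleep between attempts until it
     * succeeds or timeoutMs has elapsed. Returns true on success.
     */
    static boolean deleteWithTimeout(File dir, long timeoutMs, long sleepMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            if (dir.delete()) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt and give up early.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

In the common case the first attempt succeeds and no sleep happens at all, so the fixed 500 ms wait is only approached when the OS is slow to release the entry.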
+- YARN-1283.
+ Major sub-task reported by Yesha Vora and fixed by Omkar Vinit Joshi
+ Invalid 'url of job' mentioned in Job output with yarn.http.policy=HTTPS_ONLY
+ After setting yarn.http.policy=HTTPS_ONLY, the job output shows incorrect "The url to track the job".
+
+Currently, it prints http://RM:<httpsport>/proxy/application_1381162886563_0001/ instead of https://RM:<httpsport>/proxy/application_1381162886563_0001/
+
+http://hostname:8088/proxy/application_1381162886563_0001/ is invalid
+
+hadoop jar hadoop-mapreduce-client-jobclient-tests.jar sleep -m 1 -r 1
+13/10/07 18:39:39 INFO client.RMProxy: Connecting to ResourceManager at hostname/100.00.00.000:8032
+13/10/07 18:39:40 INFO mapreduce.JobSubmitter: number of splits:1
+13/10/07 18:39:40 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
+13/10/07 18:39:40 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
+13/10/07 18:39:40 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
+13/10/07 18:39:40 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
+13/10/07 18:39:40 INFO Configuration.deprecation: mapreduce.partitioner.class is deprecated. Instead, use mapreduce.job.partitioner.class
+13/10/07 18:39:40 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
+13/10/07 18:39:40 INFO Configuration.deprecation: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
+13/10/07 18:39:40 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
+13/10/07 18:39:40 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
+13/10/07 18:39:40 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
+13/10/07 18:39:40 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
+13/10/07 18:39:40 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
+13/10/07 18:39:40 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
+13/10/07 18:39:40 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
+13/10/07 18:39:40 INFO Configuration.deprecation: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
+13/10/07 18:39:40 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
+13/10/07 18:39:40 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1381162886563_0001
+13/10/07 18:39:40 INFO impl.YarnClientImpl: Submitted application application_1381162886563_0001 to ResourceManager at hostname/100.00.00.000:8032
+13/10/07 18:39:40 INFO mapreduce.Job: The url to track the job: http://hostname:8088/proxy/application_1381162886563_0001/
+13/10/07 18:39:40 INFO mapreduce.Job: Running job: job_1381162886563_0001
+13/10/07 18:39:46 INFO mapreduce.Job: Job job_1381162886563_0001 running in uber mode : false
+13/10/07 18:39:46 INFO mapreduce.Job: map 0% reduce 0%
+13/10/07 18:39:53 INFO mapreduce.Job: map 100% reduce 0%
+13/10/07 18:39:58 INFO mapreduce.Job: map 100% reduce 100%
+13/10/07 18:39:58 INFO mapreduce.Job: Job job_1381162886563_0001 completed successfully
+13/10/07 18:39:58 INFO mapreduce.Job: Counters: 43
+ File System Counters
+ FILE: Number of bytes read=26
+ FILE: Number of bytes written=177279
+ FILE: Number of read operations=0
+ FILE: Number of large read operations=0
+ FILE: Number of write operations=0
+ HDFS: Number of bytes read=48
+ HDFS: Number of bytes written=0
+ HDFS: Number of read operations=1
+ HDFS: Number of large read operations=0
+ HDFS: Number of write operations=0
+ Job Counters
+ Launched map tasks=1
+ Launched reduce tasks=1
+ Other local map tasks=1
+ Total time spent by all maps in occupied slots (ms)=7136
+ Total time spent by all reduces in occupied slots (ms)=6062
+ Map-Reduce Framework
+ Map input records=1
+ Map output records=1
+ Map output bytes=4
+ Map output materialized bytes=22
+ Input split bytes=48
+ Combine input records=0
+ Combine output records=0
+ Reduce input groups=1
+ Reduce shuffle bytes=22
+ Reduce input records=1
+ Reduce output records=0
+ Spilled Records=2
+ Shuffled Maps =1
+ Failed Shuffles=0
+ Merged Map outputs=1
+ GC time elapsed (ms)=60
+ CPU time spent (ms)=1700
+ Physical memory (bytes) snapshot=567582720
+ Virtual memory (bytes) snapshot=4292997120
+ Total committed heap usage (bytes)=846594048
+ Shuffle Errors
+ BAD_ID=0
+ CONNECTION=0
+ IO_ERROR=0
+ WRONG_LENGTH=0
+ WRONG_MAP=0
+ WRONG_REDUCE=0
+ File Input Format Counters
+ Bytes Read=0
+ File Output Format Counters
+ Bytes Written=0
+
+
+- YARN-1268.
+ Major bug reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)
+ TestFairScheduler.testContinuousScheduling is flaky
+ It looks like there's a timeout in it that's causing it to be flaky.
+- YARN-1265.
+ Major bug reported by Sandy Ryza and fixed by Sandy Ryza (resourcemanager , scheduler)
+ Fair Scheduler chokes on unhealthy node reconnect
+ Only nodes in the RUNNING state are tracked by schedulers. When a node reconnects, RMNodeImpl.ReconnectNodeTransition tries to remove it, even if it's in the RUNNING state. The FairScheduler doesn't guard against this.
+
+I think the best way to fix this is to check to see whether a node is RUNNING before telling the scheduler to remove it.
+- YARN-1259.
+ Trivial bug reported by Sandy Ryza and fixed by Robert Kanter (scheduler)
+ In Fair Scheduler web UI, queue num pending and num active apps switched
+ The values returned in FairSchedulerLeafQueueInfo by numPendingApplications and numActiveApplications should be switched.
+- YARN-1258.
+ Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)
+ Allow configuring the Fair Scheduler root queue
+ This would be useful for acls, maxRunningApps, scheduling modes, etc.
+
+The allocation file should be able to accept both:
+* An implicit root queue
+* A root queue at the top of the hierarchy with all queues under/inside of it
+- YARN-1253.
+ Blocker new feature reported by Alejandro Abdelnur and fixed by Roman Shaposhnik (nodemanager)
+ Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode
+ When using cgroups we require LCE to be configured in the cluster to start containers.
+
+LCE starts containers as the user that submitted the job. While this works correctly in a secure setup, in a non-secure setup this presents a couple of issues:
+
+* LCE requires all Hadoop users submitting jobs to be Unix users in all nodes
+* Because users can impersonate other users, any user would have access to any local file of other users
+
+In particular, the second issue is undesirable, as a user could gain access to the ssh keys of other users on the nodes or, if there are NFS mounts, to other users' data outside of the cluster.
+- YARN-1241.
+ Major bug reported by Sandy Ryza and fixed by Sandy Ryza
+ In Fair Scheduler, maxRunningApps does not work for non-leaf queues
+ Setting the maxRunningApps property on a parent queue should ensure that the total number of apps across all its subqueues cannot exceed it.
+- YARN-1239.
+ Major sub-task reported by Bikas Saha and fixed by Jian He (resourcemanager)
+ Save version information in the state store
+ When creating root dir for the first time we should write version 1. If root dir exists then we should check that the version in the state store matches the version from config.
+- YARN-1232.
+ Major sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla (resourcemanager)
+ Configuration to support multiple RMs
+ We should augment the configuration to allow users to specify two RMs and the individual RPC addresses for them.
+- YARN-1222.
+ Major sub-task reported by Bikas Saha and fixed by Karthik Kambatla
+ Make improvements in ZKRMStateStore for fencing
+ Use multi-operations for every ZK interaction.
+In every operation, automatically create and delete a lock znode that is a child of the root znode. This achieves fencing by modifying the create/delete permissions on the root znode.
+- YARN-1210.
+ Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Omkar Vinit Joshi
+ During RM restart, RM should start a new attempt only when previous attempt exits for real
+ When the RM recovers, it can wait for existing AMs to contact the RM and then kill them forcefully before starting a new AM. In the worst case, the RM will start a new AppAttempt after waiting for 10 minutes (the expiry interval). This minimizes multiple AMs racing with each other and can help with issues in downstream components like Pig, Hive and Oozie during RM restart.
+
+In the meantime, new apps will proceed as usual while existing apps wait for recovery.
+
+This can continue to be useful after work-preserving restart, so that AMs which can properly sync back up with RM can continue to run and those that don't are guaranteed to be killed before starting a new attempt.
+- YARN-1199.
+ Major improvement reported by Mit Desai and fixed by Mit Desai
+ Make NM/RM Versions Available
+ Now that we have the NM and RM versions available, we can display the YARN version of nodes running in the cluster.
+
+
+- YARN-1188.
+ Trivial bug reported by Akira AJISAKA and fixed by Tsuyoshi OZAWA
+ The context of QueueMetrics becomes 'default' when using FairScheduler
+ I found the context of QueueMetrics changed to 'default' from 'yarn' when I was using FairScheduler.
+The context should always be 'yarn' by adding an annotation to FSQueueMetrics like below:
+
+{code}
++ @Metrics(context="yarn")
+public class FSQueueMetrics extends QueueMetrics {
+{code}
+- YARN-1185.
+ Major sub-task reported by Jason Lowe and fixed by Omkar Vinit Joshi (resourcemanager)
+ FileSystemRMStateStore can leave partial files that prevent subsequent recovery
+ FileSystemRMStateStore writes directly to the destination file when storing state. However if the RM were to crash in the middle of the write, the recovery method could encounter a partially-written file and either outright crash during recovery or silently load incomplete state.
+
+To avoid this, the data should be written to a temporary file and renamed to the destination file afterwards.
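The write-then-rename pattern described above can be sketched with java.nio.file. This is an illustration only: FileSystemRMStateStore works against Hadoop's FileSystem API, and the class name and the ".tmp" suffix below are assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative sketch using java.nio; not the Hadoop FileSystem-based store.
final class AtomicStore {
    /**
     * Writes data to a sibling temporary file and then renames it onto dest,
     * so a reader never observes a partially written dest.
     */
    static void store(Path dest, byte[] data) {
        Path tmp = dest.resolveSibling(dest.getFileName() + ".tmp");
        try {
            Files.write(tmp, data);
            // Throws AtomicMoveNotSupportedException (an IOException) if the
            // file system cannot rename atomically.
            Files.move(tmp, dest, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the process dies mid-write, only the temporary file is incomplete, and recovery can ignore or delete files with the temporary suffix.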
+- YARN-1183.
+ Major bug reported by Andrey Klochkov and fixed by Andrey Klochkov
+ MiniYARNCluster shutdown takes several minutes intermittently
+ As described in MAPREDUCE-5501, M/R tests sometimes leave MRAppMaster java processes alive for several minutes after successful completion of the corresponding test. There is a concurrency issue in the MiniYARNCluster shutdown logic which leads to this. Sometimes the RM stops before an app master sends its last report, and then the app master keeps retrying for >6 minutes. In some cases this leads to failures in subsequent tests, and it affects test performance because app masters consume resources.
+- YARN-1182.
+ Major bug reported by Karthik Kambatla and fixed by Karthik Kambatla
+ MiniYARNCluster creates and inits the RM/NM only on start()
+ MiniYARNCluster creates and inits the RM/NM only on start(). It should create and init() during init() itself.
+- YARN-1181.
+ Major sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla
+ Augment MiniYARNCluster to support HA mode
+ MiniYARNHACluster, along the lines of MiniYARNCluster, is needed for end-to-end HA tests.
+- YARN-1180.
+ Trivial bug reported by Thomas Graves and fixed by Chen He (capacityscheduler)
+ Update capacity scheduler docs to include types on the configs
+ The capacity scheduler docs (http://hadoop.apache.org/docs/r2.1.0-beta/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html) don't include types for all the configs. For instance, minimum-user-limit-percent doesn't say it's an Int. It is also the only setting among the Resource Allocation configs that is an Int rather than a float.
+- YARN-1176.
+ Critical bug reported by Thomas Graves and fixed by Jonathan Eagles (resourcemanager)
+ RM web services ClusterMetricsInfo total nodes doesn't include unhealthy nodes
+ In the web services api for the cluster/metrics, the totalNodes reported doesn't include the unhealthy nodes.
+
+this.totalNodes = activeNodes + lostNodes + decommissionedNodes
+ + rebootedNodes;
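The sum quoted above omits the unhealthy count. A sketch of a total that includes it, with hypothetical class and parameter names rather than the actual ClusterMetricsInfo fields:

```java
// Illustrative arithmetic only; not the ClusterMetricsInfo class.
final class ClusterNodeCounts {
    /** Sums every node state, including unhealthy nodes. */
    static int totalNodes(int active, int lost, int decommissioned, int rebooted, int unhealthy) {
        return active + lost + decommissioned + rebooted + unhealthy;
    }
}
```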
+- YARN-1172.
+ Major sub-task reported by Karthik Kambatla and fixed by Tsuyoshi OZAWA (resourcemanager)
+ Convert *SecretManagers in the RM to services
+
+- YARN-1145.
+ Major bug reported by Rohith and fixed by Rohith
+ Potential file handle leak in aggregated logs web ui
+ If any problem occurs while getting aggregated logs for rendering on the web UI, the LogReader is not closed.
+
+Because the reader is not closed, many connections are left in the CLOSE_WAIT state.
+
+hadoopuser@hadoopuser:> jps
+*27909* JobHistoryServer
+
+The DataNode port is 50010. Grepping for the DataNode port shows many connections from the JHS in CLOSE_WAIT.
+hadoopuser@hadoopuser:> netstat -tanlp |grep 50010
+tcp 0 0 10.18.40.48:50010 0.0.0.0:* LISTEN 21453/java
+tcp 1 0 10.18.40.48:20596 10.18.40.48:50010 CLOSE_WAIT *27909*/java
+tcp 1 0 10.18.40.48:19667 10.18.40.152:50010 CLOSE_WAIT *27909*/java
+tcp 1 0 10.18.40.48:20593 10.18.40.48:50010 CLOSE_WAIT *27909*/java
+tcp 1 0 10.18.40.48:12290 10.18.40.48:50010 CLOSE_WAIT *27909*/java
+tcp 1 0 10.18.40.48:19662 10.18.40.152:50010 CLOSE_WAIT *27909*/java
+- YARN-1138.
+ Major bug reported by Yingda Chen and fixed by Chuan Liu (api)
+ yarn.application.classpath is set to point to $HADOOP_CONF_DIR etc., which does not work on Windows
+ yarn-default.xml has the "yarn.application.classpath" entry set to $HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/,$HADOOP_COMMON_HOME/share/hadoop/common/lib/,$HADOOP_HDFS_HOME/share/hadoop/hdfs/,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/,$HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib. This does not work on Windows and needs to be fixed.
+- YARN-1121.
+ Major sub-task reported by Bikas Saha and fixed by Jian He (resourcemanager)
+ RMStateStore should flush all pending store events before closing
+ On serviceStop, the store should wait for all internal pending events to drain before stopping.
+- YARN-1119.
+ Major test reported by Robert Parker and fixed by Mit Desai (resourcemanager)
+ Add ClusterMetrics checks to the TestRMNodeTransitions tests
+ YARN-1101 identified an issue where UNHEALTHY nodes could double decrement the active nodes. We should add checks for RUNNING node transitions.
+- YARN-1109.
+ Major improvement reported by Sandy Ryza and fixed by haosdent (nodemanager)
+ Demote NodeManager "Sending out status for container" logs to debug
+ Diagnosing NodeManager and container launch problems is made more difficult by the enormous number of logs like
+{code}
+Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 18, cluster_timestamp: 1377559361179, }, attemptId: 1, }, id: 1337, }, state: C_RUNNING, diagnostics: "Container killed by the ApplicationMaster.\n", exit_status: -1000
+{code}
+
+On an NM with a few containers I am seeing tens of these per second.
+- YARN-1101.
+ Major bug reported by Robert Parker and fixed by Robert Parker (resourcemanager)
+ Active nodes can be decremented below 0
+ The issue is in RMNodeImpl, where both the RUNNING and UNHEALTHY states that transition to a deactivated state (LOST, DECOMMISSIONED, REBOOTED) use the same DeactivateNodeTransition class. The DeactivateNodeTransition class naturally decrements the active node count; however, in cases where the node has transitioned to UNHEALTHY, the active count has already been decremented.
+- YARN-1098.
+ Major sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla (resourcemanager)
+ Separate out RM services into "Always On" and "Active"
+ From the discussion on YARN-1027, it makes sense to separate stateful services from stateless ones. The stateless services can run perennially irrespective of whether the RM is in the Active or Standby state, while the stateful services need to be started on transitionToActive() and completely shut down on transitionToStandby().
+
+The external-facing stateless services should respond to the client/AM/NM requests depending on whether the RM is Active/Standby.
+
+- YARN-1068.
+ Major sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla (resourcemanager)
+ Add admin support for HA operations
+ Support HA admin operations to facilitate transitioning the RM to Active and Standby states.
+- YARN-1060.
+ Major bug reported by Sandy Ryza and fixed by Niranjan Singh (scheduler)
+ Two tests in TestFairScheduler are missing @Test annotation
+ Amazingly, these tests appear to pass with the annotations added.
+- YARN-1053.
+ Blocker bug reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi
+ Diagnostic message from ContainerExitEvent is ignored in ContainerImpl
+ If the container launch fails, we send a ContainerExitEvent. This event contains the exitCode and a diagnostic message. Today we ignore the diagnostic message while handling this event inside ContainerImpl. This should be fixed, as the message is useful in diagnosing the failure.
+- YARN-1044.
+ Critical bug reported by Sangjin Lee and fixed by Sangjin Lee (resourcemanager , scheduler)
+ used/min/max resources do not display info in the scheduler page
+ Go to the scheduler page in the RM, and click any queue to display the detailed info. You'll find that none of the resource entries (used, min, or max) display values.
+
+It is because the values contain brackets ("<" and ">") and are not properly html-escaped.
+- YARN-1033.
+ Major sub-task reported by Nemon Lou and fixed by Karthik Kambatla
+ Expose RM active/standby state to Web UI and REST API
+ Both the active and standby RMs shall expose their web servers and show their current state (active or standby) on the web page. Users should be able to access this information through the REST API as well.
+- YARN-1029.
+ Major sub-task reported by Bikas Saha and fixed by Karthik Kambatla
+ Allow embedding leader election into the RM
+ It should be possible to embed the common ActiveStandbyElector into the RM such that ZooKeeper-based leader election and notification are built in. In conjunction with a ZK state store, this configuration will be a simple deployment option.
+- YARN-1028.
+ Major sub-task reported by Bikas Saha and fixed by Karthik Kambatla
+ Add FailoverProxyProvider like capability to RMProxy
+ RMProxy layer currently abstracts RM discovery and implements it by looking up service information from configuration. Motivated by HDFS and using existing classes from Common, we can add failover proxy providers that may provide RM discovery in extensible ways.
+- YARN-1027.
+ Major sub-task reported by Bikas Saha and fixed by Karthik Kambatla
+ Implement RMHAProtocolService
+ Implement existing HAServiceProtocol from Hadoop common. This protocol is the single point of interaction between the RM and HA clients/services.
+- YARN-1022.
+ Trivial bug reported by Bikas Saha and fixed by haosdent
+ Unnecessary INFO logs in AMRMClientAsync
+ Logs like the following should be debug or else every legitimate stop causes unnecessary exception traces in the logs.
+
+2013-08-03 20:01:34,459 INFO [AMRM Heartbeater thread] org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl: Heartbeater interrupted
+java.lang.InterruptedException: sleep interrupted
+    at java.lang.Thread.sleep(Native Method)
+    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:249)
+2013-08-03 20:01:34,460 INFO [AMRM Callback Handler Thread] org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl: Interrupted while waiting for queue
+java.lang.InterruptedException
+    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1961)
+    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1996)
+    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
+    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:275)
+- YARN-1021.
+ Major new feature reported by Wei Yan and fixed by Wei Yan (scheduler)
+ Yarn Scheduler Load Simulator
+ The Yarn Scheduler is a fertile area of interest with different implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, several optimizations have also been made to improve scheduler performance for different scenarios and workloads. Each scheduler algorithm has its own set of features, and drives scheduling decisions by many factors, such as fairness, capacity guarantee, resource availability, etc. It is very important to evaluate a scheduler algorithm thoroughly before deploying it in a production cluster. Unfortunately, evaluating a scheduling algorithm is currently non-trivial. Evaluating in a real cluster is always time-consuming and costly, and it is also very hard to find a large-enough cluster. Hence, a simulator that can predict how well a scheduler algorithm performs for some specific workload would be quite useful.
+
+We want to build a Scheduler Load Simulator to simulate large-scale Yarn clusters and application loads on a single machine. This would be invaluable in furthering Yarn by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with a reasonable amount of confidence, thereby aiding rapid innovation.
+
+The simulator will exercise the real Yarn ResourceManager, removing the network factor by simulating NodeManagers and ApplicationMasters via handling and dispatching NM/AM heartbeat events from within the same JVM.
+
+To keep track of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler.
+
+The simulator will produce real time metrics while executing, including:
+
+* Resource usage for the whole cluster and each queue, which can be used to configure cluster and queue capacity.
+* The detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand/validate the scheduler behavior (individual job turnaround time, throughput, fairness, capacity guarantee, etc.).
+* Several key metrics of the scheduler algorithm, such as the time cost of each scheduler operation (allocate, handle, etc.), which Hadoop developers can use to find code hot spots and scalability limits.
+
+The simulator will provide real time charts showing the behavior of the scheduler and its performance.
+
+A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use the simulator to simulate the Fair Scheduler and Capacity Scheduler.
+- YARN-1010.
+ Critical improvement reported by Alejandro Abdelnur and fixed by Wei Yan (scheduler)
+ FairScheduler: decouple container scheduling from nodemanager heartbeats
+ Currently scheduling for a node is done when a node heartbeats.
+
+For large clusters where the heartbeat interval is set to several seconds, this delays scheduling of incoming allocations significantly.
+
+We could have a continuous loop scanning all nodes and doing scheduling. If there is availability, AMs will get the allocation in the next heartbeat after the one that placed the request.
+- YARN-985.
+ Major improvement reported by Ravi Prakash and fixed by Ravi Prakash (nodemanager)
+ Nodemanager should log where a resource was localized
+ When a resource is localized, we should log WHERE on the local disk it was localized. This helps in debugging afterwards (e.g. if the disk were to go bad).
+- YARN-976.
+ Major sub-task reported by Sandy Ryza and fixed by Sandy Ryza (documentation)
+ Document the meaning of a virtual core
+ As virtual cores are a somewhat novel concept, it would be helpful to have thorough documentation that clarifies their meaning.
+- YARN-895.
+ Major sub-task reported by Jian He and fixed by Jian He (resourcemanager)
+ RM crashes if it restarts while the state-store is down
+
+- YARN-891.
+ Major sub-task reported by Bikas Saha and fixed by Jian He (resourcemanager)
+ Store completed application information in RM state store
+ Store completed application/attempt info in RMStateStore when an application/attempt completes. This solves problems such as finished applications getting lost after an RM restart, as well as races like YARN-1195.
+- YARN-888.
+ Major bug reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur
+ clean up POM dependencies
+ Intermediate 'pom' modules define dependencies inherited by leaf modules.
+
+This is causing issues in the IntelliJ IDE.
+
+We should normalize the leaf modules as in common, hdfs and tools, where all dependencies are defined in each leaf module and the intermediate 'pom' modules do not define any dependencies.
+- YARN-879.
+ Major bug reported by Junping Du and fixed by Junping Du
+ Fix tests w.r.t o.a.h.y.server.resourcemanager.Application
+ getResources() should return a list of containers allocated by the RM. However, it currently returns null directly. Worse, if LOG.debug is enabled, this will definitely cause an NPE.
+- YARN-819.
+ Major sub-task reported by Robert Parker and fixed by Robert Parker (nodemanager , resourcemanager)
+ ResourceManager and NodeManager should check for a minimum allowed version
+ Our use case: during an upgrade on a large cluster, several NodeManagers may not restart with the new version. Once the RM comes back up, those NodeManagers will re-register with the RM without issue.
+
+The NM should report its version to the RM. The RM should have a configuration that either disables the check (default), requires a version equal to the RM's (to prevent a config change for each release), requires a version equal to or greater than the RM's (to allow NM upgrades), or specifies an explicit version or version range.
+
+The RM should also have a configuration for how to treat a mismatch: REJECT or REBOOT the NM.
+- YARN-807.
+ Major improvement reported by Sandy Ryza and fixed by Sandy Ryza
+ When querying apps by queue, iterating over all apps is inefficient and limiting
+ The question "which apps are in queue x" can be asked via the RM REST APIs, through the ClientRMService, and through the command line. In all these cases, the question is answered by scanning through every RMApp and filtering by the app's queue name.
+
+All schedulers maintain a mapping of queues to applications. I think it would make more sense to ask the schedulers which applications are in a given queue. This is what was done in MR1. This would also have the advantage of allowing a parent queue to return all the applications on leaf queues under it, and allow queue name aliases, as in the way that "root.default" and "default" refer to the same queue in the fair scheduler.
+
+
+- YARN-786.
+ Major improvement reported by Sandy Ryza and fixed by Sandy Ryza
+ Expose application resource usage in RM REST API
+ It might be good to require users to explicitly ask for this information, as it's a little more expensive to collect than the other fields in AppInfo.
+- YARN-764.
+ Major bug reported by Nemon Lou and fixed by Nemon Lou (resourcemanager)
+ blank Used Resources on Capacity Scheduler page
+ Even when there are jobs running, used resources is empty on the Capacity Scheduler page for a leaf queue. (I use google-chrome on Windows 7.)
+After changing resource.java's toString method by replacing "<>" with "{}", this bug gets fixed.
+- YARN-709.
+ Major sub-task reported by Jian He and fixed by Jian He (resourcemanager)
+ verify that new jobs submitted with old RM delegation tokens after RM restart are accepted
+ More elaborate test for restoring RM delegation tokens on RM restart.
+New jobs with old RM delegation tokens should be accepted by the new RM as long as the token is still valid.
+- YARN-674.
+ Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Omkar Vinit Joshi (resourcemanager)
+ Slow or failing DelegationToken renewals on submission itself make RM unavailable
+ This was caused by YARN-280. A slow or down NameNode will make it look like the RM is unavailable, as it may run out of RPC handlers due to blocked client submissions.
+- YARN-649.
+ Major sub-task reported by Sandy Ryza and fixed by Sandy Ryza (nodemanager)
+ Make container logs available over HTTP in plain text
+ It would be good to make container logs available over the REST API for MAPREDUCE-4362 and so that they can be accessed programmatically in general.
+- YARN-584.
+ Major bug reported by Sandy Ryza and fixed by Harshit Daga (scheduler)
+ In scheduler web UIs, queues unexpand on refresh
+ In the fair scheduler web UI, you can expand queue information. Refreshing the page causes the expansions to go away, which is annoying for someone who wants to monitor the scheduler page and needs to reopen all the queues they care about each time.
+- YARN-546.
+ Major bug reported by Lohit Vijayarenu and fixed by Sandy Ryza (scheduler)
+ Allow disabling the Fair Scheduler event log
+ Hadoop 1.0 supported an option to turn on/off FairScheduler event logging using mapred.fairscheduler.eventlog.enabled. In Hadoop 2.0, it looks like this option has been removed (or not ported?) which causes event logging to be enabled by default and there is no way to turn it off.
+- YARN-478.
+ Major sub-task reported by Aleksey Gorshkov and fixed by Aleksey Gorshkov
+ fix coverage org.apache.hadoop.yarn.webapp.log
+ fix coverage org.apache.hadoop.yarn.webapp.log
+one patch for trunk, branch-2, branch-0.23
+- YARN-465.
+ Major sub-task reported by Aleksey Gorshkov and fixed by Andrey Klochkov
+ fix coverage org.apache.hadoop.yarn.server.webproxy
+ fix coverage org.apache.hadoop.yarn.server.webproxy
+patch YARN-465-trunk.patch for trunk
+patch YARN-465-branch-2.patch for branch-2
+patch YARN-465-branch-0.23.patch for branch-0.23
+
+There is an issue in branch-0.23: the patch does not create the .keep file.
+To fix it, run these commands:
+
+mkdir yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy
+touch yhadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/proxy/.keep
+
+- YARN-461.
+ Major bug reported by Sandy Ryza and fixed by Wei Yan (resourcemanager)
+ Fair scheduler should not accept apps with empty string queue name
+ When an app is submitted with "" for the queue, the RMAppManager passes it on like it does with any other string.
+
+- YARN-427.
+ Major sub-task reported by Aleksey Gorshkov and fixed by Aleksey Gorshkov
+ Coverage fix for org.apache.hadoop.yarn.server.api.*
+ Coverage fix for org.apache.hadoop.yarn.server.api.*
+
+patch YARN-427-trunk.patch for trunk
+patch YARN-427-branch-2.patch for branch-2 and branch-0.23
+- YARN-425.
+ Major sub-task reported by Aleksey Gorshkov and fixed by Aleksey Gorshkov
+ coverage fix for yarn api
+ coverage fix for yarn api
+patch YARN-425-trunk-a.patch for trunk
+patch YARN-425-branch-2.patch for branch-2
+patch YARN-425-branch-0.23.patch for branch-0.23
+- YARN-408.
+ Minor bug reported by Mayank Bansal and fixed by Mayank Bansal (scheduler)
+ Capacity Scheduler delay scheduling should not be disabled by default
+ Capacity Scheduler delay scheduling should not be disabled by default.
+Enable it by setting it to the number of nodes in one rack.
+
+Thanks,
+Mayank
+- YARN-353.
+ Major sub-task reported by Hitesh Shah and fixed by Karthik Kambatla (resourcemanager)
+ Add Zookeeper-based store implementation for RMStateStore
+ Add a store that writes RM state data to ZK.
+
+- YARN-312.
+ Major sub-task reported by Junping Du and fixed by Junping Du (api)
+ Add updateNodeResource in ResourceManagerAdministrationProtocol
+ Add a fundamental RPC (ResourceManagerAdministrationProtocol) to support changing a node's resources. For design details, please refer to the parent JIRA: YARN-291.
+- YARN-311.
+ Major sub-task reported by Junping Du and fixed by Junping Du (resourcemanager , scheduler)
+ Dynamic node resource configuration: core scheduler changes
+ As a first step, we implement the resource change on the RM side and will expose admin APIs (admin protocol, CLI, REST and JMX API) later. This JIRA contains only the scheduler changes.
+The flow for updating a node's resource and making resource scheduling aware of it is:
+1. The resource update goes through the admin API to the RM and takes effect on RMNodeImpl.
+2. When the next NM status-update heartbeat arrives, the RMNode's resource change is detected and the delta resource is added to the SchedulerNode's availableResource before actual scheduling happens.
+3. The scheduler allocates resources according to the new availableResource in SchedulerNode.
+For more design details, please refer to the proposal and discussions in the parent JIRA: YARN-291.
+- YARN-305.
+ Critical bug reported by Lohit Vijayarenu and fixed by Lohit Vijayarenu (resourcemanager)
+ Fair scheduler logs too many "Node offered to app:..." messages
+ Running YARN with the fair scheduler shows that the RM logs many messages like the one below.
+{noformat}
+INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable: Node offered to app: application_1357147147433_0002 reserved: false
+{noformat}
+
+They don't seem to tell much, and the same line is dumped many times in the RM log. It would be good to improve it with node information or move it to some other logging level with enough debug information.
+- YARN-7.
+ Major sub-task reported by Arun C Murthy and fixed by Junping Du
+ Add support for DistributedShell to ask for CPUs along with memory
+
+- MAPREDUCE-5744.
+ Blocker bug reported by Sangjin Lee and fixed by Gera Shegalov
+ Job hangs because RMContainerAllocator$AssignedRequests.preemptReduce() violates the comparator contract
+
+- MAPREDUCE-5743.
+ Major bug reported by Ted Yu and fixed by Ted Yu
+ TestRMContainerAllocator is failing
+
+- MAPREDUCE-5729.
+ Critical bug reported by Karthik Kambatla and fixed by Karthik Kambatla (mrv2)
+ mapred job -list throws NPE
+
+- MAPREDUCE-5725.
+ Major bug reported by Sandy Ryza and fixed by Sandy Ryza
+ TestNetworkedJob relies on the Capacity Scheduler
+
+- MAPREDUCE-5724.
+ Critical bug reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (jobhistoryserver)
+ JobHistoryServer does not start if HDFS is not running
+
+- MAPREDUCE-5723.
+ Blocker bug reported by Mohammad Kamrul Islam and fixed by Mohammad Kamrul Islam (applicationmaster)
+ MR AM container log can be truncated or empty
+
+- MAPREDUCE-5694.
+ Major bug reported by Mohammad Kamrul Islam and fixed by Mohammad Kamrul Islam
+ MR AM container syslog is empty
+
+- MAPREDUCE-5693.
+ Major bug reported by Gera Shegalov and fixed by Gera Shegalov (mrv2)
+ Restore MRv1 behavior for log flush
+
+- MAPREDUCE-5692.
+ Major improvement reported by Gera Shegalov and fixed by Gera Shegalov (mrv2)
+ Add explicit diagnostics when a task attempt is killed due to speculative execution
+
+- MAPREDUCE-5689.
+ Critical bug reported by Lohit Vijayarenu and fixed by Lohit Vijayarenu
+ MRAppMaster does not preempt reducers when scheduled maps cannot be fulfilled
+
+- MAPREDUCE-5687.
+ Major test reported by Ted Yu and fixed by Jian He
+ TestYARNRunner#testResourceMgrDelegate fails with NPE after YARN-1446
+
+- MAPREDUCE-5685.
+ Blocker bug reported by Yi Song and fixed by Yi Song (client)
+ getCacheFiles() api doesn't work in WrappedReducer.java due to typo
+
+- MAPREDUCE-5679.
+ Major bug reported by Liyin Liang and fixed by Liyin Liang
+ TestJobHistoryParsing has race condition
+
+- MAPREDUCE-5674.
+ Major bug reported by Chuan Liu and fixed by Chuan Liu (client)
+ Missing start and finish time in mapred.JobStatus
+
+- MAPREDUCE-5672.
+ Major improvement reported by Gera Shegalov and fixed by Gera Shegalov (mr-am , mrv2)
+ Provide optional RollingFileAppender for container log4j (syslog)
+
+- MAPREDUCE-5656.
+ Critical bug reported by Jason Lowe and fixed by Jason Lowe
+ bzip2 codec can drop records when reading data in splits
+
+- MAPREDUCE-5650.
+ Major bug reported by Gera Shegalov and fixed by Gera Shegalov (mrv2)
+ Job fails when hprof mapreduce.task.profile.map/reduce.params is specified
+
+- MAPREDUCE-5645.
+ Major bug reported by Jonathan Eagles and fixed by Mit Desai
+ TestFixedLengthInputFormat fails with native libs
+
+- MAPREDUCE-5640.
+ Trivial improvement reported by Jason Lowe and fixed by Jason Lowe (test)
+ Rename TestLineRecordReader in jobclient module
+
+- MAPREDUCE-5632.
+ Major test reported by Ted Yu and fixed by Jonathan Eagles
+ TestRMContainerAllocator#testUpdatedNodes fails
+
+- MAPREDUCE-5631.
+ Major bug reported by Jonathan Eagles and fixed by Jonathan Eagles
+ TestJobEndNotifier.testNotifyRetries fails with Should have taken more than 5 seconds in jdk7
+
+- MAPREDUCE-5625.
+ Major test reported by Jonathan Eagles and fixed by Mariappan Asokan
+ TestFixedLengthInputFormat fails in jdk7 environment
+
+- MAPREDUCE-5623.
+ Major bug reported by Tsuyoshi OZAWA and fixed by Jason Lowe
+ TestJobCleanup fails because of RejectedExecutionException and NPE.
+
+- MAPREDUCE-5616.
+ Major bug reported by Chris Nauroth and fixed by Chris Nauroth (client)
+ MR Client-AppMaster RPC max retries on socket timeout is too high.
+
+- MAPREDUCE-5613.
+ Major bug reported by Gera Shegalov and fixed by Gera Shegalov (applicationmaster)
+ DefaultSpeculator holds and checks hashmap that is always empty
+
+- MAPREDUCE-5610.
+ Major test reported by Jonathan Eagles and fixed by Jonathan Eagles
+ TestSleepJob fails in jdk7
+
+- MAPREDUCE-5604.
+ Minor bug reported by Chris Nauroth and fixed by Chris Nauroth (test)
+ TestMRAMWithNonNormalizedCapabilities fails on Windows due to exceeding max path length
+
+- MAPREDUCE-5601.
+ Major improvement reported by Sandy Ryza and fixed by Sandy Ryza
+ ShuffleHandler fadvises file regions as DONTNEED even when fetch fails
+
+- MAPREDUCE-5598.
+ Major bug reported by Robert Kanter and fixed by Robert Kanter (test)
+ TestUserDefinedCounters.testMapReduceJob is flakey
+
+- MAPREDUCE-5596.
+ Major improvement reported by Sandy Ryza and fixed by Sandy Ryza
+ Allow configuring the number of threads used to serve shuffle connections
+
+- MAPREDUCE-5587.
+ Major bug reported by Jonathan Eagles and fixed by Jonathan Eagles
+ TestTextOutputFormat fails on JDK7
+
+- MAPREDUCE-5586.
+ Major bug reported by Jonathan Eagles and fixed by Jonathan Eagles
+ TestCopyMapper#testCopyFailOnBlockSizeDifference fails when run from hadoop-tools/hadoop-distcp directory
+
+- MAPREDUCE-5585.
+ Major bug reported by Jonathan Eagles and fixed by Jonathan Eagles
+ TestCopyCommitter#testNoCommitAction Fails on JDK7
+
+- MAPREDUCE-5569.
+ Major bug reported by Nathan Roberts and fixed by Nathan Roberts
+ FloatSplitter is not generating correct splits
+
+- MAPREDUCE-5561.
+ Critical bug reported by Cindy Li and fixed by Karthik Kambatla
+ org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl testcase failing on trunk
+
+- MAPREDUCE-5550.
+ Major bug reported by Vrushali C and fixed by Gera Shegalov
+ Task Status message (reporter.setStatus) not shown in UI with Hadoop 2.0
+
+- MAPREDUCE-5546.
+ Major bug reported by Chuan Liu and fixed by Chuan Liu
+ mapred.cmd on Windows sets HADOOP_OPTS incorrectly
+
+- MAPREDUCE-5522.
+ Minor bug reported by Jinghui Wang and fixed by Jinghui Wang (test)
+ Incorrectly expect the array of JobQueueInfo returned by o.a.h.mapred.QueueManager#getJobQueueInfos to have a specific order.
+
+- MAPREDUCE-5518.
+ Trivial bug reported by Albert Chu and fixed by Albert Chu (examples)
+ Fix typo "can't read paritions file"
+
+- MAPREDUCE-5514.
+ Blocker bug reported by Zhijie Shen and fixed by Zhijie Shen
+ TestRMContainerAllocator fails on trunk
+
+- MAPREDUCE-5504.
+ Major bug reported by Thomas Graves and fixed by Kousuke Saruta (client)
+ mapred queue -info inconsistent with types
+
+- MAPREDUCE-5487.
+ Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (performance , task)
+ In task processes, JobConf is unnecessarily loaded again in Limits
+
+- MAPREDUCE-5484.
+ Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (task)
+ YarnChild unnecessarily loads job conf twice
+
+- MAPREDUCE-5481.
+ Blocker bug reported by Jason Lowe and fixed by Sandy Ryza (mrv2 , test)
+ Enable uber jobs to have multiple reducers
+
+- MAPREDUCE-5464.
+ Major task reported by Sandy Ryza and fixed by Sandy Ryza
+ Add analogs of the SLOTS_MILLIS counters that jive with the YARN resource model
+
+- MAPREDUCE-5463.
+ Major task reported by Sandy Ryza and fixed by Tsuyoshi OZAWA
+ Deprecate SLOTS_MILLIS counters
+
+- MAPREDUCE-5457.
+ Major improvement reported by Sandy Ryza and fixed by Sandy Ryza
+ Add a KeyOnlyTextOutputReader to enable streaming to write out text files without separators
+
+- MAPREDUCE-5451.
+ Major bug reported by Mostafa Elhemali and fixed by Yingda Chen
+ MR uses LD_LIBRARY_PATH which doesn't mean anything in Windows
+
+- MAPREDUCE-5431.
+ Major bug reported by Timothy St. Clair and fixed by Timothy St. Clair (build)
+ Missing pom dependency in MR-client
+
+- MAPREDUCE-5411.
+ Major sub-task reported by Ashwin Shankar and fixed by Ashwin Shankar (jobhistoryserver)
+ Refresh size of loaded job cache on history server
+
+- MAPREDUCE-5409.
+ Major sub-task reported by Devaraj K and fixed by Gera Shegalov
+ MRAppMaster throws InvalidStateTransitonException: Invalid event: TA_TOO_MANY_FETCH_FAILURE at KILLED for TaskAttemptImpl
+
+- MAPREDUCE-5404.
+ Major bug reported by Ted Yu and fixed by Ted Yu (jobhistoryserver)
+ HSAdminServer does not use ephemeral ports in minicluster mode
+
+- MAPREDUCE-5386.
+ Major sub-task reported by Ashwin Shankar and fixed by Ashwin Shankar (jobhistoryserver)
+ Ability to refresh history server job retention and job cleaner settings
+
+- MAPREDUCE-5380.
+ Major bug reported by Stephen Chu and fixed by Stephen Chu
+ Invalid mapred command should return non-zero exit code
+
+- MAPREDUCE-5373.
+ Major bug reported by Chuan Liu and fixed by Jonathan Eagles
+ TestFetchFailure.testFetchFailureMultipleReduces could fail intermittently
+
+- MAPREDUCE-5356.
+ Major sub-task reported by Ashwin Shankar and fixed by Ashwin Shankar (jobhistoryserver)
+ Ability to refresh aggregated log retention period and check interval
+
+- MAPREDUCE-5332.
+ Major new feature reported by Jason Lowe and fixed by Jason Lowe (jobhistoryserver)
+ Support token-preserving restart of history server
+
+- MAPREDUCE-5329.
+ Major bug reported by Avner BenHanoch and fixed by Avner BenHanoch (mr-am)
+ APPLICATION_INIT is never sent to AuxServices other than the builtin ShuffleHandler
+
+- MAPREDUCE-5316.
+ Major bug reported by Ashwin Shankar and fixed by Ashwin Shankar (client)
+ job -list-attempt-ids command does not handle illegal task-state
+
+- MAPREDUCE-5266.
+ Major new feature reported by Jason Lowe and fixed by Ashwin Shankar (jobhistoryserver)
+ Ability to refresh retention settings on history server
+
+- MAPREDUCE-5265.
+ Major new feature reported by Jason Lowe and fixed by Ashwin Shankar (jobhistoryserver)
+ History server admin service to refresh user and superuser group mappings
+
+- MAPREDUCE-5186.
+ Critical bug reported by Sangjin Lee and fixed by Robert Parker (job submission)
+ mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail
+
+- MAPREDUCE-5102.
+ Major test reported by Aleksey Gorshkov and fixed by Andrey Klochkov
+ fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db
+
+- MAPREDUCE-5084.
+ Major test reported by Aleksey Gorshkov and fixed by Aleksey Gorshkov
+ fix coverage org.apache.hadoop.mapreduce.v2.app.webapp and org.apache.hadoop.mapreduce.v2.hs.webapp
+
+- MAPREDUCE-5052.
+ Critical bug reported by Kendall Thrapp and fixed by Chen He (jobhistoryserver , webapps)
+ Job History UI and web services confusing job start time and job submit time
+
+- MAPREDUCE-5020.
+ Major bug reported by Trevor Robinson and fixed by Trevor Robinson (client)
+ Compile failure with JDK8
+
+- MAPREDUCE-4680.
+ Major bug reported by Sandy Ryza and fixed by Robert Kanter (jobhistoryserver)
+ Job history cleaner should only check timestamps of files in old enough directories
+
+- MAPREDUCE-4421.
+ Major improvement reported by Arun C Murthy and fixed by Jason Lowe
+ Run MapReduce framework via the distributed cache
+
+- MAPREDUCE-3310.
+ Major improvement reported by Mathias Herberts and fixed by Alejandro Abdelnur (client)
+ Custom grouping comparator cannot be set for Combiners
+
+- MAPREDUCE-1176.
+ Major new feature reported by BitsOfInfo and fixed by Mariappan Asokan
+ FixedLengthInputFormat and FixedLengthRecordReader
+ Addition of FixedLengthInputFormat and FixedLengthRecordReader in the org.apache.hadoop.mapreduce.lib.input package. These two classes can be used when you need to read data from files containing fixed length (fixed width) records. Such files have no CR/LF (or any combination thereof), no delimiters, etc., but each record is a fixed length, and extra data is padded with spaces. The data is one gigantic line within a file. When creating a job that specifies this input format, the job must have the "mapreduce.input.fixedlengthinputformat.record.length" property set as follows: myJobConf.setInt("mapreduce.input.fixedlengthinputformat.record.length",[myFixedRecordLength]);
+
+Please see javadoc for more details.
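The record layout described for MAPREDUCE-1176 can be illustrated with a minimal plain-Java sketch. It does not use the Hadoop classes: the class name `FixedLengthRecords`, the method `split`, and the choice to reject a trailing partial record are all assumptions made for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Hadoop.
final class FixedLengthRecords {
    private FixedLengthRecords() {}

    /** Splits data into consecutive records of exactly recordLength bytes. */
    static List<String> split(byte[] data, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be positive");
        }
        if (data.length % recordLength != 0) {
            // Assumption for this sketch: a trailing partial record is an error.
            throw new IllegalArgumentException("data length is not a multiple of recordLength");
        }
        List<String> records = new ArrayList<>();
        // No delimiters are consulted: record boundaries come only from the offset.
        for (int offset = 0; offset < data.length; offset += recordLength) {
            records.add(new String(data, offset, recordLength, StandardCharsets.US_ASCII));
        }
        return records;
    }

    public static void main(String[] args) {
        // Two 5-byte records; the first is padded with spaces.
        byte[] data = "ab   cdefg".getBytes(StandardCharsets.US_ASCII);
        for (String record : split(data, 5)) {
            System.out.println("[" + record + "]");
        }
    }
}
```

In a real job the record length would come from the "mapreduce.input.fixedlengthinputformat.record.length" property rather than a method argument.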
+- MAPREDUCE-434.
+ Minor improvement reported by Yoram Arnon and fixed by Aaron Kimball
+ LocalJobRunner limited to single reducer
+
+- HDFS-5921.
+ Critical bug reported by Aaron T. Myers and fixed by Aaron T. Myers (namenode)
+ Cannot browse file system via NN web UI if any directory has the sticky bit set
+
+- HDFS-5876.
+ Major bug reported by Haohui Mai and fixed by Haohui Mai (datanode)
+ SecureDataNodeStarter does not pick up configuration in hdfs-site.xml
+
+- HDFS-5873.
+ Major bug reported by Yesha Vora and fixed by Haohui Mai
+ dfs.http.policy should have higher precedence over dfs.https.enable
+
+- HDFS-5845.
+ Blocker bug reported by Andrew Wang and fixed by Andrew Wang (namenode)
+ SecondaryNameNode dies when checkpointing with cache pools
+
+- HDFS-5844.
+ Minor bug reported by Akira AJISAKA and fixed by Akira AJISAKA (documentation)
+ Fix broken link in WebHDFS.apt.vm
+
+- HDFS-5842.
+ Major bug reported by Arpit Gupta and fixed by Jing Zhao (security)
+ Cannot create hftp filesystem when using a proxy user ugi and a doAs on a secure cluster
+
+- HDFS-5841.
+ Major improvement reported by Andrew Wang and fixed by Andrew Wang
+ Update HDFS caching documentation with new changes
+
+- HDFS-5837.
+ Major bug reported by Bryan Beaudreault and fixed by Tao Luo (namenode)
+ dfs.namenode.replication.considerLoad does not consider decommissioned nodes
+
+- HDFS-5833.
+ Trivial improvement reported by Bangtao Zhou and fixed by (namenode)
+ SecondaryNameNode has an incorrect javadoc
+
+- HDFS-5830.
+ Blocker bug reported by Yongjun Zhang and fixed by Yongjun Zhang (caching , hdfs-client)
+ WebHdfsFileSystem.getFileBlockLocations throws IllegalArgumentException when accessing another cluster.
+
+- HDFS-5825.
+ Minor improvement reported by Haohui Mai and fixed by Haohui Mai
+ Use FileUtils.copyFile() to implement DFSTestUtils.copyFile()
+
+- HDFS-5806.
+ Major bug reported by Nathan Roberts and fixed by Nathan Roberts (balancer)
+ balancer should set SoTimeout to avoid indefinite hangs
+
+- HDFS-5800.
+ Trivial bug reported by Kousuke Saruta and fixed by Kousuke Saruta (hdfs-client)
+ Typo: soft-limit for hard-limit in DFSClient
+
+- HDFS-5789.
+ Major bug reported by Uma Maheswara Rao G and fixed by Uma Maheswara Rao G (namenode)
+ Some of snapshot APIs missing checkOperation double check in fsn
+
+- HDFS-5788.
+ Major improvement reported by Nathan Roberts and fixed by Nathan Roberts (namenode)
+ listLocatedStatus response can be very large
+
+- HDFS-5784.
+ Major sub-task reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe (namenode)
+ reserve space in edit log header and fsimage header for feature flag section
+
+- HDFS-5777.
+ Major bug reported by Jing Zhao and fixed by Jing Zhao (namenode)
+ Update LayoutVersion for the new editlog op OP_ADD_BLOCK
+
+- HDFS-5766.
+ Major bug reported by Liang Xie and fixed by Liang Xie (hdfs-client)
+ In DFSInputStream, do not add datanode to deadNodes after InvalidEncryptionKeyException in fetchBlockByteRange
+
+- HDFS-5762.
+ Major bug reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe
+ BlockReaderLocal doesn't return -1 on EOF when doing zero-length reads
+
+- HDFS-5756.
+ Major bug reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe (libhdfs)
+ hadoopRzOptionsSetByteBufferPool does not accept NULL argument, contrary to docs
+
+- HDFS-5748.
+ Major improvement reported by Kihwal Lee and fixed by Haohui Mai
+ Too much information shown in the dfs health page.
+
+- HDFS-5747.
+ Minor bug reported by Tsz Wo (Nicholas), SZE and fixed by Arpit Agarwal (namenode)
+ BlocksMap.getStoredBlock(..) and BlockInfoUnderConstruction.addReplicaIfNotPresent(..) may throw NullPointerException
+
+- HDFS-5728.
+ Critical bug reported by Vinayakumar B and fixed by Vinayakumar B (datanode)
+ [Diskfull] Block recovery will fail if the metafile does not have crc for all chunks of the block
+
+- HDFS-5721.
+ Minor improvement reported by Ted Yu and fixed by Ted Yu
+ sharedEditsImage in Namenode#initializeSharedEdits() should be closed before method returns
+
+- HDFS-5719.
+ Minor bug reported by Ted Yu and fixed by Ted Yu (namenode)
+ FSImage#doRollback() should close prevState before return
+
+- HDFS-5710.
+ Major bug reported by Ted Yu and fixed by Uma Maheswara Rao G
+ FSDirectory#getFullPathName should check inodes against null
+
+- HDFS-5704.
+ Major bug reported by Suresh Srinivas and fixed by Jing Zhao (namenode)
+ Change OP_UPDATE_BLOCKS with a new OP_ADD_BLOCK
+ Add a new editlog record (OP_ADD_BLOCK) that only records allocation of the new block instead of the entire block list, on every block allocation.
+- HDFS-5703.
+ Major new feature reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (webhdfs)
+ Add support for HTTPS and swebhdfs to HttpFS
+
+- HDFS-5695.
+ Major improvement reported by Haohui Mai and fixed by Haohui Mai (test)
+ Clean up TestOfflineEditsViewer and OfflineEditsViewerHelper
+
+- HDFS-5691.
+ Minor bug reported by Akira AJISAKA and fixed by Akira AJISAKA (documentation)
+ Fix typo in ShortCircuitLocalRead document
+
+- HDFS-5690.
+ Blocker bug reported by Haohui Mai and fixed by Haohui Mai
+ DataNode fails to start in secure mode when dfs.http.policy is set to HTTP_ONLY
+
+- HDFS-5681.
+ Major bug reported by Daryn Sharp and fixed by Daryn Sharp (namenode)
+ renewLease should not hold fsn write lock
+
+- HDFS-5677.
+ Minor improvement reported by Vincent Sheffer and fixed by Vincent Sheffer (datanode , ha)
+ Need error checking for HA cluster configuration
+
+- HDFS-5676.
+ Minor improvement reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe (hdfs-client)
+ fix inconsistent synchronization of CachingStrategy
+
+- HDFS-5675.
+ Minor bug reported by Plamen Jeliazkov and fixed by Plamen Jeliazkov (benchmarks)
+ Add Mkdirs operation to NNThroughputBenchmark
+
+- HDFS-5674.
+ Minor improvement reported by Tsz Wo (Nicholas), SZE and fixed by Tsz Wo (Nicholas), SZE (namenode)
+ Editlog code cleanup
+
+- HDFS-5671.
+ Critical bug reported by JamesLi and fixed by JamesLi (hdfs-client)
+ Fix socket leak in DFSInputStream#getBlockReader
+
+- HDFS-5667.
+ Major sub-task reported by Eric Sirianni and fixed by Arpit Agarwal (datanode)
+ Include DatanodeStorage in StorageReport
+
+- HDFS-5666.
+ Minor bug reported by Colin Patrick McCabe and fixed by Jimmy Xiang (namenode)
+ Fix inconsistent synchronization in BPOfferService
+
+- HDFS-5663.
+ Major improvement reported by Liang Xie and fixed by Liang Xie (hdfs-client)
+ make the retry time and interval value configurable in openInfo()
+ Makes the number of retries, and the interval between retries, for getting the length of the last block of a file configurable. The new configuration properties are:
+
+dfs.client.retry.times.get-last-block-length
+dfs.client.retry.interval-ms.get-last-block-length
+
+They default to 3 and 4000 respectively, which match the previously hardcoded values.
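The interaction of the two HDFS-5663 settings can be sketched as a bounded retry loop. This is a plain-Java illustration, not the DFSClient code: the class `LastBlockLengthRetry`, the method `fetchWithRetry`, and the convention that the fetch returns -1 while the length is still unknown are assumptions of the sketch.

```java
import java.util.function.LongSupplier;

// Hypothetical illustration class; not the DFSClient implementation.
final class LastBlockLengthRetry {
    private LastBlockLengthRetry() {}

    /**
     * Calls fetch up to `retries` times, sleeping intervalMs between failed attempts.
     * Assumption for this sketch: fetch returns -1 while the length is unknown.
     */
    static long fetchWithRetry(LongSupplier fetch, int retries, long intervalMs) {
        for (int attempt = 1; attempt <= retries; attempt++) {
            long length = fetch.getAsLong();
            if (length != -1) {
                return length;
            }
            if (attempt < retries) {
                try {
                    Thread.sleep(intervalMs);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and give up.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", e);
                }
            }
        }
        throw new IllegalStateException("could not obtain the last block length");
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails twice, then succeeds; a short interval keeps the demo fast.
        long length = fetchWithRetry(() -> ++calls[0] < 3 ? -1 : 1024L, 3, 10L);
        System.out.println(length + " after " + calls[0] + " attempts");
    }
}
```

With the documented defaults, the arguments would correspond to 3 retries and a 4000 ms interval.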
+
+
+- HDFS-5662.
+ Major improvement reported by Brandon Li and fixed by Brandon Li (namenode)
+ Can't decommission a DataNode due to file's replication factor larger than the rest of the cluster size
+
+- HDFS-5661.
+ Major bug reported by Benoy Antony and fixed by Benoy Antony
+ Browsing FileSystem via web ui, should use datanode's fqdn instead of ip address
+
+- HDFS-5657.
+ Major bug reported by Brandon Li and fixed by Brandon Li (nfs)
+ race condition causes writeback state error in NFS gateway
+
+- HDFS-5652.
+ Minor improvement reported by Liang Xie and fixed by Liang Xie (hdfs-client)
+ refactoring/uniforming invalid block token exception handling in DFSInputStream
+
+- HDFS-5649.
+ Major bug reported by Brandon Li and fixed by Brandon Li (nfs)
+ Unregister NFS and Mount service when NFS gateway is shutting down
+
+- HDFS-5637.
+ Major improvement reported by Liang Xie and fixed by Liang Xie (hdfs-client , security)
+ try to refetchToken when InvalidToken occurs during local read
+
+- HDFS-5634.
+ Major sub-task reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe (hdfs-client)
+ allow BlockReaderLocal to switch between checksumming and not
+
+- HDFS-5633.
+ Minor improvement reported by Jing Zhao and fixed by Jing Zhao
+ Improve OfflineImageViewer to use less memory
+
+- HDFS-5629.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Support HTTPS in JournalNode and SecondaryNameNode
+
+- HDFS-5592.
+ Major bug reported by Vinayakumar B and fixed by Vinayakumar B
+ "DIR* completeFile: /file is closed by DFSClient_" should be logged only for successful closure of the file.
+
+- HDFS-5590.
+ Major bug reported by Jing Zhao and fixed by Jing Zhao
+ Block ID and generation stamp may be reused when persistBlocks is set to false
+
+- HDFS-5587.
+ Minor improvement reported by Brandon Li and fixed by Brandon Li (nfs)
+ add debug information when NFS fails to start with duplicate user or group names
+
+- HDFS-5582.
+ Minor bug reported by Henry Hung and fixed by sathish
+ hdfs getconf -excludeFile or -includeFile always failed
+
+- HDFS-5581.
+ Major bug reported by Vinayakumar B and fixed by Vinayakumar B (namenode)
+ NameNodeFsck should use only one instance of BlockPlacementPolicy
+
+- HDFS-5580.
+ Major bug reported by Binglin Chang and fixed by Binglin Chang
+ Infinite loop in Balancer.waitForMoveCompletion
+
+- HDFS-5579.
+ Major bug reported by zhaoyunjiong and fixed by zhaoyunjiong (namenode)
+ Under-construction files make DataNode decommission take many hours
+
+- HDFS-5577.
+ Trivial improvement reported by Brandon Li and fixed by Brandon Li (documentation)
+ NFS user guide update
+
+- HDFS-5568.
+ Major improvement reported by Vinayakumar B and fixed by Vinayakumar B (snapshots)
+ Support inclusion of snapshot paths in Namenode fsck
+
+- HDFS-5563.
+ Major improvement reported by Brandon Li and fixed by Brandon Li (nfs)
+ NFS gateway should commit the buffered data when read request comes after write to the same file
+
+- HDFS-5561.
+ Minor improvement reported by Fengdong Yu and fixed by Haohui Mai (namenode)
+ FSNameSystem#getNameJournalStatus() in JMX should return plain text instead of HTML
+
+- HDFS-5560.
+ Major bug reported by Josh Elser and fixed by Josh Elser
+ Trash configuration log statements print incorrect units
+
+- HDFS-5558.
+ Major bug reported by Kihwal Lee and fixed by Kihwal Lee
+ LeaseManager monitor thread can crash if the last block is complete but another block is not.
+
+- HDFS-5557.
+ Critical bug reported by Kihwal Lee and fixed by Kihwal Lee
+ Write pipeline recovery for the last packet in the block may cause rejection of valid replicas
+
+- HDFS-5552.
+ Major bug reported by Shinichi Yamashita and fixed by Haohui Mai (namenode)
+ Fix wrong information of "Cluster summary" in dfshealth.html
+
+- HDFS-5548.
+ Major improvement reported by Haohui Mai and fixed by Haohui Mai (nfs)
+ Use ConcurrentHashMap in portmap
+
+- HDFS-5545.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Allow specifying endpoints for listeners in HttpServer
+
+- HDFS-5544.
+ Minor bug reported by sathish and fixed by sathish (hdfs-client)
+ Adding Test case For Checking dfs.checksum type as NULL value
+
+- HDFS-5540.
+ Minor bug reported by Binglin Chang and fixed by Binglin Chang
+ Fix intermittent failure in TestBlocksWithNotEnoughRacks
+
+- HDFS-5538.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ URLConnectionFactory should pick up the SSL related configuration by default
+
+- HDFS-5536.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Implement HTTP policy for Namenode and DataNode
+ Adds a new HTTP policy configuration. Users can set "dfs.http.policy" to control the HTTP endpoints for NameNode and DataNode. The following values are supported:
+- HTTP_ONLY : Service is provided only on http
+- HTTPS_ONLY : Service is provided only on https
+- HTTP_AND_HTTPS : Service is provided both on http and https
+
+hadoop.ssl.enabled and dfs.https.enable are deprecated. When the deprecated configuration properties are still set, the HTTP policy is currently decided by the following rules:
+1. If dfs.http.policy is set to HTTPS_ONLY or HTTP_AND_HTTPS, it picks the specified policy; otherwise it proceeds to rules 2~4.
+2. It picks HTTPS_ONLY if hadoop.ssl.enabled is true.
+3. It picks HTTP_AND_HTTPS if dfs.https.enable is true.
+4. It picks HTTP_ONLY for other configurations.
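The four rules above can be sketched as a single resolution function. This is a plain-Java illustration under stated assumptions, not the Hadoop implementation: the class `HttpPolicySketch`, its `Policy` enum, and the use of a `Map` in place of a Hadoop configuration object are all hypothetical.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical illustration class; not the Hadoop implementation.
final class HttpPolicySketch {
    enum Policy { HTTP_ONLY, HTTPS_ONLY, HTTP_AND_HTTPS }

    private HttpPolicySketch() {}

    /** Resolves the effective policy from raw property values (a Map stands in for the config). */
    static Policy resolve(Map<String, String> conf) {
        String policy = conf.get("dfs.http.policy");
        // Rule 1: an explicit HTTPS_ONLY or HTTP_AND_HTTPS setting wins.
        if ("HTTPS_ONLY".equals(policy)) {
            return Policy.HTTPS_ONLY;
        }
        if ("HTTP_AND_HTTPS".equals(policy)) {
            return Policy.HTTP_AND_HTTPS;
        }
        // Rule 2: deprecated hadoop.ssl.enabled selects HTTPS_ONLY.
        if (Boolean.parseBoolean(conf.get("hadoop.ssl.enabled"))) {
            return Policy.HTTPS_ONLY;
        }
        // Rule 3: deprecated dfs.https.enable selects HTTP_AND_HTTPS.
        if (Boolean.parseBoolean(conf.get("dfs.https.enable"))) {
            return Policy.HTTP_AND_HTTPS;
        }
        // Rule 4: everything else falls back to HTTP_ONLY.
        return Policy.HTTP_ONLY;
    }

    public static void main(String[] args) {
        System.out.println(resolve(Collections.singletonMap("dfs.https.enable", "true")));
    }
}
```

Note that under rule 1 an explicit HTTP_ONLY value does not short-circuit: the deprecated properties are still consulted, exactly as the rules are worded.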
+- HDFS-5533.
+ Minor bug reported by Binglin Chang and fixed by Binglin Chang (snapshots)
+ Symlink delete/create should be treated as DELETE/CREATE in snapshot diff report
+
+- HDFS-5532.
+ Major improvement reported by Vinayakumar B and fixed by Vinayakumar B (webhdfs)
+ Enable the webhdfs by default to support new HDFS web UI
+
+- HDFS-5526.
+ Blocker bug reported by Tsz Wo (Nicholas), SZE and fixed by Kihwal Lee (datanode)
+ Datanode cannot roll back to previous layout version
+
+- HDFS-5525.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Inline dust templates
+
+- HDFS-5519.
+ Minor sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ COMMIT handler should update the commit status after sync
+
+- HDFS-5514.
+ Major sub-task reported by Daryn Sharp and fixed by Daryn Sharp (namenode)
+ FSNamesystem's fsLock should allow custom implementation
+
+- HDFS-5506.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Use URLConnectionFactory in DelegationTokenFetcher
+
+- HDFS-5504.
+ Major bug reported by Vinayakumar B and fixed by Vinayakumar B (snapshots)
+ In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.
+
+- HDFS-5502.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Fix HTTPS support in HsftpFileSystem
+ Fixes HTTPS support in HsftpFileSystem. With this change, the client verifies the server certificate. In particular, the client verifies the Common Name of the certificate using the strategy specified by the configuration property "hadoop.ssl.hostname.verifier".
+- HDFS-5495.
+ Major improvement reported by Andrew Wang and fixed by Jarek Jarcec Cecho
+ Remove further JUnit3 usages from HDFS
+
+- HDFS-5489.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Use TokenAspect in WebHDFSFileSystem
+
+- HDFS-5488.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Clean up TestHftpURLTimeout
+
+- HDFS-5487.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Introduce unit test for TokenAspect
+
+- HDFS-5476.
+ Major bug reported by Jing Zhao and fixed by Jing Zhao
+ Snapshot: clean the blocks/files/directories under a renamed file/directory while deletion
+
+- HDFS-5474.
+ Major bug reported by Uma Maheswara Rao G and fixed by sathish (snapshots)
+ deleteSnapshot can leave the NameNode in safemode on NN restart.
+
+- HDFS-5469.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ Add configuration property for the sub-directory export path
+
+- HDFS-5467.
+ Trivial improvement reported by Andrew Wang and fixed by Shinichi Yamashita
+ Remove tab characters in hdfs-default.xml
+
+- HDFS-5458.
+ Major bug reported by Andrew Wang and fixed by Mike Mellenthin (datanode)
+ Datanode failed volume threshold ignored if exception is thrown in getDataDirsFromURIs
+
+- HDFS-5456.
+ Critical bug reported by Chris Nauroth and fixed by Chris Nauroth (namenode)
+ NameNode startup progress creates new steps if caller attempts to create a counter for a step that doesn't already exist.
+
+- HDFS-5454.
+ Minor sub-task reported by Eric Sirianni and fixed by Arpit Agarwal (datanode)
+ DataNode UUID should be assigned prior to FsDataset initialization
+
+- HDFS-5449.
+ Blocker bug reported by Kihwal Lee and fixed by Kihwal Lee
+ WebHdfs compatibility broken between 2.2 and 1.x / 23.x
+
+- HDFS-5444.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Choose default web UI based on browser capabilities
+
+- HDFS-5443.
+ Major bug reported by Uma Maheswara Rao G and fixed by Jing Zhao (snapshots)
+ Delete 0-sized block when deleting an under-construction file that is included in snapshot
+
+- HDFS-5440.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Extract the logic of handling delegation tokens in HftpFileSystem to the TokenAspect class
+
+- HDFS-5438.
+ Critical bug reported by Kihwal Lee and fixed by Kihwal Lee (namenode)
+ Flaws in block report processing can cause data loss
+
+- HDFS-5436.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Move HsFtpFileSystem and HFtpFileSystem into org.apache.hdfs.web
+
+- HDFS-5434.
+ Minor bug reported by Buddy and fixed by (namenode)
+ Write resiliency for replica count 1
+
+- HDFS-5433.
+ Critical bug reported by Aaron T. Myers and fixed by Aaron T. Myers (snapshots)
+ When reloading fsimage during checkpointing, we should clear existing snapshottable directories
+
+- HDFS-5432.
+ Trivial bug reported by Chris Nauroth and fixed by Chris Nauroth (datanode , test)
+ TestDatanodeJsp fails on Windows due to assumption that loopback address resolves to host name localhost.
+
+- HDFS-5428.
+ Major bug reported by Vinayakumar B and fixed by Jing Zhao (snapshots)
+ under construction files deletion after snapshot+checkpoint+nn restart leads to nn safemode
+
+- HDFS-5427.
+ Major bug reported by Vinayakumar B and fixed by Vinayakumar B (snapshots)
+ not able to read deleted files from snapshot directly under snapshottable dir after checkpoint and NN restart
+
+- HDFS-5425.
+ Major bug reported by sathish and fixed by Jing Zhao (namenode , snapshots)
+ Renaming an under-construction file with snapshots can cause NN failure on restart
+
+- HDFS-5413.
+ Major bug reported by Chris Nauroth and fixed by Chris Nauroth (scripts)
+ hdfs.cmd does not support passthrough to any arbitrary class.
+
+- HDFS-5407.
+ Trivial bug reported by Haohui Mai and fixed by Haohui Mai
+ Fix typos in DFSClientCache
+
+- HDFS-5406.
+ Major sub-task reported by Arpit Agarwal and fixed by Arpit Agarwal (datanode)
+ Send incremental block reports for all storages in a single call
+
+- HDFS-5403.
+ Major bug reported by Aaron T. Myers and fixed by Aaron T. Myers (webhdfs)
+ WebHdfs client cannot communicate with older WebHdfs servers post HDFS-5306
+
+- HDFS-5400.
+ Major bug reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe
+ DFS_CLIENT_MMAP_CACHE_THREAD_RUNS_PER_TIMEOUT constant is set to the wrong value
+
+- HDFS-5399.
+ Major improvement reported by Jing Zhao and fixed by Jing Zhao
+ Revisit SafeModeException and corresponding retry policies
+
+- HDFS-5393.
+ Minor sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Serve bootstrap and jQuery locally
+
+- HDFS-5382.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Implement the UI of browsing filesystems in HTML 5 page
+
+- HDFS-5379.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Update links to datanode information in dfshealth.html
+
+- HDFS-5375.
+ Minor bug reported by Chris Nauroth and fixed by Chris Nauroth (tools)
+ hdfs.cmd does not expose several snapshot commands.
+
+- HDFS-5374.
+ Trivial bug reported by Suresh Srinivas and fixed by Suresh Srinivas
+ Remove deadcode in DFSOutputStream
+
+- HDFS-5372.
+ Major bug reported by Tsz Wo (Nicholas), SZE and fixed by Vinayakumar B (namenode)
+ In FSNamesystem, hasReadLock() returns false if the current thread holds the write lock
+
+- HDFS-5371.
+ Minor improvement reported by Jing Zhao and fixed by Jing Zhao (ha , test)
+ Let client retry the same NN when "dfs.client.test.drop.namenode.response.number" is enabled
+
+- HDFS-5370.
+ Trivial bug reported by Kousuke Saruta and fixed by Kousuke Saruta (hdfs-client)
+ Typo in Error Message: difference between range in condition and range in error message
+
+- HDFS-5365.
+ Major bug reported by Radim Kolar and fixed by Radim Kolar (build , libhdfs)
+ Fix libhdfs compile error on FreeBSD9
+
+- HDFS-5364.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ Add OpenFileCtx cache
+
+- HDFS-5363.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Refactor WebHdfsFileSystem: move SPNEGO-authenticated connection creation to URLConnectionFactory
+
+- HDFS-5360.
+ Minor improvement reported by Shinichi Yamashita and fixed by Shinichi Yamashita (snapshots)
+ Improvement of usage message of renameSnapshot and deleteSnapshot
+
+- HDFS-5353.
+ Blocker bug reported by Haohui Mai and fixed by Colin Patrick McCabe
+ Short circuit reads fail when dfs.encrypt.data.transfer is enabled
+
+- HDFS-5352.
+ Minor bug reported by Ted Yu and fixed by Ted Yu
+ Server#initLog() doesn't close InputStream in httpfs
+
+- HDFS-5350.
+ Minor improvement reported by Rob Weltman and fixed by Jimmy Xiang (namenode)
+ Name Node should report fsimage transfer time as a metric
+
+- HDFS-5347.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (documentation)
+ add HDFS NFS user guide
+
+- HDFS-5346.
+ Major bug reported by Kihwal Lee and fixed by Ravi Prakash (namenode , performance)
+ Avoid unnecessary call to getNumLiveDataNodes() for each block during IBR processing
+
+- HDFS-5344.
+ Minor improvement reported by sathish and fixed by sathish (snapshots , tools)
+ Make LsSnapshottableDir as Tool interface implementation
+
+- HDFS-5343.
+ Major bug reported by sathish and fixed by sathish (hdfs-client)
+ When cat command is issued on snapshot files getting unexpected result
+
+- HDFS-5342.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Provide more information in the FSNamesystem JMX interfaces
+
+- HDFS-5341.
+ Major bug reported by qus-jiawei and fixed by qus-jiawei (datanode)
+ Reduce fsdataset lock duration during directory scanning.
+
+- HDFS-5338.
+ Major improvement reported by Tsz Wo (Nicholas), SZE and fixed by Tsz Wo (Nicholas), SZE (namenode)
+ Add a conf to disable hostname check in DN registration
+
+- HDFS-5337.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ should do hsync for a commit request even if there are no pending writes
+
+- HDFS-5336.
+ Minor bug reported by Akira AJISAKA and fixed by Akira AJISAKA (namenode)
+ DataNode should not output 'StartupProgress' metrics
+
+- HDFS-5335.
+ Major bug reported by Arpit Gupta and fixed by Haohui Mai
+ DFSOutputStream#close() keeps throwing exceptions when it is called multiple times
+
+- HDFS-5334.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Implement dfshealth.jsp in HTML pages
+
+- HDFS-5331.
+ Major improvement reported by Vinayakumar B and fixed by Vinayakumar B (snapshots)
+ Make SnapshotDiff.java an o.a.h.util.Tool interface implementation
+
+- HDFS-5330.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ fix readdir and readdirplus for large directories
+
+- HDFS-5329.
+ Major bug reported by Brandon Li and fixed by Brandon Li (namenode , nfs)
+ Update FSNamesystem#getListing() to handle inode path in startAfter token
+
+- HDFS-5325.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Remove WebHdfsFileSystem#ConnRunner
+
+- HDFS-5323.
+ Minor improvement reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe (namenode)
+ Remove some deadcode in BlockManager
+
+- HDFS-5322.
+ Major bug reported by Arpit Gupta and fixed by Jing Zhao (ha)
+ HDFS delegation token not found in cache errors seen on secure HA clusters
+
+- HDFS-5317.
+ Critical sub-task reported by Suresh Srinivas and fixed by Haohui Mai
+ Go back to DFS Home link does not work on datanode webUI
+
+- HDFS-5316.
+ Critical sub-task reported by Suresh Srinivas and fixed by Haohui Mai
+ Namenode ignores the default https port
+
+- HDFS-5312.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Generate HTTP / HTTPS URL in DFSUtil#getInfoServer() based on the configured http policy
+
+- HDFS-5307.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai
+ Support both HTTP and HTTPS in jsp pages
+
+- HDFS-5305.
+ Major bug reported by Suresh Srinivas and fixed by Suresh Srinivas
+ Add https support in HDFS
+
+- HDFS-5297.
+ Major bug reported by Akira AJISAKA and fixed by Akira AJISAKA (documentation)
+ Fix dead links in HDFS site documents
+
+- HDFS-5291.
+ Critical bug reported by Arpit Gupta and fixed by Jing Zhao (ha)
+ Clients need to retry when Active NN is in SafeMode
+
+- HDFS-5288.
+ Major sub-task reported by Haohui Mai and fixed by Haohui Mai (nfs)
+ Close idle connections in portmap
+
+- HDFS-5283.
+ Critical bug reported by Vinayakumar B and fixed by Vinayakumar B (snapshots)
+ NN not coming out of startup safemode due to under construction blocks only inside snapshots also counted in safemode threshold
+
+- HDFS-5281.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ COMMIT request should not block
+
+- HDFS-5276.
+ Major bug reported by Chengxiang Li and fixed by Colin Patrick McCabe
+ FileSystem.Statistics has a performance issue on multi-threaded read/write.
+
+- HDFS-5267.
+ Minor improvement reported by Junping Du and fixed by Junping Du
+ Remove volatile from LightWeightHashSet
+
+- HDFS-5260.
+ Major new feature reported by Chris Nauroth and fixed by Chris Nauroth (hdfs-client , libhdfs)
+ Merge zero-copy memory-mapped HDFS client reads to trunk and branch-2.
+
+- HDFS-5257.
+ Major bug reported by Vinayakumar B and fixed by Vinayakumar B (hdfs-client , namenode)
+ addBlock() retry should return LocatedBlock with locations else client will get AIOBE
+
+- HDFS-5252.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ Stable write is not handled correctly in some places
+
+- HDFS-5240.
+ Major sub-task reported by Daryn Sharp and fixed by Daryn Sharp (namenode)
+ Separate formatting from logging in the audit logger API
+
+- HDFS-5239.
+ Major sub-task reported by Daryn Sharp and fixed by Daryn Sharp (namenode)
+ Allow FSNamesystem lock fairness to be configurable
+
+- HDFS-5220.
+ Major improvement reported by Rob Weltman and fixed by Jimmy Xiang (namenode)
+ Expose group resolution time as metric
+
+- HDFS-5207.
+ Major improvement reported by Junping Du and fixed by Junping Du (namenode)
+ In BlockPlacementPolicy, update 2 parameters of chooseTarget()
+
+- HDFS-5188.
+ Major improvement reported by Tsz Wo (Nicholas), SZE and fixed by Tsz Wo (Nicholas), SZE (namenode)
+ Clean up BlockPlacementPolicy and its implementations
+
+- HDFS-5171.
+ Major sub-task reported by Brandon Li and fixed by Haohui Mai (nfs)
+ NFS should create input stream for a file and try to share it with multiple read requests
+
+- HDFS-5170.
+ Trivial bug reported by Andrew Wang and fixed by Andrew Wang
+ BlockPlacementPolicyDefault uses the wrong classname when alerting to enable debug logging
+
+- HDFS-5164.
+ Minor bug reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe (namenode)
+ deleteSnapshot should check if OperationCategory.WRITE is possible before taking write lock
+
+- HDFS-5144.
+ Minor improvement reported by Akira AJISAKA and fixed by Akira AJISAKA (documentation)
+ Document time unit to NameNodeMetrics.java
+
+- HDFS-5136.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ MNT EXPORT should give the full group list which can mount the exports
+
+- HDFS-5130.
+ Minor test reported by Binglin Chang and fixed by Binglin Chang (test)
+ Add test for snapshot related FsShell and DFSAdmin commands
+
+- HDFS-5122.
+ Major bug reported by Arpit Gupta and fixed by Haohui Mai (ha , webhdfs)
+ Support failover and retry in WebHdfsFileSystem for NN HA
+
+- HDFS-5110.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ Change FSDataOutputStream to HdfsDataOutputStream for opened streams to fix type cast error
+
+- HDFS-5107.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ Fix array copy error in Readdir and Readdirplus responses
+
+- HDFS-5104.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ Support dotdot name in NFS LOOKUP operation
+
+- HDFS-5093.
+ Minor bug reported by Chuan Liu and fixed by Chuan Liu (test)
+ TestGlobPaths should re-use the MiniDFSCluster to avoid failure on Windows
+
+- HDFS-5078.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ Support file append in NFSv3 gateway to enable data streaming to HDFS
+
+- HDFS-5075.
+ Major bug reported by Timothy St. Clair and fixed by Timothy St. Clair
+ httpfs-config.sh calls out incorrect env script name
+
+- HDFS-5074.
+ Major bug reported by Todd Lipcon and fixed by Todd Lipcon (ha , namenode)
+ Allow starting up from an fsimage checkpoint in the middle of a segment
+
+- HDFS-5073.
+ Minor bug reported by Kihwal Lee and fixed by Arpit Agarwal (test)
+ TestListCorruptFileBlocks fails intermittently
+
+- HDFS-5071.
+ Major sub-task reported by Kihwal Lee and fixed by Brandon Li (nfs)
+ Change hdfs-nfs parent project to hadoop-project
+
+- HDFS-5069.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ Include hadoop-nfs and hadoop-hdfs-nfs into hadoop dist for NFS deployment
+
+- HDFS-5068.
+ Major improvement reported by Konstantin Shvachko and fixed by Konstantin Shvachko (benchmarks)
+ Convert NNThroughputBenchmark to a Tool to allow generic options.
+
+- HDFS-5065.
+ Major bug reported by Ivan Mitic and fixed by Ivan Mitic (hdfs-client , test)
+ TestSymlinkHdfsDisable fails on Windows
+
+- HDFS-5043.
+ Major bug reported by Brandon Li and fixed by Brandon Li
+ For HdfsFileStatus, set default value of childrenNum to -1 instead of 0 to avoid confusing applications
+
+- HDFS-5037.
+ Critical improvement reported by Todd Lipcon and fixed by Andrew Wang (ha , namenode)
+ Active NN should trigger its own edit log rolls
+
+- HDFS-5035.
+ Major bug reported by Andrew Wang and fixed by Andrew Wang (namenode)
+ getFileLinkStatus and rename do not correctly check permissions of symlinks
+
+- HDFS-5034.
+ Trivial improvement reported by Andrew Wang and fixed by Andrew Wang (namenode)
+ Remove debug prints from getFileLinkInfo
+
+- HDFS-5023.
+ Major bug reported by Ravi Prakash and fixed by Mit Desai (snapshots , test)
+ TestSnapshotPathINodes.testAllowSnapshot is failing with jdk7
+
+- HDFS-5014.
+ Major bug reported by Vinayakumar B and fixed by Vinayakumar B (datanode , ha)
+ BPOfferService#processCommandFromActor() synchronization on namenode RPC call delays IBR to Active NN, if Standby NN is unstable
+
+- HDFS-5004.
+ Major improvement reported by Trevor Lorimer and fixed by Trevor Lorimer (namenode)
+ Add additional JMX bean for NameNode status data
+
+- HDFS-4997.
+ Major bug reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe (libhdfs)
+ libhdfs doesn't return correct error codes in most cases
+ libhdfs now returns correct codes in errno. Previously, due to a bug, many functions set errno to 255 instead of the more specific error code.
+- HDFS-4995.
+ Major bug reported by Kihwal Lee and fixed by Kihwal Lee (namenode)
+ Make getContentSummary() less expensive
+
+- HDFS-4994.
+ Minor bug reported by Kihwal Lee and fixed by Robert Parker (namenode)
+ Audit log getContentSummary() calls
+
+- HDFS-4983.
+ Major improvement reported by Harsh J and fixed by Yongjun Zhang (webhdfs)
+ Numeric usernames do not work with WebHDFS FS
+ Add a new configuration property "dfs.webhdfs.user.provider.user.pattern" for specifying user name filters for WebHDFS.
+- HDFS-4962.
+ Minor sub-task reported by Tsz Wo (Nicholas), SZE and fixed by Tsz Wo (Nicholas), SZE (nfs)
+ Use enum for nfs constants
+
+- HDFS-4949.
+ Major new feature reported by Andrew Wang and fixed by Andrew Wang (datanode , namenode)
+ Centralized cache management in HDFS
+
+- HDFS-4948.
+ Major bug reported by Robert Joseph Evans and fixed by Brandon Li
+ mvn site for hadoop-hdfs-nfs fails
+
+- HDFS-4947.
+ Major sub-task reported by Brandon Li and fixed by Jing Zhao (nfs)
+ Add NFS server export table to control export by hostname or IP range
+
+- HDFS-4885.
+ Major sub-task reported by Junping Du and fixed by Junping Du
+ Update verifyBlockPlacement() API in BlockPlacementPolicy
+
+- HDFS-4879.
+ Major improvement reported by Todd Lipcon and fixed by Todd Lipcon (namenode)
+ Add "blocked ArrayList" collection to avoid CMS full GCs
+
+- HDFS-4860.
+ Major improvement reported by Trevor Lorimer and fixed by Trevor Lorimer (namenode)
+ Add additional attributes to JMX beans
+
+- HDFS-4816.
+ Major bug reported by Andrew Wang and fixed by Andrew Wang (namenode)
+ transitionToActive blocks if the SBN is doing checkpoint image transfer
+
+- HDFS-4772.
+ Minor improvement reported by Brandon Li and fixed by Brandon Li (namenode)
+ Add number of children in HdfsFileStatus
+
+- HDFS-4763.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ Add script changes/utility for starting NFS gateway
+
+- HDFS-4762.
+ Major sub-task reported by Brandon Li and fixed by Brandon Li (nfs)
+ Provide HDFS based NFSv3 and Mountd implementation
+
+- HDFS-4657.
+ Major bug reported by Aaron T. Myers and fixed by Aaron T. Myers (namenode)
+ Limit the number of blocks logged by the NN after a block report to a configurable value.
+
+- HDFS-4633.
+ Major bug reported by Chris Nauroth and fixed by Chris Nauroth (hdfs-client , test)
+ TestDFSClientExcludedNodes fails sporadically if excluded nodes cache expires too quickly
+
+- HDFS-4517.
+ Major test reported by Vadim Bondarev and fixed by Ivan A. Veselovsky
+ Cover class RemoteBlockReader with unit tests
+
+- HDFS-4516.
+ Critical bug reported by Uma Maheswara Rao G and fixed by Vinayakumar B (namenode)
+ Client crash after block allocation and NN switch before lease recovery for the same file can cause readers to fail forever
+
+- HDFS-4512.
+ Major test reported by Vadim Bondarev and fixed by Vadim Bondarev
+ Cover package org.apache.hadoop.hdfs.server.common with tests
+
+- HDFS-4511.
+ Major test reported by Vadim Bondarev and fixed by Andrey Klochkov
+ Cover package org.apache.hadoop.hdfs.tools with unit test
+
+- HDFS-4510.
+ Major test reported by Vadim Bondarev and fixed by Andrey Klochkov
+ Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests
+
+- HDFS-4491.
+ Major test reported by Tsuyoshi OZAWA and fixed by Andrey Klochkov (test)
+ Parallel testing HDFS
+
+- HDFS-4376.
+ Major bug reported by Aaron T. Myers and fixed by Junping Du (balancer)
+ Fix several race conditions in Balancer and resolve intermittent timeout of TestBalancerWithNodeGroup
+
+- HDFS-4329.
+ Major bug reported by Andy Isaacson and fixed by Cristina L. Abad (hdfs-client)
+ DFSShell issues with directories with spaces in name
+
+- HDFS-4278.
+ Major improvement reported by Harsh J and fixed by Kousuke Saruta (datanode , namenode)
+ Log an ERROR when DFS_BLOCK_ACCESS_TOKEN_ENABLE config is disabled but security is turned on.
+
+- HDFS-4201.
+ Critical bug reported by Eli Collins and fixed by Jimmy Xiang (namenode)
+ NPE in BPServiceActor#sendHeartBeat
+
+- HDFS-4096.
+ Major sub-task reported by Jing Zhao and fixed by Haohui Mai (datanode , namenode)
+ Add snapshot information to namenode WebUI
+
+- HDFS-3987.
+ Major sub-task reported by Alejandro Abdelnur and fixed by Haohui Mai
+ Support webhdfs over HTTPS
+
+- HDFS-3981.
+ Major bug reported by Xiaobo Peng and fixed by Xiaobo Peng (namenode)
+ access time is set without holding FSNamesystem write lock
+
+- HDFS-3934.
+ Minor bug reported by Andy Isaacson and fixed by Colin Patrick McCabe
+ duplicative dfs_hosts entries handled wrong
+
+- HDFS-2933.
+ Major improvement reported by Philip Zeyliger and fixed by Vivek Ganesan (datanode)
+ Improve DataNode Web UI Index Page
+
+- HADOOP-10317.
+ Major bug reported by Andrew Wang and fixed by Andrew Wang
+ Rename branch-2.3 release version from 2.4.0-SNAPSHOT to 2.3.0-SNAPSHOT
+
+- HADOOP-10313.
+ Major bug reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (build)
+ Script and jenkins job to produce Hadoop release artifacts
+
+- HADOOP-10311.
+ Blocker bug reported by Suresh Srinivas and fixed by Alejandro Abdelnur
+ Cleanup vendor names from the code base
+
+- HADOOP-10310.
+ Blocker bug reported by Aaron T. Myers and fixed by Aaron T. Myers (security)
+ SaslRpcServer should be initialized even when no secret manager present
+
+- HADOOP-10305.
+ Major bug reported by Akira AJISAKA and fixed by Akira AJISAKA (metrics)
+ Add "rpc.metrics.quantile.enable" and "rpc.metrics.percentiles.intervals" to core-default.xml
+
+- HADOOP-10292.
+ Major bug reported by Haohui Mai and fixed by Haohui Mai
+ Restore HttpServer from branch-2.2 in branch-2
+
+- HADOOP-10291.
+ Major bug reported by Mit Desai and fixed by Mit Desai
+ TestSecurityUtil#testSocketAddrWithIP fails
+
+- HADOOP-10288.
+ Major bug reported by Todd Lipcon and fixed by Todd Lipcon (util)
+ Explicit reference to Log4JLogger breaks non-log4j users
+
+- HADOOP-10274.
+ Minor improvement reported by takeshi.miao and fixed by takeshi.miao (security)
+ Lower the logging level from ERROR to WARN for UGI.doAs method
+
+- HADOOP-10273.
+ Major bug reported by Arpit Agarwal and fixed by Arpit Agarwal (build)
+ Fix 'mvn site'
+
+- HADOOP-10255.
+ Blocker bug reported by Haohui Mai and fixed by Haohui Mai
+ Rename HttpServer to HttpServer2 to retain older HttpServer in branch-2 for compatibility
+
+- HADOOP-10252.
+ Major bug reported by Jimmy Xiang and fixed by Jimmy Xiang
+ HttpServer can't start if hostname is not specified
+
+- HADOOP-10250.
+ Major bug reported by Yongjun Zhang and fixed by Yongjun Zhang
+ VersionUtil returns wrong value when comparing two versions
+
+- HADOOP-10248.
+ Major improvement reported by Ted Yu and fixed by Akira AJISAKA
+ Property name should be included in the exception where property value is null
+
+- HADOOP-10240.
+ Trivial bug reported by Chris Nauroth and fixed by Chris Nauroth (documentation)
+ Windows build instructions incorrectly state requirement of protoc 2.4.1 instead of 2.5.0
+
+- HADOOP-10236.
+ Trivial bug reported by Akira AJISAKA and fixed by Akira AJISAKA
+ Fix typo in o.a.h.ipc.Client#checkResponse
+
+- HADOOP-10235.
+ Major bug reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (build)
+ Hadoop tarball has 2 versions of stax-api JARs
+
+- HADOOP-10234.
+ Major bug reported by Chris Nauroth and fixed by Chris Nauroth (scripts)
+ "hadoop.cmd jar" does not propagate exit code.
+
+- HADOOP-10228.
+ Minor improvement reported by Haohui Mai and fixed by Haohui Mai (fs)
+ FsPermission#fromShort() should cache FsAction.values()
+
+- HADOOP-10223.
+ Minor bug reported by Ted Yu and fixed by Ted Yu
+ MiniKdc#main() should close the FileReader it creates
+
+- HADOOP-10214.
+ Major bug reported by Liang Xie and fixed by Liang Xie (ha)
+ Fix multithreaded correctness warnings in ActiveStandbyElector
+
+- HADOOP-10212.
+ Major bug reported by Akira AJISAKA and fixed by Akira AJISAKA (documentation)
+ Incorrect compile command in Native Library document
+
+- HADOOP-10208.
+ Trivial improvement reported by Benoy Antony and fixed by Benoy Antony
+ Remove duplicate initialization in StringUtils.getStringCollection
+
+- HADOOP-10207.
+ Minor test reported by Jimmy Xiang and fixed by Jimmy Xiang
+ TestUserGroupInformation#testLogin is flaky
+
+- HADOOP-10203.
+ Major bug reported by Andrei Savu and fixed by Andrei Savu (fs/s3)
+ Connection leak in Jets3tNativeFileSystemStore#retrieveMetadata
+
+- HADOOP-10198.
+ Minor improvement reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe (native)
+ DomainSocket: add support for socketpair
+
+- HADOOP-10193.
+ Minor bug reported by Gregory Chanan and fixed by Gregory Chanan (security)
+ hadoop-auth's PseudoAuthenticationHandler can consume getInputStream
+
+- HADOOP-10178.
+ Major bug reported by shanyu zhao and fixed by shanyu zhao (conf)
+ Configuration deprecation always emits "deprecated" warnings when a new key is used
+
+- HADOOP-10175.
+ Major bug reported by Chuan Liu and fixed by Chuan Liu (fs)
+ Har files system authority should preserve userinfo
+
+- HADOOP-10173.
+ Critical improvement reported by Daryn Sharp and fixed by Daryn Sharp (ipc)
+ Remove UGI from DIGEST-MD5 SASL server creation
+
+- HADOOP-10172.
+ Critical improvement reported by Daryn Sharp and fixed by Daryn Sharp (ipc)
+ Cache SASL server factories
+
+- HADOOP-10171.
+ Major bug reported by Mit Desai and fixed by Mit Desai
+ TestRPC fails intermittently on jdk7
+
+- HADOOP-10169.
+ Minor improvement reported by Liang Xie and fixed by Liang Xie (metrics)
+ remove the unnecessary synchronized in JvmMetrics class
+
+- HADOOP-10168.
+ Major bug reported by Thejas M Nair and fixed by Thejas M Nair
+ fix javadoc of ReflectionUtils.copy
+
+- HADOOP-10167.
+ Major improvement reported by Mikhail Antonov and fixed by (build)
+ Mark hadoop-common source as UTF-8 in Maven pom files / refactoring
+
+- HADOOP-10164.
+ Major improvement reported by Robert Joseph Evans and fixed by Robert Joseph Evans
+ Allow UGI to login with a known Subject
+
+- HADOOP-10162.
+ Major bug reported by Mit Desai and fixed by Mit Desai
+ Fix symlink-related test failures in TestFileContextResolveAfs and TestStat in branch-2
+
+- HADOOP-10147.
+ Minor bug reported by Eric Sirianni and fixed by Steve Loughran (build)
+ Upgrade to commons-logging 1.1.3 to avoid potential deadlock in MiniDFSCluster
+
+- HADOOP-10146.
+ Critical bug reported by Daryn Sharp and fixed by Daryn Sharp (util)
+ Workaround JDK7 Process fd close bug
+
+- HADOOP-10143.
+ Major improvement reported by Liang Xie and fixed by Liang Xie (io)
+ replace WritableFactories's hashmap with ConcurrentHashMap
+
+- HADOOP-10142.
+ Major bug reported by Vinayakumar B and fixed by Vinayakumar B
+ Avoid groups lookup for unprivileged users such as "dr.who"
+
+- HADOOP-10135.
+ Major bug reported by David Dobbins and fixed by David Dobbins (fs)
+ writes to swift fs over partition size leave temp files and empty output file
+
+- HADOOP-10132.
+ Minor improvement reported by Ted Yu and fixed by Ted Yu
+ RPC#stopProxy() should log the class of proxy when IllegalArgumentException is encountered
+
+- HADOOP-10130.
+ Minor bug reported by Binglin Chang and fixed by Binglin Chang
+ RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics
+
+- HADOOP-10129.
+ Critical bug reported by Daryn Sharp and fixed by Daryn Sharp (tools/distcp)
+ Distcp may succeed when it fails
+
+- HADOOP-10127.
+ Major bug reported by Karthik Kambatla and fixed by Karthik Kambatla (ipc)
+ Add ipc.client.connect.retry.interval to control the frequency of connection retries
+
+- HADOOP-10126.
+ Minor bug reported by Vinayakumar B and fixed by Vinayakumar B (util)
+ LightWeightGSet log message is confusing : "2.0% max memory = 2.0 GB"
+
+- HADOOP-10125.
+ Major bug reported by Ming Ma and fixed by Ming Ma (ipc)
+ no need to process RPC request if the client connection has been dropped
+
+- HADOOP-10112.
+ Major bug reported by Brandon Li and fixed by Brandon Li (tools)
+ har file listing doesn't work with wild card
+
+- HADOOP-10111.
+ Major improvement reported by Kihwal Lee and fixed by Kihwal Lee
+ Allow DU to be initialized with an initial value
+
+- HADOOP-10110.
+ Blocker bug reported by Chuan Liu and fixed by Chuan Liu (build)
+ hadoop-auth has a build break due to missing dependency
+
+- HADOOP-10109.
+ Major sub-task reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe (test)
+ Fix test failure in TestOfflineEditsViewer introduced by HADOOP-10052
+
+- HADOOP-10107.
+ Major sub-task reported by Tsz Wo (Nicholas), SZE and fixed by Kihwal Lee (ipc)
+ Server.getNumOpenConnections may throw NPE
+
+- HADOOP-10106.
+ Minor bug reported by Ming Ma and fixed by Ming Ma
+ Incorrect thread name in RPC log messages
+
+- HADOOP-10103.
+ Minor sub-task reported by Steve Loughran and fixed by Akira AJISAKA (build)
+ update commons-lang to 2.6
+
+- HADOOP-10102.
+ Minor sub-task reported by Steve Loughran and fixed by Akira AJISAKA (build)
+ update commons IO from 2.1 to 2.4
+
+- HADOOP-10100.
+ Major bug reported by Robert Kanter and fixed by Robert Kanter
+ MiniKDC shouldn't use apacheds-all artifact
+
+- HADOOP-10095.
+ Minor improvement reported by Nicolas Liochon and fixed by Nicolas Liochon (io)
+ Performance improvement in CodecPool
+
+- HADOOP-10094.
+ Trivial bug reported by Enis Soztutar and fixed by Enis Soztutar (util)
+ NPE in GenericOptionsParser#preProcessForWindows()
+
+- HADOOP-10093.
+ Major bug reported by shanyu zhao and fixed by shanyu zhao (conf)
+ hadoop-env.cmd sets HADOOP_CLIENT_OPTS with a max heap size that is too small.
+
+- HADOOP-10090.
+ Major bug reported by Ivan Mitic and fixed by Ivan Mitic (metrics)
+ Jobtracker metrics not updated properly after execution of a mapreduce job
+
+- HADOOP-10088.
+ Major bug reported by Raja Aluri and fixed by Raja Aluri (build)
+ copy-nativedistlibs.sh needs to quote snappy lib dir
+
+- HADOOP-10087.
+ Major bug reported by Yu Gao and fixed by Colin Patrick McCabe (security)
+ UserGroupInformation.getGroupNames() fails to return primary group first when JniBasedUnixGroupsMappingWithFallback is used
+
+- HADOOP-10086.
+ Minor improvement reported by Masatake Iwasaki and fixed by Masatake Iwasaki (documentation)
+ User document for authentication in secure cluster
+
+- HADOOP-10081.
+ Critical bug reported by Jason Lowe and fixed by Tsuyoshi OZAWA (ipc)
+ Client.setupIOStreams can leak socket resources on exception or error
+
+- HADOOP-10079.
+ Major improvement reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe
+ log a warning message if group resolution takes too long.
+
+- HADOOP-10078.
+ Minor bug reported by Robert Kanter and fixed by Robert Kanter (security)
+ KerberosAuthenticator always does SPNEGO
+
+- HADOOP-10072.
+ Trivial bug reported by Chris Nauroth and fixed by Chris Nauroth (nfs , test)
+ TestNfsExports#testMultiMatchers fails due to non-deterministic timing around cache expiry check.
+
+- HADOOP-10067.
+ Minor improvement reported by Robert Rati and fixed by Robert Rati
+ Missing POM dependency on jsr305
+
+- HADOOP-10064.
+ Major improvement reported by Arpit Agarwal and fixed by Arpit Agarwal (build)
+ Upgrade to maven antrun plugin version 1.7
+
+- HADOOP-10058.
+ Minor bug reported by Akira AJISAKA and fixed by Chen He (metrics)
+ TestMetricsSystemImpl#testInitFirstVerifyStopInvokedImmediately fails on trunk
+
+- HADOOP-10055.
+ Trivial bug reported by Eli Collins and fixed by Akira AJISAKA (documentation)
+ FileSystemShell.apt.vm doc has typo "numRepicas"
+
+- HADOOP-10052.
+ Major sub-task reported by Andrew Wang and fixed by Andrew Wang (fs)
+ Temporarily disable client-side symlink resolution
+
+- HADOOP-10047.
+ Major new feature reported by Gopal V and fixed by Gopal V (io)
+ Add a directbuffer Decompressor API to hadoop
+ Direct ByteBuffer decompressors for Zlib (Deflate & Gzip) and Snappy
+- HADOOP-10046.
+ Trivial improvement reported by David S. Wang and fixed by David S. Wang
+ Print a log message when SSL is enabled
+
+- HADOOP-10040.
+ Major bug reported by Yingda Chen and fixed by Chris Nauroth
+ hadoop.cmd is in UNIX format and would not run by default on Windows
+
+- HADOOP-10039.
+ Major bug reported by Suresh Srinivas and fixed by Haohui Mai (security)
+ Add Hive to the list of projects using AbstractDelegationTokenSecretManager
+
+- HADOOP-10031.
+ Major bug reported by Chuan Liu and fixed by Chuan Liu (fs)
+ FsShell -get/copyToLocal/moveFromLocal should support Windows local path
+
+- HADOOP-10030.
+ Major bug reported by Chuan Liu and fixed by Chuan Liu
+ FsShell -put/copyFromLocal should support Windows local path
+
+- HADOOP-10029.
+ Major bug reported by Suresh Srinivas and fixed by Suresh Srinivas (fs)
+ Specifying har file to MR job fails in secure cluster
+
+- HADOOP-10028.
+ Minor bug reported by Jing Zhao and fixed by Haohui Mai
+ Malformed ssl-server.xml.example
+
+- HADOOP-10006.
+ Blocker bug reported by Junping Du and fixed by Junping Du (fs , util)
+ Compilation failure in trunk for o.a.h.fs.swift.util.JSONUtil
+
+- HADOOP-10005.
+ Trivial improvement reported by Jackie Chang and fixed by Jackie Chang
+ No need to check INFO severity level is enabled or not
+
+- HADOOP-9998.
+ Major improvement reported by Junping Du and fixed by Junping Du (net)
+ Provide methods to clear only part of the DNSToSwitchMapping
+
+- HADOOP-9982.
+ Major bug reported by Akira AJISAKA and fixed by Akira AJISAKA (documentation)
+ Fix dead links in hadoop site docs
+
+- HADOOP-9981.
+ Critical bug reported by Kihwal Lee and fixed by Colin Patrick McCabe
+ globStatus should minimize its listStatus and getFileStatus calls
+
+- HADOOP-9964.
+ Major bug reported by Junping Du and fixed by Junping Du (util)
+ O.A.H.U.ReflectionUtils.printThreadInfo() is not thread-safe, which causes TestHttpServer to hang for 10 minutes or longer.
+
+- HADOOP-9956.
+ Major sub-task reported by Daryn Sharp and fixed by Daryn Sharp (ipc)
+ RPC listener inefficiently assigns connections to readers
+
+- HADOOP-9955.
+ Major sub-task reported by Daryn Sharp and fixed by Daryn Sharp (ipc)
+ RPC idle connection closing is extremely inefficient
+
+- HADOOP-9929.
+ Major bug reported by Jason Lowe and fixed by Colin Patrick McCabe (fs)
+ Insufficient permissions for a path reported as file not found
+
+- HADOOP-9915.
+ Trivial improvement reported by Binglin Chang and fixed by Binglin Chang
+ o.a.h.fs.Stat support on Macosx
+
+- HADOOP-9909.
+ Major improvement reported by Shinichi Yamashita and fixed by (fs)
+ org.apache.hadoop.fs.Stat should permit other LANG
+
+- HADOOP-9908.
+ Major bug reported by Todd Lipcon and fixed by Todd Lipcon (util)
+ Fix NPE when versioninfo properties file is missing
+
+- HADOOP-9898.
+ Minor bug reported by Todd Lipcon and fixed by Todd Lipcon (ipc , net)
+ Set SO_KEEPALIVE on all our sockets
+
+- HADOOP-9897.
+ Trivial improvement reported by Binglin Chang and fixed by Binglin Chang (fs)
+ Add method to get path start position without drive specifier in o.a.h.fs.Path
+
+- HADOOP-9889.
+ Major bug reported by Wei Yan and fixed by Wei Yan
+ Refresh the Krb5 configuration when creating a new kdc in Hadoop-MiniKDC
+
+- HADOOP-9887.
+ Major bug reported by Chris Nauroth and fixed by Chuan Liu (fs)
+ globStatus does not correctly handle paths starting with a drive spec on Windows
+
+- HADOOP-9875.
+ Minor bug reported by Aaron T. Myers and fixed by Aaron T. Myers (test)
+ TestDoAsEffectiveUser can fail on JDK 7
+
+- HADOOP-9871.
+ Minor bug reported by Luke Lu and fixed by Junping Du
+ Fix intermittent findbug warnings in DefaultMetricsSystem
+
+- HADOOP-9866.
+ Major test reported by Alejandro Abdelnur and fixed by Wei Yan (test)
+ convert hadoop-auth testcases requiring kerberos to use minikdc
+
+- HADOOP-9865.
+ Major bug reported by Chuan Liu and fixed by Chuan Liu
+ FileContext.globStatus() has a regression with respect to relative path
+
+- HADOOP-9860.
+ Major improvement reported by Wei Yan and fixed by Wei Yan
+ Remove classes HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once JIRA DIRSERVER-1882 is solved
+
+- HADOOP-9848.
+ Major new feature reported by Wei Yan and fixed by Wei Yan (security , test)
+ Create a MiniKDC for use with security testing
+
+- HADOOP-9847.
+ Minor bug reported by Andrew Wang and fixed by Colin Patrick McCabe
+ TestGlobPath symlink tests fail to cleanup properly
+
+- HADOOP-9833.
+ Minor improvement reported by Steve Loughran and fixed by Kousuke Saruta (build)
+ move slf4j to version 1.7.5
+
+- HADOOP-9830.
+ Trivial bug reported by Dmitry Lysnichenko and fixed by Kousuke Saruta (documentation)
+ Typo at http://hadoop.apache.org/docs/current/
+
+- HADOOP-9820.
+ Blocker bug reported by Daryn Sharp and fixed by Daryn Sharp (ipc , security)
+ RPCv9 wire protocol is insufficient to support multiplexing
+
+- HADOOP-9817.
+ Major bug reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe
+ FileSystem#globStatus and FileContext#globStatus need to work with symlinks
+
+- HADOOP-9806.
+ Major bug reported by Brandon Li and fixed by Brandon Li (nfs)
+ PortmapInterface should check if the procedure is out-of-range
+
+- HADOOP-9791.
+ Major bug reported by Ivan Mitic and fixed by Ivan Mitic
+ Add a test case covering long paths for new FileUtil access check methods
+
+- HADOOP-9787.
+ Major bug reported by Karthik Kambatla and fixed by Karthik Kambatla (util)
+ ShutdownHelper util to shutdown threads and threadpools
+
+- HADOOP-9784.
+ Major improvement reported by Junping Du and fixed by Junping Du
+ Add a builder for HttpServer
+
+- HADOOP-9748.
+ Critical sub-task reported by Daryn Sharp and fixed by Daryn Sharp (security)
+ Reduce blocking on UGI.ensureInitialized
+
+- HADOOP-9703.
+ Minor bug reported by Mark Miller and fixed by Tsuyoshi OZAWA
+ org.apache.hadoop.ipc.Client leaks threads on stop.
+
+- HADOOP-9698.
+ Blocker sub-task reported by Daryn Sharp and fixed by Daryn Sharp (ipc)
+ RPCv9 client must honor server's SASL negotiate response
+ The RPC client now waits for the Server's SASL negotiate response before instantiating its SASL client.
+- HADOOP-9693.
+ Trivial improvement reported by Steve Loughran and fixed by
+ Shell should add a probe for OSX
+
+- HADOOP-9686.
+ Major improvement reported by Jason Lowe and fixed by Jason Lowe (conf)
+ Easy access to final parameters in Configuration
+
+- HADOOP-9683.
+ Blocker sub-task reported by Luke Lu and fixed by Daryn Sharp (ipc)
+ Wrap IpcConnectionContext in RPC headers
+ Connection context is now sent as an RPC-header-wrapped protobuf.
+- HADOOP-9660.
+ Major bug reported by Enis Soztutar and fixed by Enis Soztutar (scripts , util)
+ [WINDOWS] Powershell / cmd parses -Dkey=value from command line as [-Dkey, value] which breaks GenericOptionsParser
+
+- HADOOP-9652.
+ Major improvement reported by Colin Patrick McCabe and fixed by Andrew Wang
+ Allow RawLocalFs#getFileLinkStatus to fill in the link owner and mode if requested
+
+- HADOOP-9635.
+ Major bug reported by V. Karthik Kumar and fixed by (native)
+ Fix Potential Stack Overflow in DomainSocket.c
+
+- HADOOP-9623.
+ Major improvement reported by Timothy St. Clair and fixed by Amandeep Khurana (fs/s3)
+ Update jets3t dependency to 0.9.0
+
+- HADOOP-9618.
+ Major new feature reported by Todd Lipcon and fixed by Todd Lipcon (util)
+ Add thread which detects JVM pauses
+
+- HADOOP-9611.
+ Major improvement reported by Timothy St. Clair and fixed by Timothy St. Clair (build)
+ mvn-rpmbuild against google-guice > 3.0 yields missing cglib dependency
+
+- HADOOP-9598.
+ Major test reported by Aleksey Gorshkov and fixed by Andrey Klochkov
+ Improve code coverage of RMAdminCLI
+
+- HADOOP-9594.
+ Major improvement reported by Timothy St. Clair and fixed by Timothy St. Clair (build)
+ Update apache commons math dependency
+
+- HADOOP-9582.
+ Major bug reported by Ashwin Shankar and fixed by Ashwin Shankar (conf)
+ Non-existent file to "hadoop fs -conf" doesn't throw error
+
+- HADOOP-9527.
+ Major bug reported by Arpit Agarwal and fixed by Arpit Agarwal (fs , test)
+ Add symlink support to LocalFileSystem on Windows
+
+- HADOOP-9515.
+ Major new feature reported by Brandon Li and fixed by Brandon Li
+ Add general interface for NFS and Mount
+
+- HADOOP-9509.
+ Major new feature reported by Brandon Li and fixed by Brandon Li
+ Implement ONCRPC and XDR
+
+- HADOOP-9494.
+ Major improvement reported by Dennis Y and fixed by Andrey Klochkov
+ Excluded auto-generated and examples code from clover reports
+
+- HADOOP-9487.
+ Major improvement reported by Steve Loughran and fixed by (conf)
+ Deprecation warnings in Configuration should go to their own log or otherwise be suppressible
+
+- HADOOP-9470.
+ Major improvement reported by Ivan A. Veselovsky and fixed by Ivan A. Veselovsky (test)
+ eliminate duplicate FQN tests in different Hadoop modules
+
+- HADOOP-9432.
+ Minor new feature reported by Steve Loughran and fixed by (build , documentation)
+ Add support for markdown .md files in site documentation
+
+- HADOOP-9421.
+ Blocker sub-task reported by Sanjay Radia and fixed by Daryn Sharp
+ Convert SASL to use ProtoBuf and provide negotiation capabilities
+ Raw SASL protocol now uses protobufs wrapped with RPC headers.
+The negotiation sequence incorporates the state of the exchange.
+The server now has the ability to advertise its supported auth types.
+- HADOOP-9420.
+ Major bug reported by Todd Lipcon and fixed by Liang Xie (ipc , metrics)
+ Add percentile or max metric for rpcQueueTime, processing time
+
+- HADOOP-9417.
+ Major sub-task reported by Andrew Wang and fixed by Andrew Wang (fs)
+ Support for symlink resolution in LocalFileSystem / RawLocalFileSystem
+
+- HADOOP-9350.
+ Minor bug reported by Steve Loughran and fixed by Robert Kanter (build)
+ Hadoop not building against Java7 on OSX
+
+- HADOOP-9319.
+ Major improvement reported by Arpit Agarwal and fixed by Binglin Chang
+ Update bundled lz4 source to latest version
+
+- HADOOP-9291.
+ Major test reported by Ivan A. Veselovsky and fixed by Ivan A. Veselovsky
+ enhance unit-test coverage of package o.a.h.metrics2
+
+- HADOOP-9254.
+ Major test reported by Vadim Bondarev and fixed by Vadim Bondarev
+ Cover packages org.apache.hadoop.util.bloom, org.apache.hadoop.util.hash
+
+- HADOOP-9241.
+ Trivial improvement reported by Harsh J and fixed by Harsh J
+ DU refresh interval is not configurable
+ The 'du' (disk usage command from Unix) script refresh monitor is now configurable in the same way as its 'df' counterpart, via the property 'fs.du.interval', which defaults to 10 minutes (specified in ms).
+- HADOOP-9225.
+ Major test reported by Vadim Bondarev and fixed by Andrey Klochkov
+ Cover package org.apache.hadoop.compress.Snappy
+
+- HADOOP-9199.
+ Major test reported by Vadim Bondarev and fixed by Andrey Klochkov
+ Cover package org.apache.hadoop.io with unit tests
+
+- HADOOP-9114.
+ Minor bug reported by liuyang and fixed by sathish
+ After setting dfs.checksum.type to NULL, writing a file and calling hflush throws java.lang.ArrayIndexOutOfBoundsException
+
+- HADOOP-9078.
+ Major test reported by Ivan A. Veselovsky and fixed by Ivan A. Veselovsky
+ enhance unit-test coverage of class org.apache.hadoop.fs.FileContext
+
+- HADOOP-9063.
+ Minor test reported by Ivan A. Veselovsky and fixed by Ivan A. Veselovsky
+ enhance unit-test coverage of class org.apache.hadoop.fs.FileUtil
+
+- HADOOP-9016.
+ Minor bug reported by Ivan A. Veselovsky and fixed by Ivan A. Veselovsky
+ org.apache.hadoop.fs.HarFileSystem.HarFSDataInputStream.HarFsInputStream.skip(long) must never return negative value.
+
+- HADOOP-8814.
+ Minor improvement reported by Brandon Li and fixed by Brandon Li (conf , fs , fs/s3 , ha , io , metrics , performance , record , security , util)
+ Inefficient comparison with the empty string. Use isEmpty() instead
+
+- HADOOP-8753.
+ Minor bug reported by Nishan Shetty, Huawei and fixed by Benoy Antony
+ LocalDirAllocator throws "ArithmeticException: / by zero" when there is no available space on configured local dir
+
+- HADOOP-8704.
+ Major improvement reported by Thomas Graves and fixed by Jonathan Eagles
+ add request logging to jetty/httpserver
+
+- HADOOP-8545.
+ Major new feature reported by Tim Miller and fixed by Dmitry Mezhensky (fs)
+ Filesystem Implementation for OpenStack Swift
+ Added file system implementation for OpenStack Swift.
+There are two implementations: block and native (similar to the Amazon S3 integration).
+The data locality issue is solved by a patch in Swift; the procedure for committing it to OpenStack is in progress.
+
+To use the implementation, add the following to core-site.xml:
+...
+ <property>
+ <name>fs.swift.impl</name>
+ <value>com.mirantis.fs.SwiftFileSystem</value>
+ </property>
+ <property>
+ <name>fs.swift.block.impl</name>
+ <value>com.mirantis.fs.block.SwiftBlockFileSystem</value>
+ </property>
+...
+
+In a MapReduce job, specify the following configs for OpenStack Keystone authentication:
+conf.set("swift.auth.url", "http://172.18.66.117:5000/v2.0/tokens");
+conf.set("swift.tenant", "superuser");
+conf.set("swift.username", "admin1");
+conf.set("swift.password", "password");
+conf.setInt("swift.http.port", 8080);
+conf.setInt("swift.https.port", 443);
+
+Additional information is available on GitHub: https://github.com/DmitryMezhensky/Hadoop-and-Swift-integration
+- HADOOP-7344.
+ Major bug reported by Daryn Sharp and fixed by Colin Patrick McCabe (fs)
+ globStatus doesn't grok groupings with a slash
+
+
+