YARN-1696. Added documentation for ResourceManager fail-over. Contributed by Karthik Kambatla, Masatake Iwasaki, Tsuyoshi OZAWA.

svn merge --ignore-ancestry -c 1591416 ../../trunk/


git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1591417 13f79535-47bb-0310-9956-ffa450edef68
Vinod Kumar Vavilapalli 2014-04-30 18:01:05 +00:00
parent fdf685576e
commit 0a914b61e0
4 changed files with 238 additions and 0 deletions


@ -101,6 +101,7 @@
<item name="Capacity Scheduler" href="hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html"/>
<item name="Fair Scheduler" href="hadoop-yarn/hadoop-yarn-site/FairScheduler.html"/>
<item name="ResourceManager Restart" href="hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html"/>
<item name="ResourceManager HA" href="hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html"/>
<item name="Web Application Proxy" href="hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html"/>
<item name="YARN Timeline Server" href="hadoop-yarn/hadoop-yarn-site/TimelineServer.html"/>
<item name="Writing YARN Applications" href="hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html"/>


@ -77,6 +77,9 @@ Release 2.4.1 - UNRELEASED
YARN-1892. Improved some logs in the scheduler. (Jian He via zjshen)
YARN-1696. Added documentation for ResourceManager fail-over. (Karthik
Kambatla, Masatake Iwasaki, Tsuyoshi OZAWA via vinodkv)
OPTIMIZATIONS
BUG FIXES


@ -0,0 +1,234 @@
~~ Licensed under the Apache License, Version 2.0 (the "License");
~~ you may not use this file except in compliance with the License.
~~ You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License. See accompanying LICENSE file.
---
ResourceManager High Availability
---
---
${maven.build.timestamp}
ResourceManager High Availability
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
* Introduction
This guide provides an overview of High Availability of YARN's ResourceManager,
and details how to configure and use this feature. The ResourceManager (RM)
is responsible for tracking the resources in a cluster, and scheduling
applications (e.g., MapReduce jobs). Prior to Hadoop 2.4, the ResourceManager
was a single point of failure in a YARN cluster. The High Availability
feature adds redundancy in the form of an Active/Standby ResourceManager pair
to remove this single point of failure.
* Architecture
[images/rm-ha-overview.png] Overview of ResourceManager High Availability
** RM Failover
ResourceManager HA is realized through an Active/Standby architecture: at
any point in time, one of the RMs is Active, and one or more RMs are in
Standby mode waiting to take over should anything happen to the Active.
The trigger to transition to Active comes either from the admin (through the
CLI) or from the integrated failover controller when automatic failover is
enabled.
*** Manual transitions and failover
When automatic failover is not enabled, admins have to manually transition
one of the RMs to Active. To fail over from one RM to the other, they are
expected to first transition the Active RM to Standby
(<<<-transitionToStandby>>>) and then transition a Standby RM to Active
(<<<-transitionToActive>>>). All this can be done using the
"<<<yarn rmadmin>>>" CLI.
*** Automatic failover
The RMs have an option to embed the ZooKeeper-based ActiveStandbyElector to
decide which RM should be the Active. When the Active goes down or becomes
unresponsive, another RM is automatically elected to be the Active, which
then takes over. Note that there is no need to run a separate ZKFC daemon
as is the case for HDFS, because the ActiveStandbyElector embedded in the RMs
acts as both the failure detector and the leader elector.
*** Client, ApplicationMaster and NodeManager on RM failover
When there are multiple RMs, the configuration (yarn-site.xml) used by
clients and nodes is expected to list all the RMs. Clients,
ApplicationMasters (AMs) and NodeManagers (NMs) try connecting to the RMs in
a round-robin fashion until they hit the Active RM. If the Active goes down,
they resume the round-robin polling until they hit the "new" Active.
This default retry logic is implemented as
<<<org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider>>>.
You can override the logic by
implementing <<<org.apache.hadoop.yarn.client.RMFailoverProxyProvider>>> and
setting the value of <<<yarn.client.failover-proxy-provider>>> to
the class name.
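As a rough sketch of how this override is wired in yarn-site.xml, the
fragment below sets the provider explicitly to the default class named
above. A custom implementation would substitute its own fully qualified
class name (none is assumed here). The two sleep values are illustrative
placeholders, not defaults or recommendations.
+---+
<!-- Provider used by Clients, AMs and NMs to find the Active RM -->
<property>
<name>yarn.client.failover-proxy-provider</name>
<value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
</property>
<!-- Illustrative value only: base of the exponential delay between failovers -->
<property>
<name>yarn.client.failover-sleep-base-ms</name>
<value>500</value>
</property>
<!-- Illustrative value only: upper bound on the delay between failovers -->
<property>
<name>yarn.client.failover-sleep-max-ms</name>
<value>15000</value>
</property>
+---+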
** Recovering previous active-RM's state
With {{{./ResourceManagerRestart.html}ResourceManager Restart}} enabled,
the RM being promoted to the Active state loads the RM internal state and
continues to operate from where the previous Active left off, as far as
the RM restart feature allows. A new attempt is spawned for
each managed application previously submitted to the RM. Applications can
checkpoint periodically to avoid losing any work. The state-store must be
visible from both the Active and Standby RMs. Currently, there are two
RMStateStore implementations for persistence - FileSystemRMStateStore
and ZKRMStateStore. The <<<ZKRMStateStore>>> implicitly allows write access
to a single RM at any point in time, and hence is the recommended store to
use in an HA cluster. When using the ZKRMStateStore, there is no need for a
separate fencing mechanism to address a potential split-brain situation
where multiple RMs can potentially assume the Active role.
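A minimal sketch of selecting the ZKRMStateStore follows. The two property
names are not listed in this document; they are assumed from the
ResourceManager Restart document linked above, which should be treated as
the authority on state-store setup. The ZK quorum itself comes from
<<<yarn.resourcemanager.zk-address>>>, described under Configurations.
+---+
<!-- Assumed from the ResourceManager Restart document: turn on RM state recovery -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- Assumed from the ResourceManager Restart document: use the ZooKeeper-backed store -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
+---+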
* Deployment
** Configurations
Most of the failover functionality is tunable using various configuration
properties. The following is a list of the required/important ones.
yarn-default.xml carries the full list of knobs. See
{{{../hadoop-yarn-common/yarn-default.xml}yarn-default.xml}}
for more information, including default values.
See also {{{./ResourceManagerRestart.html}the document for ResourceManager
Restart}} for instructions on setting up the state-store.
*-------------------------+----------------------------------------------+
|| Configuration Property || Description |
*-------------------------+----------------------------------------------+
| yarn.resourcemanager.zk-address | |
| | Address of the ZK-quorum. |
| | Used both for the state-store and embedded leader-election. |
*-------------------------+----------------------------------------------+
| yarn.resourcemanager.ha.enabled | |
| | Enable RM HA. |
*-------------------------+----------------------------------------------+
| yarn.resourcemanager.ha.rm-ids | |
| | List of logical IDs for the RMs. |
| | e.g., "rm1,rm2" |
*-------------------------+----------------------------------------------+
| yarn.resourcemanager.hostname.<rm-id> | |
| | For each <rm-id>, specify the hostname the |
| | RM corresponds to. Alternately, one could set each of the RM's service |
| | addresses. |
*-------------------------+----------------------------------------------+
| yarn.resourcemanager.ha.id | |
| | Identifies the RM in the ensemble. This is optional; |
| | however, if set, admins have to ensure that all the RMs have their own |
| | IDs in the config |
*-------------------------+----------------------------------------------+
| yarn.resourcemanager.ha.automatic-failover.enabled | |
| | Enable automatic failover; |
| | By default, it is enabled only when HA is enabled. |
*-------------------------+----------------------------------------------+
| yarn.resourcemanager.ha.automatic-failover.embedded | |
| | Use embedded leader-elector |
| | to pick the Active RM, when automatic failover is enabled. By default, |
| | it is enabled only when HA is enabled. |
*-------------------------+----------------------------------------------+
| yarn.resourcemanager.cluster-id | |
| | Identifies the cluster. Used by the elector to |
| | ensure an RM doesn't take over as Active for another cluster. |
*-------------------------+----------------------------------------------+
| yarn.client.failover-proxy-provider | |
| | The class to be used by Clients, AMs and NMs to failover to the Active RM. |
*-------------------------+----------------------------------------------+
| yarn.client.failover-max-attempts | |
| | The max number of times FailoverProxyProvider should attempt failover. |
*-------------------------+----------------------------------------------+
| yarn.client.failover-sleep-base-ms | |
| | The sleep base (in milliseconds) to be used for calculating |
| | the exponential delay between failovers. |
*-------------------------+----------------------------------------------+
| yarn.client.failover-sleep-max-ms | |
| | The maximum sleep time (in milliseconds) between failovers |
*-------------------------+----------------------------------------------+
| yarn.client.failover-retries | |
| | The number of retries per attempt to connect to a ResourceManager. |
*-------------------------+----------------------------------------------+
| yarn.client.failover-retries-on-socket-timeouts | |
| | The number of retries per attempt to connect to a ResourceManager on socket timeouts. |
*-------------------------+----------------------------------------------+
*** Sample configurations
Here is a sample of a minimal setup for RM failover.
+---+
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>cluster1</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>master1</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>master2</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
+---+
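The minimal setup relies on defaults for automatic failover and lets each RM
work out its own ID. The sketch below makes those choices explicit, using
only properties from the table above. The value <<<rm1>>> for
<<<yarn.resourcemanager.ha.id>>> assumes this copy of yarn-site.xml is
deployed on master1; the copy on master2 would carry <<<rm2>>>.
+---+
<!-- Assumption: this file is the copy deployed on master1 -->
<property>
<name>yarn.resourcemanager.ha.id</name>
<value>rm1</value>
</property>
<!-- Explicitly enable automatic failover -->
<property>
<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- Use the embedded leader-elector to pick the Active RM -->
<property>
<name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
<value>true</value>
</property>
+---+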
** Admin commands
<<<yarn rmadmin>>> has a few HA-specific command options to check the
health/state of an RM, and to transition it to Active/Standby.
The HA commands take as their argument the service ID of an RM, as set by
<<<yarn.resourcemanager.ha.rm-ids>>>.
+---+
$ yarn rmadmin -getServiceState rm1
active
$ yarn rmadmin -getServiceState rm2
standby
+---+
If automatic failover is enabled, you cannot use the manual transition commands.
+---+
$ yarn rmadmin -transitionToStandby rm1
Automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@1d8299fd
Refusing to manually manage HA state, since it may cause
a split-brain scenario or other incorrect state.
If you are very sure you know what you are doing, please
specify the forcemanual flag.
+---+
See {{{./YarnCommands.html}YarnCommands}} for more details.
** ResourceManager Web UI services
Assuming a standby RM is up and running, the Standby automatically redirects
all web requests to the Active, except for the "About" page.
** Web Services
Assuming a standby RM is up and running, RM web-services described at
{{{./ResourceManagerRest.html}ResourceManager REST APIs}} when invoked on
a standby RM are automatically redirected to the Active RM.

Binary file not shown. Size: 30 KiB