diff --git a/hadoop-mapreduce-project/CHANGES.txt b/hadoop-mapreduce-project/CHANGES.txt
index 4d3ab334402..619dd0f86fe 100644
--- a/hadoop-mapreduce-project/CHANGES.txt
+++ b/hadoop-mapreduce-project/CHANGES.txt
@@ -392,6 +392,9 @@ Release 0.23.0 - Unreleased
     MAPREDUCE-3187. Add names for various unnamed threads in MR2.
     (Todd Lipcon and Siddharth Seth via mahadev)
 
+    MAPREDUCE-3136. Added documentation for setting up Hadoop clusters in both
+    non-secure and secure mode for both HDFS & YARN. (acmurthy)
+
   OPTIMIZATIONS
 
     MAPREDUCE-2026. Make JobTracker.getJobCounters() and
diff --git a/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm b/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm
index 0f1c3e406ad..ebd4cb70a15 100644
--- a/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm
+++ b/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm
@@ -141,7 +141,7 @@ Hadoop MapReduce Next Generation - Capacity Scheduler
 *--------------------------------------+--------------------------------------+
 | <<>> | |
 | | <<>> |
-*--------------------------------------------+--------------------------------------------+
+*--------------------------------------+--------------------------------------+
 
 * Setting up
diff --git a/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ClusterSetup.apt.vm b/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ClusterSetup.apt.vm
new file mode 100644
index 00000000000..874abf0fc18
--- /dev/null
+++ b/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ClusterSetup.apt.vm
@@ -0,0 +1,990 @@
+~~ Licensed under the Apache License, Version 2.0 (the "License");
+~~ you may not use this file except in compliance with the License.
+~~ You may obtain a copy of the License at
+~~
+~~   http://www.apache.org/licenses/LICENSE-2.0
+~~
+~~ Unless required by applicable law or agreed to in writing, software
+~~ distributed under the License is distributed on an "AS IS" BASIS,
+~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+~~ See the License for the specific language governing permissions and
+~~ limitations under the License. See accompanying LICENSE file.
+
+  ---
+  Hadoop Map Reduce Next Generation-${project.version} - Cluster Setup
+  ---
+  ---
+  ${maven.build.timestamp}
+
+Hadoop MapReduce Next Generation - Cluster Setup
+
+  \[ {{{./index.html}Go Back}} \]
+
+%{toc|section=1|fromDepth=0}
+
+* {Purpose}
+
+  This document describes how to install, configure and manage non-trivial
+  Hadoop clusters ranging from a few nodes to extremely large clusters
+  with thousands of nodes.
+
+  To play with Hadoop, you may first want to install it on a single
+  machine (see {{{SingleCluster}Single Node Setup}}).
+
+* {Prerequisites}
+
+  Download a stable version of Hadoop from Apache mirrors.
+
+* {Installation}
+
+  Installing a Hadoop cluster typically involves unpacking the software on all
+  the machines in the cluster or installing RPMs.
+
+  Typically one machine in the cluster is designated as the NameNode and
+  another machine as the ResourceManager, exclusively. These are the masters.
+
+  The rest of the machines in the cluster act as both DataNode and
+  NodeManager. These are the slaves.
+
+* {Running Hadoop in Non-Secure Mode}
+
+  The following sections describe how to configure a Hadoop cluster.
+
+  * {Configuration Files}
+
+    Hadoop configuration is driven by two types of important configuration
+    files:
+
+    * Read-only default configuration - <<<core-default.xml>>>,
+      <<<hdfs-default.xml>>>, <<<yarn-default.xml>>> and
+      <<<mapred-default.xml>>>.
+
+    * Site-specific configuration - <<conf/core-site.xml>>,
+      <<conf/hdfs-site.xml>>, <<conf/yarn-site.xml>> and
+      <<conf/mapred-site.xml>>.
+
+    Additionally, you can control the Hadoop scripts found in the bin/
+    directory of the distribution by setting site-specific values via the
+    <<conf/hadoop-env.sh>> and <<conf/yarn-env.sh>>.
+
+  * {Site Configuration}
+
+    To configure the Hadoop cluster you will need to configure the
+    <<<environment>>> in which the Hadoop daemons execute as well as the
+    <<<configuration parameters>>> for the Hadoop daemons.
+
+    The Hadoop daemons are NameNode/DataNode and ResourceManager/NodeManager.
+
+  * {Configuring Environment of Hadoop Daemons}
+
+    Administrators should use the <<conf/hadoop-env.sh>> and
+    <<conf/yarn-env.sh>> scripts to do site-specific customization of the
+    Hadoop daemons' process environment.
+
+    At the very least you should specify the <<<JAVA_HOME>>> so that it is
+    correctly defined on each remote node.
+
+    Administrators can configure individual daemons using the configuration
+    options shown below in the table:
+
+*--------------------------------------+--------------------------------------+
+|| Daemon                              || Environment Variable                |
+*--------------------------------------+--------------------------------------+
+| NameNode                             | HADOOP_NAMENODE_OPTS                 |
+*--------------------------------------+--------------------------------------+
+| DataNode                             | HADOOP_DATANODE_OPTS                 |
+*--------------------------------------+--------------------------------------+
+| Secondary NameNode                   | HADOOP_SECONDARYNAMENODE_OPTS        |
+*--------------------------------------+--------------------------------------+
+| ResourceManager                      | YARN_RESOURCEMANAGER_OPTS            |
+*--------------------------------------+--------------------------------------+
+| NodeManager                          | YARN_NODEMANAGER_OPTS                |
+*--------------------------------------+--------------------------------------+
+
+    For example, to configure the NameNode to use parallelGC, the following
+    statement should be added in hadoop-env.sh:
+
+----
+  export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC ${HADOOP_NAMENODE_OPTS}"
+----
+
+    Other useful configuration parameters that you can customize include:
+
+    * <<<HADOOP_LOG_DIR>>> / <<<YARN_LOG_DIR>>> - The directory where the
+      daemons' log files are stored. They are automatically created if they
+      don't exist.
+
+    * <<<HADOOP_HEAPSIZE>>> / <<<YARN_HEAPSIZE>>> - The maximum heap size to
+      use, in MB, e.g. 1000 for a 1000MB heap. This is used to configure the
+      heap size for the daemon. By default, the value is 1000.
+
+  * {Configuring the Hadoop Daemons in Non-Secure Mode}
+
+    This section deals with important parameters to be specified in
+    the given configuration files:
+
+    * <<<conf/core-site.xml>>>
+
+*-------------------------+-------------------------+------------------------+
+|| Parameter              || Value                  || Notes                 |
+*-------------------------+-------------------------+------------------------+
+| <<<fs.defaultFS>>>      | NameNode URI            | <hdfs://host:port/>    |
+*-------------------------+-------------------------+------------------------+
+| <<<io.file.buffer.size>>> | 131072 | |
+| | | Size of read/write buffer used in SequenceFiles.                        |
+*-------------------------+-------------------------+------------------------+
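+
+    For example, a minimal <<<conf/core-site.xml>>> carrying the two settings
+    above might look like the following sketch; the NameNode hostname and
+    port are illustrative, not prescribed values:
+
+----
+<configuration>
+  <!-- URI of the NameNode; hostname/port are site-specific examples -->
+  <property>
+    <name>fs.defaultFS</name>
+    <value>hdfs://namenode.example.com:8020/</value>
+  </property>
+  <!-- 128KB read/write buffer for SequenceFiles -->
+  <property>
+    <name>io.file.buffer.size</name>
+    <value>131072</value>
+  </property>
+</configuration>
+----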
+
+    * <<<conf/hdfs-site.xml>>>
+
+      * Configurations for NameNode:
+
+*-------------------------+-------------------------+------------------------+
+|| Parameter              || Value                  || Notes                 |
+*-------------------------+-------------------------+------------------------+
+| <<<dfs.namenode.name.dir>>> | | |
+| | Path on the local filesystem where the NameNode stores the namespace | |
+| | and transaction logs persistently. | |
+| | | If this is a comma-delimited list of directories then the name table is |
+| | | replicated in all of the directories, for redundancy.                   |
+*-------------------------+-------------------------+------------------------+
+| <<<dfs.hosts>>> / <<<dfs.hosts.exclude>>> | | |
+| | List of permitted/excluded DataNodes. | |
+| | | If necessary, use these files to control the list of allowable          |
+| | | datanodes.                                                              |
+*-------------------------+-------------------------+------------------------+
+| <<<dfs.blocksize>>> | 268435456 | |
+| | | HDFS blocksize of 256MB for large file-systems.                         |
+*-------------------------+-------------------------+------------------------+
+| <<<dfs.namenode.handler.count>>> | 100 | |
+| | | More NameNode server threads to handle RPCs from large number of        |
+| | | DataNodes.                                                              |
+*-------------------------+-------------------------+------------------------+
+
+      * Configurations for DataNode:
+
+*-------------------------+-------------------------+------------------------+
+|| Parameter              || Value                  || Notes                 |
+*-------------------------+-------------------------+------------------------+
+| <<<dfs.datanode.data.dir>>> | | |
+| | Comma separated list of paths on the local filesystem of a | |
+| | <<<DataNode>>> where it should store its blocks. | |
+| | | If this is a comma-delimited list of directories, then data will be     |
+| | | stored in all named directories, typically on different devices.        |
+*-------------------------+-------------------------+------------------------+
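+
+    A corresponding <<<conf/hdfs-site.xml>>> sketch follows; the local paths
+    are illustrative examples only:
+
+----
+<configuration>
+  <!-- Two directories so the name table is replicated for redundancy -->
+  <property>
+    <name>dfs.namenode.name.dir</name>
+    <value>/grid/hadoop/hdfs/nn,/grid1/hadoop/hdfs/nn</value>
+  </property>
+  <!-- 256MB blocksize for large file-systems -->
+  <property>
+    <name>dfs.blocksize</name>
+    <value>268435456</value>
+  </property>
+  <property>
+    <name>dfs.namenode.handler.count</name>
+    <value>100</value>
+  </property>
+  <!-- DataNode block storage, typically one path per physical device -->
+  <property>
+    <name>dfs.datanode.data.dir</name>
+    <value>/grid/hadoop/hdfs/dn,/grid1/hadoop/hdfs/dn</value>
+  </property>
+</configuration>
+----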
| +*-------------------------+-------------------------+------------------------+ +| <<>> | | | +| | Comma-separated list of paths on the local filesystem where | | +| | intermediate data is written. || +| | | Multiple paths help spread disk i/o. | +*-------------------------+-------------------------+------------------------+ +| <<>> | | | +| | Comma-separated list of paths on the local filesystem where logs | | +| | are written. | | +| | | Multiple paths help spread disk i/o. | +*-------------------------+-------------------------+------------------------+ +| <<>> | | | +| | mapreduce.shuffle | | +| | | Shuffle service that needs to be set for Map Reduce applications. | +*-------------------------+-------------------------+------------------------+ + + * <<>> + + * Configurations for MapReduce Applications: + +*-------------------------+-------------------------+------------------------+ +|| Parameter || Value || Notes | +*-------------------------+-------------------------+------------------------+ +| <<>> | | | +| | yarn | | +| | | Execution framework set to Hadoop YARN. | +*-------------------------+-------------------------+------------------------+ +| <<>> | 1536 | | +| | | Larger resource limit for maps. | +*-------------------------+-------------------------+------------------------+ +| <<>> | -Xmx1024M | | +| | | Larger heap-size for child jvms of maps. | +*-------------------------+-------------------------+------------------------+ +| <<>> | 3072 | | +| | | Larger resource limit for reduces. | +*-------------------------+-------------------------+------------------------+ +| <<>> | -Xmx2560M | | +| | | Larger heap-size for child jvms of reduces. | +*-------------------------+-------------------------+------------------------+ +| <<>> | 512 | | +| | | Higher memory-limit while sorting data for efficiency. | +*-------------------------+-------------------------+------------------------+ +| <<>> | 100 | | +| | | More streams merged at once while sorting files. | +*-------------------------+-------------------------+------------------------+ +| <<>> | 50 | | +| | | Higher number of parallel copies run by reduces to fetch outputs | +| | | from very large number of maps. | +*-------------------------+-------------------------+------------------------+ + + * Configurations for MapReduce JobHistory Server: + +*-------------------------+-------------------------+------------------------+ +|| Parameter || Value || Notes | +*-------------------------+-------------------------+------------------------+ +| <<>> | | | +| | MapReduce JobHistory Server | Default port is 10020. | +*-------------------------+-------------------------+------------------------+ +| <<>> | | | +| | MapReduce JobHistory Server Web UI | Default port is 19888. | +*-------------------------+-------------------------+------------------------+ +| <<>> | /mr-history/tmp | | +| | | Directory where history files are written by MapReduce jobs. | +*-------------------------+-------------------------+------------------------+ +| <<>> | /mr-history/done| | +| | | Directory where history files are managed by the MR JobHistory Server. | +*-------------------------+-------------------------+------------------------+ + + * Hadoop Rack Awareness + + The HDFS and the YARN components are rack-aware. + + The NameNode and the ResourceManager obtains the rack information of the + slaves in the cluster by invoking an API in an administrator + configured module. + + The API resolves the DNS name (also IP address) to a rack id. 
+
+    * <<<conf/mapred-site.xml>>>
+
+      * Configurations for MapReduce Applications:
+
+*-------------------------+-------------------------+------------------------+
+|| Parameter              || Value                  || Notes                 |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.framework.name>>> | yarn | |
+| | | Execution framework set to Hadoop YARN.                                 |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.map.memory.mb>>> | 1536 | |
+| | | Larger resource limit for maps.                                         |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.map.java.opts>>> | -Xmx1024M | |
+| | | Larger heap-size for child jvms of maps.                                |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.reduce.memory.mb>>> | 3072 | |
+| | | Larger resource limit for reduces.                                      |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.reduce.java.opts>>> | -Xmx2560M | |
+| | | Larger heap-size for child jvms of reduces.                             |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.task.io.sort.mb>>> | 512 | |
+| | | Higher memory-limit while sorting data for efficiency.                  |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.task.io.sort.factor>>> | 100 | |
+| | | More streams merged at once while sorting files.                        |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.reduce.shuffle.parallelcopies>>> | 50 | |
+| | | Higher number of parallel copies run by reduces to fetch outputs        |
+| | | from very large number of maps.                                         |
+*-------------------------+-------------------------+------------------------+
+
+      * Configurations for MapReduce JobHistory Server:
+
+*-------------------------+-------------------------+------------------------+
+|| Parameter              || Value                  || Notes                 |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.jobhistory.address>>> | | |
+| | MapReduce JobHistory Server <host:port> | Default port is 10020. |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.jobhistory.webapp.address>>> | | |
+| | MapReduce JobHistory Server Web UI <host:port> | Default port is 19888. |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.jobhistory.intermediate-done-dir>>> | /mr-history/tmp | |
+| | | Directory where history files are written by MapReduce jobs.            |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.jobhistory.done-dir>>> | /mr-history/done | |
+| | | Directory where history files are managed by the MR JobHistory Server.  |
+*-------------------------+-------------------------+------------------------+
+
+  * Hadoop Rack Awareness
+
+    The HDFS and the YARN components are rack-aware.
+
+    The NameNode and the ResourceManager obtain the rack information of the
+    slaves in the cluster by invoking an API in an administrator-configured
+    module.
+
+    The API resolves the slave's DNS name (also IP address) to a rack id.
+
+    The site-specific module to use can be configured using the configuration
+    item <<<topology.node.switch.mapping.impl>>>. The default implementation
+    of the same runs a script/command configured using
+    <<<topology.script.file.name>>>. If <<<topology.script.file.name>>> is
+    not set, the rack id </default-rack> is returned for any passed IP
+    address.
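+
+    For illustration, a trivial topology script (pointed to by
+    <<<topology.script.file.name>>>) could look racks up in a flat file; the
+    data file format and path below are examples, not a shipped convention:
+
+----
+#!/bin/bash
+# Arguments are DNS names or IP addresses; emit one rack id per argument.
+# /etc/hadoop/topology.data holds "node rack" pairs, one per line.
+while [ $# -gt 0 ] ; do
+  nodearg=$1 ; shift
+  rack=$(awk -v node="$nodearg" '$1 == node { print $2 }' \
+      /etc/hadoop/topology.data 2>/dev/null)
+  echo "${rack:-/default-rack}"
+done
+----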
+
+  * Monitoring Health of NodeManagers
+
+    Hadoop provides a mechanism by which administrators can configure the
+    NodeManager to run an administrator-supplied script periodically to
+    determine if a node is healthy or not.
+
+    Administrators can determine if the node is in a healthy state by
+    performing any checks of their choice in the script. If the script
+    detects the node to be in an unhealthy state, it must print a line to
+    standard output beginning with the string ERROR. The NodeManager spawns
+    the script periodically and checks its output. If the script's output
+    contains the string ERROR, as described above, the node's status is
+    reported as <unhealthy> and the node is black-listed by the
+    ResourceManager. No further tasks will be assigned to this node.
+    However, the NodeManager continues to run the script, so that if the
+    node becomes healthy again, it will be removed from the blacklisted nodes
+    on the ResourceManager automatically. The node's health along with the
+    output of the script, if it is unhealthy, is available to the
+    administrator in the ResourceManager web interface. The time since the
+    node was healthy is also displayed on the web interface.
+
+    The following parameters can be used to control the node health
+    monitoring script in <<<conf/yarn-site.xml>>>; a sample script is
+    sketched after the table.
+
+*-------------------------+-------------------------+------------------------+
+|| Parameter              || Value                  || Notes                 |
+*-------------------------+-------------------------+------------------------+
+| <<<yarn.nodemanager.health-checker.script.path>>> | | |
+| | Node health script | |
+| | | Script to check for node's health status.                               |
+*-------------------------+-------------------------+------------------------+
+| <<<yarn.nodemanager.health-checker.script.opts>>> | | |
+| | Node health script options | |
+| | | Options for script to check for node's health status.                   |
+*-------------------------+-------------------------+------------------------+
+| <<<yarn.nodemanager.health-checker.script.interval-ms>>> | | |
+| | Node health script interval | |
+| | | Time interval for running health script.                                |
+*-------------------------+-------------------------+------------------------+
+| <<<yarn.nodemanager.health-checker.script.timeout-ms>>> | | |
+| | Node health script timeout interval | |
+| | | Timeout for health script execution.                                    |
+*-------------------------+-------------------------+------------------------+
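+
+    As a minimal illustration, a health script might check local disk usage
+    and report ERROR past a threshold; the monitored path and the 90%
+    threshold are examples only:
+
+----
+#!/bin/bash
+# Report unhealthy if the log partition is over 90% full.
+usage=$(df -P /var/log/hadoop 2>/dev/null \
+    | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
+if [ -n "$usage" ] && [ "$usage" -gt 90 ] ; then
+  echo "ERROR: /var/log/hadoop is ${usage}% full"
+else
+  echo "OK"
+fi
+----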
+
+  * {Slaves file}
+
+    Typically you choose one machine in the cluster to act as the NameNode
+    and one machine to act as the ResourceManager, exclusively. The rest of
+    the machines act as both a DataNode and a NodeManager and are referred
+    to as <slaves>.
+
+    List all slave hostnames or IP addresses in your <<<conf/slaves>>> file,
+    one per line.
+
+  * {Logging}
+
+    Hadoop uses Apache log4j via the Apache Commons Logging framework for
+    logging. Edit the <<<conf/log4j.properties>>> file to customize the
+    Hadoop daemons' logging configuration (log-formats and so on).
+
+  * {Operating the Hadoop Cluster}
+
+    Once all the necessary configuration is complete, distribute the files to
+    the <<<HADOOP_CONF_DIR>>> directory on all the machines.
+
+  * Hadoop Startup
+
+    To start a Hadoop cluster you will need to start both the HDFS and YARN
+    cluster.
+
+    Format a new distributed filesystem:
+
+----
+  $ $HADOOP_HDFS_HOME/bin/hdfs namenode -format
+----
+
+    Start the HDFS with the following command, run on the designated
+    NameNode:
+
+----
+  $ $HADOOP_HDFS_HOME/bin/hdfs start namenode --config $HADOOP_CONF_DIR
+----
+
+    Run a script to start DataNodes on all slaves:
+
+----
+  $ $HADOOP_HDFS_HOME/bin/hdfs start datanode --config $HADOOP_CONF_DIR
+----
+
+    Start the YARN with the following command, run on the designated
+    ResourceManager:
+
+----
+  $ $YARN_HOME/bin/yarn start resourcemanager --config $HADOOP_CONF_DIR
+----
+
+    Run a script to start NodeManagers on all slaves:
+
+----
+  $ $YARN_HOME/bin/yarn start nodemanager --config $HADOOP_CONF_DIR
+----
+
+    Start the MapReduce JobHistory Server with the following command, run on
+    the designated server:
+
+----
+  $ $YARN_HOME/bin/yarn start historyserver --config $HADOOP_CONF_DIR
+----
+
+  * Hadoop Shutdown
+
+    Stop the NameNode with the following command, run on the designated
+    NameNode:
+
+----
+  $ $HADOOP_HDFS_HOME/bin/hdfs stop namenode --config $HADOOP_CONF_DIR
+----
+
+    Run a script to stop DataNodes on all slaves:
+
+----
+  $ $HADOOP_HDFS_HOME/bin/hdfs stop datanode --config $HADOOP_CONF_DIR
+----
+
+    Stop the ResourceManager with the following command, run on the
+    designated ResourceManager:
+
+----
+  $ $YARN_HOME/bin/yarn stop resourcemanager --config $HADOOP_CONF_DIR
+----
+
+    Run a script to stop NodeManagers on all slaves:
+
+----
+  $ $YARN_HOME/bin/yarn stop nodemanager --config $HADOOP_CONF_DIR
+----
+
+    Stop the MapReduce JobHistory Server with the following command, run on
+    the designated server:
+
+----
+  $ $YARN_HOME/bin/yarn stop historyserver --config $HADOOP_CONF_DIR
+----
+
+* {Running Hadoop in Secure Mode}
+
+  This section deals with important parameters to be specified to run Hadoop
+  in <Secure Mode> with strong, Kerberos-based authentication.
+
+  * Users and Groups
+
+    Ensure that HDFS and YARN daemons run as different Unix users, e.g.
+    <<<hdfs>>> and <<<yarn>>>. Also, ensure that the MapReduce JobHistory
+    server runs as user <<<mapred>>>.
+
+    It's recommended to have them share a Unix group, e.g. <<<hadoop>>>.
+
+*--------------------------------------+--------------------------------------+
+|| User:Group                          || Daemons                             |
+*--------------------------------------+--------------------------------------+
+| hdfs:hadoop | NameNode, Secondary NameNode, DataNode |
+*--------------------------------------+--------------------------------------+
+| yarn:hadoop | ResourceManager, NodeManager |
+*--------------------------------------+--------------------------------------+
+| mapred:hadoop | MapReduce JobHistory Server |
+*--------------------------------------+--------------------------------------+
+
+  * Permissions
+
+    The following table lists various paths on HDFS and local filesystems (on
+    all nodes) and recommended permissions:
+
+*-------------------+-------------------+------------------+------------------+
+|| Filesystem || Path || User:Group || Permissions |
+*-------------------+-------------------+------------------+------------------+
+| local | <<<dfs.namenode.name.dir>>> | hdfs:hadoop | drwx------ |
+*-------------------+-------------------+------------------+------------------+
+| local | <<<dfs.datanode.data.dir>>> | hdfs:hadoop | drwx------ |
+*-------------------+-------------------+------------------+------------------+
+| local | $HADOOP_LOG_DIR | hdfs:hadoop | drwxrwxr-x |
+*-------------------+-------------------+------------------+------------------+
+| local | $YARN_LOG_DIR | yarn:hadoop | drwxrwxr-x |
+*-------------------+-------------------+------------------+------------------+
+| local | <<<yarn.nodemanager.local-dirs>>> | yarn:hadoop | drwxr-xr-x |
+*-------------------+-------------------+------------------+------------------+
+| local | <<<yarn.nodemanager.log-dirs>>> | yarn:hadoop | drwxr-xr-x |
+*-------------------+-------------------+------------------+------------------+
+| local | container-executor | root:hadoop | --Sr-s--- |
+*-------------------+-------------------+------------------+------------------+
+| local | <<<conf/container-executor.cfg>>> | root:hadoop | r-------- |
+*-------------------+-------------------+------------------+------------------+
+| hdfs | / | hdfs:hadoop | drwxr-xr-x |
+*-------------------+-------------------+------------------+------------------+
+| hdfs | /tmp | hdfs:hadoop | drwxrwxrwxt |
+*-------------------+-------------------+------------------+------------------+
+| hdfs | /user | hdfs:hadoop | drwxr-xr-x |
+*-------------------+-------------------+------------------+------------------+
+| hdfs | <<<yarn.nodemanager.remote-app-log-dir>>> | yarn:hadoop | drwxrwxrwxt |
+*-------------------+-------------------+------------------+------------------+
+| hdfs | <<<mapreduce.jobhistory.intermediate-done-dir>>> | mapred:hadoop | |
+| | | | drwxrwxrwxt |
+*-------------------+-------------------+------------------+------------------+
+| hdfs | <<<mapreduce.jobhistory.done-dir>>> | mapred:hadoop | |
+| | | | drwxr-x--- |
+*-------------------+-------------------+------------------+------------------+
+
+  * Kerberos Keytab files
+
+    * HDFS
+
+      The NameNode keytab file, on the NameNode host, should look like the
+      following:
+
+----
+$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/nn.service.keytab
+Keytab name: FILE:/etc/security/keytab/nn.service.keytab
+KVNO Timestamp         Principal
+   4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
+----
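+
+      Keytabs like this are created on the KDC; with MIT Kerberos, for
+      example, the NameNode entries above could be generated roughly as
+      follows (realm, hostname and keytab path are placeholders):
+
+----
+kadmin: addprinc -randkey nn/full.qualified.domain.name@REALM.TLD
+kadmin: addprinc -randkey host/full.qualified.domain.name@REALM.TLD
+kadmin: ktadd -k /etc/security/keytab/nn.service.keytab nn/full.qualified.domain.name host/full.qualified.domain.name
+----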
+
+      The Secondary NameNode keytab file, on that host, should look like the
+      following:
+
+----
+$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/sn.service.keytab
+Keytab name: FILE:/etc/security/keytab/sn.service.keytab
+KVNO Timestamp         Principal
+   4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
+----
+
+      The DataNode keytab file, on each host, should look like the following:
+
+----
+$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/dn.service.keytab
+Keytab name: FILE:/etc/security/keytab/dn.service.keytab
+KVNO Timestamp         Principal
+   4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
+----
+
+    * YARN
+
+      The ResourceManager keytab file, on the ResourceManager host, should
+      look like the following:
+
+----
+$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/rm.service.keytab
+Keytab name: FILE:/etc/security/keytab/rm.service.keytab
+KVNO Timestamp         Principal
+   4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
+----
+
+      The NodeManager keytab file, on each host, should look like the
+      following:
+
+----
+$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/nm.service.keytab
+Keytab name: FILE:/etc/security/keytab/nm.service.keytab
+KVNO Timestamp         Principal
+   4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
+----
+
+    * MapReduce JobHistory Server
+
+      The MapReduce JobHistory Server keytab file, on that host, should look
+      like the following:
+
+----
+$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/jhs.service.keytab
+Keytab name: FILE:/etc/security/keytab/jhs.service.keytab
+KVNO Timestamp         Principal
+   4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
+   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
+----
+
+  * Configuration in Secure Mode
+
+    * <<<conf/core-site.xml>>>
+
+*-------------------------+-------------------------+------------------------+
+|| Parameter              || Value                  || Notes                 |
+*-------------------------+-------------------------+------------------------+
+| <<<hadoop.security.authentication>>> | <kerberos> | <simple> is non-secure. |
+*-------------------------+-------------------------+------------------------+
+| <<<hadoop.security.authorization>>> | <true> | |
+| | | Enable RPC service-level authorization.                                 |
+*-------------------------+-------------------------+------------------------+
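+
+    In <<<conf/core-site.xml>>> this amounts to the following sketch:
+
+----
+<configuration>
+  <!-- "simple" (the default) is non-secure; "kerberos" enables security -->
+  <property>
+    <name>hadoop.security.authentication</name>
+    <value>kerberos</value>
+  </property>
+  <!-- Enable RPC service-level authorization -->
+  <property>
+    <name>hadoop.security.authorization</name>
+    <value>true</value>
+  </property>
+</configuration>
+----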
| +*-------------------------+-------------------------+------------------------+ + + * Configurations for DataNode: + +*-------------------------+-------------------------+------------------------+ +|| Parameter || Value || Notes | +*-------------------------+-------------------------+------------------------+ +| <<>> | 700 | | +*-------------------------+-------------------------+------------------------+ +| <<>> | <0.0.0.0:2003> | | +*-------------------------+-------------------------+------------------------+ +| <<>> | <0.0.0.0:2005> | | +*-------------------------+-------------------------+------------------------+ +| <<>> | | | +| | | Kerberos keytab file for the DataNode. | +*-------------------------+-------------------------+------------------------+ +| <<>> | dn/_HOST@REALM.TLD | | +| | | Kerberos principal name for the DataNode. | +*-------------------------+-------------------------+------------------------+ +| <<>> | | | +| | host/_HOST@REALM.TLD | | +| | | HTTPS Kerberos principal name for the DataNode. | +*-------------------------+-------------------------+------------------------+ + + * <<>> + + * LinuxContainerExecutor + + A <<>> used by YARN framework which define how any + launched and controlled. + + The following are the available in Hadoop YARN: + +*--------------------------------------+--------------------------------------+ +|| ContainerExecutor || Description | +*--------------------------------------+--------------------------------------+ +| <<>> | | +| | The default executor which YARN uses to manage container execution. | +| | The container process has the same Unix user as the NodeManager. | +*--------------------------------------+--------------------------------------+ +| <<>> | | +| | Supported only on GNU/Linux, this executor runs the containers as the | +| | user who submitted the application. It requires all user accounts to be | +| | created on the cluster nodes where the containers are launched. It uses | +| | a executable that is included in the Hadoop distribution. | +| | The NodeManager uses this executable to launch and kill containers. | +| | The setuid executable switches to the user who has submitted the | +| | application and launches or kills the containers. For maximum security, | +| | this executor sets up restricted permissions and user/group ownership of | +| | local files and directories used by the containers such as the shared | +| | objects, jars, intermediate files, log files etc. Particularly note that, | +| | because of this, except the application owner and NodeManager, no other | +| | user can access any of the local files/directories including those | +| | localized as part of the distributed cache. | +*--------------------------------------+--------------------------------------+ + + To build the LinuxContainerExecutor executable run: + +---- + $ mvn package -Dcontainer-executor.conf.dir=/etc/hadoop/ +---- + + The path passed in <<<-Dcontainer-executor.conf.dir>>> should be the + path on the cluster nodes where a configuration file for the setuid + executable should be located. The executable should be installed in + $YARN_HOME/bin. + + The executable must have specific permissions: 6050 or --Sr-s--- + permissions user-owned by (super-user) and group-owned by a + special group (e.g. <<>>) of which the NodeManager Unix user is + the group member and no ordinary application user is. If any application + user belongs to this special group, security will be compromised. 
This + special group name should be specified for the configuration property + <<>> in both + <<>> and <<>>. + + For example, let's say that the NodeManager is run as user who is + part of the groups users and , any of them being the primary group. + Let also be that has both and another user + (application submitter) as its members, and does not + belong to . Going by the above description, the setuid/setgid + executable should be set 6050 or --Sr-s--- with user-owner as and + group-owner as which has as its member (and not + which has also as its member besides ). + + The LinuxTaskController requires that paths including and leading up to + the directories specified in <<>> and + <<>> to be set 755 permissions as described + above in the table on permissions on directories. + + * <<>> + + The executable requires a configuration file called + <<>> to be present in the configuration + directory passed to the mvn target mentioned above. + + The configuration file must be owned by the user running NodeManager + (user <<>> in the above example), group-owned by anyone and + should have the permissions 0400 or r--------. + + The executable requires following configuration items to be present + in the <<>> file. The items should be + mentioned as simple key=value pairs, one per-line: + +*-------------------------+-------------------------+------------------------+ +|| Parameter || Value || Notes | +*-------------------------+-------------------------+------------------------+ +| <<>> | | +| | Comma-separated list of NodeManager local directories. | | +| | | Paths to NodeManager local directories. Should be same as the value | +| | | which was provided to key in <<>>. This is | +| | | required to validate paths passed to the setuid executable in order | +| | to prevent arbitrary paths being passed to it. | +*-------------------------+-------------------------+------------------------+ +| <<>> | | | +| | | Unix group of the NodeManager. The group owner of the | +| | | binary should be this group. Should be same as the | +| | | value with which the NodeManager is configured. This configuration is | +| | | required for validating the secure access of the | +| | | binary. | +*-------------------------+-------------------------+------------------------+ +| <<>> | | +| | Comma-separated list of NodeManager log directories. | | +| | | Paths to NodeManager log directories. Should be same as the value | +| | | which was provided to key in <<>>. This is | +| | | required to set proper permissions on the log files so that they can | +| | | be written to by the user's containers and read by the NodeManager for | +| | | . | +*-------------------------+-------------------------+------------------------+ +| <<>> | hfds,yarn,mapred,bin | Banned users. | +*-------------------------+-------------------------+------------------------+ +| <<>> | 1000 | Prevent other super-users. 
| +*-------------------------+-------------------------+------------------------+ + + To re-cap, here are the local file-ssytem permissions required for the + various paths related to the <<>>: + +*-------------------+-------------------+------------------+------------------+ +|| Filesystem || Path || User:Group || Permissions | +*-------------------+-------------------+------------------+------------------+ +| local | container-executor | root:hadoop | --Sr-s--- | +*-------------------+-------------------+------------------+------------------+ +| local | <<>> | root:hadoop | r-------- | +*-------------------+-------------------+------------------+------------------+ +| local | <<>> | yarn:hadoop | drwxr-xr-x | +*-------------------+-------------------+------------------+------------------+ +| local | <<>> | yarn:hadoop | drwxr-xr-x | +*-------------------+-------------------+------------------+------------------+ + + * Configurations for ResourceManager: + +*-------------------------+-------------------------+------------------------+ +|| Parameter || Value || Notes | +*-------------------------+-------------------------+------------------------+ +| <<>> | | | +| | | | +| | | Kerberos keytab file for the ResourceManager. | +*-------------------------+-------------------------+------------------------+ +| <<>> | rm/_HOST@REALM.TLD | | +| | | Kerberos principal name for the ResourceManager. | +*-------------------------+-------------------------+------------------------+ + + * Configurations for NodeManager: + +*-------------------------+-------------------------+------------------------+ +|| Parameter || Value || Notes | +*-------------------------+-------------------------+------------------------+ +| <<>> | | | +| | | Kerberos keytab file for the NodeManager. | +*-------------------------+-------------------------+------------------------+ +| <<>> | nm/_HOST@REALM.TLD | | +| | | Kerberos principal name for the NodeManager. | +*-------------------------+-------------------------+------------------------+ +| <<>> | | | +| | <<>> | +| | | Use LinuxContainerExecutor. | +*-------------------------+-------------------------+------------------------+ +| <<>> | | | +| | | Unix group of the NodeManager. | +*-------------------------+-------------------------+------------------------+ + + * <<>> + + * Configurations for MapReduce JobHistory Server: + +*-------------------------+-------------------------+------------------------+ +|| Parameter || Value || Notes | +*-------------------------+-------------------------+------------------------+ +| <<>> | | | +| | MapReduce JobHistory Server | Default port is 10020. | +*-------------------------+-------------------------+------------------------+ +| <<>> | | +| | | | +| | | Kerberos keytab file for the MapReduce JobHistory Server. | +*-------------------------+-------------------------+------------------------+ +| <<>> | mapred/_HOST@REALM.TLD | | +| | | Kerberos principal name for the MapReduce JobHistory Server. | +*-------------------------+-------------------------+------------------------+ + + + * {Operating the Hadoop Cluster} + + Once all the necessary configuration is complete, distribute the files to the + <<>> directory on all the machines. + + This section also describes the various Unix users who should be starting the + various components and uses the same Unix accounts and groups used previously: + + * Hadoop Startup + + To start a Hadoop cluster you will need to start both the HDFS and YARN + cluster. 
+
+    * <<<conf/mapred-site.xml>>>
+
+      * Configurations for MapReduce JobHistory Server:
+
+*-------------------------+-------------------------+------------------------+
+|| Parameter              || Value                  || Notes                 |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.jobhistory.address>>> | | |
+| | MapReduce JobHistory Server <host:port> | Default port is 10020. |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.jobhistory.keytab>>> | | |
+| | </etc/security/keytab/jhs.service.keytab> | |
+| | | Kerberos keytab file for the MapReduce JobHistory Server.               |
+*-------------------------+-------------------------+------------------------+
+| <<<mapreduce.jobhistory.principal>>> | mapred/_HOST@REALM.TLD | |
+| | | Kerberos principal name for the MapReduce JobHistory Server.            |
+*-------------------------+-------------------------+------------------------+
+
+  * {Operating the Hadoop Cluster}
+
+    Once all the necessary configuration is complete, distribute the files to
+    the <<<HADOOP_CONF_DIR>>> directory on all the machines.
+
+    This section also describes the various Unix users who should be starting
+    the various components and uses the same Unix accounts and groups used
+    previously:
+
+  * Hadoop Startup
+
+    To start a Hadoop cluster you will need to start both the HDFS and YARN
+    cluster.
+
+    Format a new distributed filesystem as <hdfs>:
+
+----
+[hdfs]$ $HADOOP_HDFS_HOME/bin/hdfs namenode -format
+----
+
+    Start the HDFS with the following command, run on the designated NameNode
+    as <hdfs>:
+
+----
+[hdfs]$ $HADOOP_HDFS_HOME/bin/hdfs start namenode --config $HADOOP_CONF_DIR
+----
+
+    Run a script to start DataNodes on all slaves as <root> with a special
+    environment variable <<<HADOOP_SECURE_DN_USER>>> set to <hdfs>:
+
+----
+[root]$ HADOOP_SECURE_DN_USER=hdfs $HADOOP_HDFS_HOME/bin/hdfs start datanode --config $HADOOP_CONF_DIR
+----
+
+    Start the YARN with the following command, run on the designated
+    ResourceManager as <yarn>:
+
+----
+[yarn]$ $YARN_HOME/bin/yarn start resourcemanager --config $HADOOP_CONF_DIR
+----
+
+    Run a script to start NodeManagers on all slaves as <yarn>:
+
+----
+[yarn]$ $YARN_HOME/bin/yarn start nodemanager --config $HADOOP_CONF_DIR
+----
+
+    Start the MapReduce JobHistory Server with the following command, run on
+    the designated server as <mapred>:
+
+----
+[mapred]$ $YARN_HOME/bin/yarn start historyserver --config $HADOOP_CONF_DIR
+----
+
+  * Hadoop Shutdown
+
+    Stop the NameNode with the following command, run on the designated
+    NameNode as <hdfs>:
+
+----
+[hdfs]$ $HADOOP_HDFS_HOME/bin/hdfs stop namenode --config $HADOOP_CONF_DIR
+----
+
+    Run a script to stop DataNodes on all slaves as <root>:
+
+----
+[root]$ $HADOOP_HDFS_HOME/bin/hdfs stop datanode --config $HADOOP_CONF_DIR
+----
+
+    Stop the ResourceManager with the following command, run on the
+    designated ResourceManager as <yarn>:
+
+----
+[yarn]$ $YARN_HOME/bin/yarn stop resourcemanager --config $HADOOP_CONF_DIR
+----
+
+    Run a script to stop NodeManagers on all slaves as <yarn>:
+
+----
+[yarn]$ $YARN_HOME/bin/yarn stop nodemanager --config $HADOOP_CONF_DIR
+----
+
+    Stop the MapReduce JobHistory Server with the following command, run on
+    the designated server as <mapred>:
+
+----
+[mapred]$ $YARN_HOME/bin/yarn stop historyserver --config $HADOOP_CONF_DIR
+----
+
+* {Web Interfaces}
+
+  Once the Hadoop cluster is up and running check the web-ui of the
+  components as described below:
+
+*-------------------------+-------------------------+------------------------+
+|| Daemon                 || Web Interface          || Notes                 |
+*-------------------------+-------------------------+------------------------+
+| NameNode | http://<nn_host:port>/ | Default HTTP port is 50070. |
+*-------------------------+-------------------------+------------------------+
+| ResourceManager | http://<rm_host:port>/ | Default HTTP port is 8088. |
+*-------------------------+-------------------------+------------------------+
+| MapReduce JobHistory Server | http://<jhs_host:port>/ | |
+| | | Default HTTP port is 19888. |
+*-------------------------+-------------------------+------------------------+
\ No newline at end of file