HADOOP-10908. Common needs updates for shell rewrite (aw)
parent 41d72cbd48
commit 94d342e607
|
@ -344,6 +344,8 @@ Trunk (Unreleased)
|
||||||
|
|
||||||
HADOOP-11397. Can't override HADOOP_IDENT_STRING (Kengo Seki via aw)
|
HADOOP-11397. Can't override HADOOP_IDENT_STRING (Kengo Seki via aw)
|
||||||
|
|
||||||
|
HADOOP-10908. Common needs updates for shell rewrite (aw)
|
||||||
|
|
||||||
OPTIMIZATIONS
|
OPTIMIZATIONS
|
||||||
|
|
||||||
HADOOP-7761. Improve the performance of raw comparisons. (todd)
|
HADOOP-7761. Improve the performance of raw comparisons. (todd)
|
||||||
|
|
|
@ -11,83 +11,81 @@
|
||||||
~~ limitations under the License. See accompanying LICENSE file.
|
~~ limitations under the License. See accompanying LICENSE file.
|
||||||
|
|
||||||
---
|
---
|
||||||
Hadoop Map Reduce Next Generation-${project.version} - Cluster Setup
|
Hadoop ${project.version} - Cluster Setup
|
||||||
---
|
---
|
||||||
---
|
---
|
||||||
${maven.build.timestamp}
|
${maven.build.timestamp}
|
||||||
|
|
||||||
%{toc|section=1|fromDepth=0}
|
%{toc|section=1|fromDepth=0}
|
||||||
|
|
||||||
Hadoop MapReduce Next Generation - Cluster Setup
|
Hadoop Cluster Setup
|
||||||
|
|
||||||
* {Purpose}
|
* {Purpose}
|
||||||
|
|
||||||
This document describes how to install, configure and manage non-trivial
|
This document describes how to install and configure
|
||||||
Hadoop clusters ranging from a few nodes to extremely large clusters
|
Hadoop clusters ranging from a few nodes to extremely large clusters
|
||||||
with thousands of nodes.
|
with thousands of nodes. To play with Hadoop, you may first want to
|
||||||
|
install it on a single machine (see {{{./SingleCluster.html}Single Node Setup}}).
|
||||||
|
|
||||||
To play with Hadoop, you may first want to install it on a single
|
This document does not cover advanced topics such as {{{./SecureMode.html}Security}} or
|
||||||
machine (see {{{./SingleCluster.html}Single Node Setup}}).
|
High Availability.
|
||||||
|
|
||||||
* {Prerequisites}
|
* {Prerequisites}
|
||||||
|
|
||||||
Download a stable version of Hadoop from Apache mirrors.
|
* Install Java. See the {{{http://wiki.apache.org/hadoop/HadoopJavaVersions}Hadoop Wiki}} for known good versions.
|
||||||
|
* Download a stable version of Hadoop from Apache mirrors.
|
||||||
|
|
||||||
* {Installation}
|
* {Installation}
|
||||||
|
|
||||||
Installing a Hadoop cluster typically involves unpacking the software on all
|
Installing a Hadoop cluster typically involves unpacking the software on all
|
||||||
the machines in the cluster or installing RPMs.
|
the machines in the cluster or installing it via a packaging system as
|
||||||
|
appropriate for your operating system. It is important to divide up the hardware
|
||||||
|
into functions.
|
||||||
|
|
||||||
Typically one machine in the cluster is designated as the NameNode and
|
Typically one machine in the cluster is designated as the NameNode and
|
||||||
another machine as the ResourceManager, exclusively. These are the masters.
|
another machine as the ResourceManager, exclusively. These are the masters. Other
|
||||||
|
services (such as Web App Proxy Server and MapReduce Job History server) are usually
|
||||||
|
run either on dedicated hardware or on shared infrastructure, depending upon the load.
|
||||||
|
|
||||||
The rest of the machines in the cluster act as both DataNode and NodeManager.
|
The rest of the machines in the cluster act as both DataNode and NodeManager.
|
||||||
These are the slaves.
|
These are the slaves.
|
||||||
|
|
||||||
* {Running Hadoop in Non-Secure Mode}
|
* {Configuring Hadoop in Non-Secure Mode}
|
||||||
|
|
||||||
The following sections describe how to configure a Hadoop cluster.
|
Hadoop's Java configuration is driven by two types of important configuration files:
|
||||||
|
|
||||||
{Configuration Files}
|
|
||||||
|
|
||||||
Hadoop configuration is driven by two types of important configuration files:
|
|
||||||
|
|
||||||
* Read-only default configuration - <<<core-default.xml>>>,
|
* Read-only default configuration - <<<core-default.xml>>>,
|
||||||
<<<hdfs-default.xml>>>, <<<yarn-default.xml>>> and
|
<<<hdfs-default.xml>>>, <<<yarn-default.xml>>> and
|
||||||
<<<mapred-default.xml>>>.
|
<<<mapred-default.xml>>>.
|
||||||
|
|
||||||
* Site-specific configuration - <<conf/core-site.xml>>,
|
* Site-specific configuration - <<<etc/hadoop/core-site.xml>>>,
|
||||||
<<conf/hdfs-site.xml>>, <<conf/yarn-site.xml>> and
|
<<<etc/hadoop/hdfs-site.xml>>>, <<<etc/hadoop/yarn-site.xml>>> and
|
||||||
<<conf/mapred-site.xml>>.
|
<<<etc/hadoop/mapred-site.xml>>>.
|
||||||
|
|
||||||
|
|
||||||
Additionally, you can control the Hadoop scripts found in the bin/
|
Additionally, you can control the Hadoop scripts found in the bin/
|
||||||
directory of the distribution, by setting site-specific values via the
|
directory of the distribution, by setting site-specific values via the
|
||||||
<<conf/hadoop-env.sh>> and <<yarn-env.sh>>.
|
<<<etc/hadoop/hadoop-env.sh>>> and <<<etc/hadoop/yarn-env.sh>>>.
|
||||||
|
|
||||||
{Site Configuration}
|
|
||||||
|
|
||||||
To configure the Hadoop cluster you will need to configure the
|
To configure the Hadoop cluster you will need to configure the
|
||||||
<<<environment>>> in which the Hadoop daemons execute as well as the
|
<<<environment>>> in which the Hadoop daemons execute as well as the
|
||||||
<<<configuration parameters>>> for the Hadoop daemons.
|
<<<configuration parameters>>> for the Hadoop daemons.
|
||||||
|
|
||||||
The Hadoop daemons are NameNode/DataNode and ResourceManager/NodeManager.
|
HDFS daemons are NameNode, SecondaryNameNode, and DataNode. YARN daemons
|
||||||
|
are ResourceManager, NodeManager, and WebAppProxy. If MapReduce is to be
|
||||||
|
used, then the MapReduce Job History Server will also be running. For
|
||||||
|
large installations, these are generally running on separate hosts.
|
||||||
|
|
||||||
|
|
||||||
** {Configuring Environment of Hadoop Daemons}
|
** {Configuring Environment of Hadoop Daemons}
|
||||||
|
|
||||||
Administrators should use the <<conf/hadoop-env.sh>> and
|
Administrators should use the <<<etc/hadoop/hadoop-env.sh>>> and optionally the
|
||||||
<<conf/yarn-env.sh>> script to do site-specific customization of the
|
<<<etc/hadoop/mapred-env.sh>>> and <<<etc/hadoop/yarn-env.sh>>> scripts to do
|
||||||
Hadoop daemons' process environment.
|
site-specific customization of the Hadoop daemons' process environment.
|
||||||
|
|
||||||
At the very least you should specify the <<<JAVA_HOME>>> so that it is
|
At the very least, you must specify the <<<JAVA_HOME>>> so that it is
|
||||||
correctly defined on each remote node.
|
correctly defined on each remote node.
|
||||||
|
|
||||||
In most cases you should also specify <<<HADOOP_PID_DIR>>> and
|
|
||||||
<<<HADOOP_SECURE_DN_PID_DIR>>> to point to directories that can only be
|
|
||||||
written to by the users that are going to run the hadoop daemons.
|
|
||||||
Otherwise there is the potential for a symlink attack.
|
|
||||||
|
|
||||||
Administrators can configure individual daemons using the configuration
|
Administrators can configure individual daemons using the configuration
|
||||||
options shown below in the table:
|
options shown below in the table:
|
||||||
|
|
||||||
|
@ -114,20 +112,42 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
statement should be added to hadoop-env.sh:
|
statement should be added to hadoop-env.sh:
|
||||||
|
|
||||||
----
|
----
|
||||||
export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC ${HADOOP_NAMENODE_OPTS}"
|
export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC"
|
||||||
----
|
----
|
||||||
|
|
||||||
|
See <<<etc/hadoop/hadoop-env.sh>>> for other examples.
|
||||||
|
|
||||||
Other useful configuration parameters that you can customize include:
|
Other useful configuration parameters that you can customize include:
|
||||||
|
|
||||||
* <<<HADOOP_LOG_DIR>>> / <<<YARN_LOG_DIR>>> - The directory where the
|
* <<<HADOOP_PID_DIR>>> - The directory where the
|
||||||
daemons' log files are stored. They are automatically created if they
|
daemons' process id files are stored.
|
||||||
don't exist.
|
|
||||||
|
|
||||||
* <<<HADOOP_HEAPSIZE>>> / <<<YARN_HEAPSIZE>>> - The maximum amount of
|
* <<<HADOOP_LOG_DIR>>> - The directory where the
|
||||||
heapsize to use, in MB e.g. if the varibale is set to 1000 the heap
|
daemons' log files are stored. Log files are automatically created
|
||||||
will be set to 1000MB. This is used to configure the heap
|
if they don't exist.
|
||||||
size for the daemon. By default, the value is 1000. If you want to
|
|
||||||
configure the values separately for each deamon you can use.
|
* <<<HADOOP_HEAPSIZE_MAX>>> - The maximum amount of
|
||||||
|
memory to use for the Java heapsize. Units supported by the JVM
|
||||||
|
are also supported here. If no unit is present, it will be assumed
|
||||||
|
the number is in megabytes. By default, Hadoop will let the JVM
|
||||||
|
determine how much to use. This value can be overridden on
|
||||||
|
a per-daemon basis using the appropriate <<<_OPTS>>> variable listed above.
|
||||||
|
For example, setting <<<HADOOP_HEAPSIZE_MAX=1g>>> and
|
||||||
|
<<<HADOOP_NAMENODE_OPTS="-Xmx5g">>> will configure the NameNode with 5GB heap.
|
||||||
|
|
||||||
|
In most cases, you should specify the <<<HADOOP_PID_DIR>>> and
|
||||||
|
<<<HADOOP_LOG_DIR>>> directories such that they can only be
|
||||||
|
written to by the users that are going to run the hadoop daemons.
|
||||||
|
Otherwise there is the potential for a symlink attack.
|
||||||
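As a minimal illustration (the paths and heap size below are examples, not
defaults), such settings could be placed in <<<etc/hadoop/hadoop-env.sh>>>:

----
# illustrative values only; choose paths owned by the accounts running the daemons
export JAVA_HOME=/usr/java/latest
export HADOOP_HEAPSIZE_MAX=1g
export HADOOP_PID_DIR=/var/run/hadoop
export HADOOP_LOG_DIR=/var/log/hadoop
----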
|
|
||||||
|
It is also traditional to configure <<<HADOOP_PREFIX>>> in the system-wide
|
||||||
|
shell environment configuration. For example, a simple script inside
|
||||||
|
<<</etc/profile.d>>>:
|
||||||
|
|
||||||
|
---
|
||||||
|
HADOOP_PREFIX=/path/to/hadoop
|
||||||
|
export HADOOP_PREFIX
|
||||||
|
---
|
||||||
|
|
||||||
*--------------------------------------+--------------------------------------+
|
*--------------------------------------+--------------------------------------+
|
||||||
|| Daemon || Environment Variable |
|
|| Daemon || Environment Variable |
|
||||||
|
@ -141,12 +161,12 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
| Map Reduce Job History Server | HADOOP_JOB_HISTORYSERVER_HEAPSIZE |
|
| Map Reduce Job History Server | HADOOP_JOB_HISTORYSERVER_HEAPSIZE |
|
||||||
*--------------------------------------+--------------------------------------+
|
*--------------------------------------+--------------------------------------+
|
||||||
|
|
||||||
** {Configuring the Hadoop Daemons in Non-Secure Mode}
|
** {Configuring the Hadoop Daemons}
|
||||||
|
|
||||||
This section deals with important parameters to be specified in
|
This section deals with important parameters to be specified in
|
||||||
the given configuration files:
|
the given configuration files:
|
||||||
|
|
||||||
* <<<conf/core-site.xml>>>
|
* <<<etc/hadoop/core-site.xml>>>
|
||||||
|
|
||||||
*-------------------------+-------------------------+------------------------+
|
*-------------------------+-------------------------+------------------------+
|
||||||
|| Parameter || Value || Notes |
|
|| Parameter || Value || Notes |
|
||||||
|
@ -157,7 +177,7 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
| | | Size of read/write buffer used in SequenceFiles. |
|
| | | Size of read/write buffer used in SequenceFiles. |
|
||||||
*-------------------------+-------------------------+------------------------+
|
*-------------------------+-------------------------+------------------------+
|
||||||
|
|
||||||
* <<<conf/hdfs-site.xml>>>
|
* <<<etc/hadoop/hdfs-site.xml>>>
|
||||||
|
|
||||||
* Configurations for NameNode:
|
* Configurations for NameNode:
|
||||||
|
|
||||||
|
@ -195,7 +215,7 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
| | | stored in all named directories, typically on different devices. |
|
| | | stored in all named directories, typically on different devices. |
|
||||||
*-------------------------+-------------------------+------------------------+
|
*-------------------------+-------------------------+------------------------+
|
||||||
|
|
||||||
* <<<conf/yarn-site.xml>>>
|
* <<<etc/hadoop/yarn-site.xml>>>
|
||||||
|
|
||||||
* Configurations for ResourceManager and NodeManager:
|
* Configurations for ResourceManager and NodeManager:
|
||||||
|
|
||||||
|
@ -341,9 +361,7 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
| | | Be careful, set this too small and you will spam the name node. |
|
| | | Be careful, set this too small and you will spam the name node. |
|
||||||
*-------------------------+-------------------------+------------------------+
|
*-------------------------+-------------------------+------------------------+
|
||||||
|
|
||||||
|
* <<<etc/hadoop/mapred-site.xml>>>
|
||||||
|
|
||||||
* <<<conf/mapred-site.xml>>>
|
|
||||||
|
|
||||||
* Configurations for MapReduce Applications:
|
* Configurations for MapReduce Applications:
|
||||||
|
|
||||||
|
@ -395,22 +413,6 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
| | | Directory where history files are managed by the MR JobHistory Server. |
|
| | | Directory where history files are managed by the MR JobHistory Server. |
|
||||||
*-------------------------+-------------------------+------------------------+
|
*-------------------------+-------------------------+------------------------+
|
||||||
|
|
||||||
* {Hadoop Rack Awareness}
|
|
||||||
|
|
||||||
The HDFS and the YARN components are rack-aware.
|
|
||||||
|
|
||||||
The NameNode and the ResourceManager obtains the rack information of the
|
|
||||||
slaves in the cluster by invoking an API <resolve> in an administrator
|
|
||||||
configured module.
|
|
||||||
|
|
||||||
The API resolves the DNS name (also IP address) to a rack id.
|
|
||||||
|
|
||||||
The site-specific module to use can be configured using the configuration
|
|
||||||
item <<<topology.node.switch.mapping.impl>>>. The default implementation
|
|
||||||
of the same runs a script/command configured using
|
|
||||||
<<<topology.script.file.name>>>. If <<<topology.script.file.name>>> is
|
|
||||||
not set, the rack id </default-rack> is returned for any passed IP address.
|
|
||||||
|
|
||||||
* {Monitoring Health of NodeManagers}
|
* {Monitoring Health of NodeManagers}
|
||||||
|
|
||||||
Hadoop provides a mechanism by which administrators can configure the
|
Hadoop provides a mechanism by which administrators can configure the
|
||||||
|
@ -433,7 +435,7 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
node was healthy is also displayed on the web interface.
|
node was healthy is also displayed on the web interface.
|
||||||
|
|
||||||
The following parameters can be used to control the node health
|
The following parameters can be used to control the node health
|
||||||
monitoring script in <<<conf/yarn-site.xml>>>.
|
monitoring script in <<<etc/hadoop/yarn-site.xml>>>.
|
||||||
|
|
||||||
*-------------------------+-------------------------+------------------------+
|
*-------------------------+-------------------------+------------------------+
|
||||||
|| Parameter || Value || Notes |
|
|| Parameter || Value || Notes |
|
||||||
|
@ -465,165 +467,87 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
disk is either raided or a failure in the boot disk is identified by the
|
disk is either raided or a failure in the boot disk is identified by the
|
||||||
health checker script.
|
health checker script.
|
||||||
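As a purely illustrative sketch (not shipped with Hadoop), a health script that
reports the node unhealthy when a local data directory is missing could look like:

----
#!/usr/bin/env bash
# hypothetical health script: the NodeManager marks the node unhealthy
# if any output line begins with the string ERROR
if [[ ! -d /hadoop/yarn/local ]]; then
  echo "ERROR: local dir /hadoop/yarn/local is missing"
fi
----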
|
|
||||||
* {Slaves file}
|
* {Slaves File}
|
||||||
|
|
||||||
Typically you choose one machine in the cluster to act as the NameNode and
|
List all slave hostnames or IP addresses in your <<<etc/hadoop/slaves>>>
|
||||||
one machine as to act as the ResourceManager, exclusively. The rest of the
|
file, one per line. Helper scripts (described below) will use the
|
||||||
machines act as both a DataNode and NodeManager and are referred to as
|
<<<etc/hadoop/slaves>>> file to run commands on many hosts at once. It is not
|
||||||
<slaves>.
|
used for any of the Java-based Hadoop configuration. In order
|
||||||
|
to use this functionality, ssh trusts (via either passphraseless ssh or
|
||||||
|
some other means, such as Kerberos) must be established for the accounts
|
||||||
|
used to run Hadoop.
|
||||||
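For illustration, a minimal <<<etc/hadoop/slaves>>> file simply lists one host
per line (the hostnames below are made up):

----
host01.example.com
host02.example.com
host03.example.com
----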
|
|
||||||
List all slave hostnames or IP addresses in your <<<conf/slaves>>> file,
|
* {Hadoop Rack Awareness}
|
||||||
one per line.
|
|
||||||
|
Many Hadoop components are rack-aware and take advantage of the
|
||||||
|
network topology for performance and safety. Hadoop daemons obtain the
|
||||||
|
rack information of the slaves in the cluster by invoking an administrator
|
||||||
|
configured module. See the {{{./RackAwareness.html}Rack Awareness}}
|
||||||
|
documentation for more specific information.
|
||||||
|
|
||||||
|
It is highly recommended that you configure rack awareness prior to starting HDFS.
|
||||||
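As a purely illustrative sketch (not part of the Hadoop distribution), a trivial
topology script that places every host in a single rack could look like:

----
#!/usr/bin/env bash
# hypothetical topology script: print one rack name per host/IP argument
for node in "$@"; do
  echo "/default-rack"
done
----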
|
|
||||||
* {Logging}
|
* {Logging}
|
||||||
|
|
||||||
Hadoop uses the Apache log4j via the Apache Commons Logging framework for
|
Hadoop uses {{{http://logging.apache.org/log4j/2.x/}Apache log4j}} via the Apache Commons Logging framework for
|
||||||
logging. Edit the <<<conf/log4j.properties>>> file to customize the
|
logging. Edit the <<<etc/hadoop/log4j.properties>>> file to customize the
|
||||||
Hadoop daemons' logging configuration (log-formats and so on).
|
Hadoop daemons' logging configuration (log-formats and so on).
|
||||||
|
|
||||||
* {Operating the Hadoop Cluster}
|
* {Operating the Hadoop Cluster}
|
||||||
|
|
||||||
Once all the necessary configuration is complete, distribute the files to the
|
Once all the necessary configuration is complete, distribute the files to the
|
||||||
<<<HADOOP_CONF_DIR>>> directory on all the machines.
|
<<<HADOOP_CONF_DIR>>> directory on all the machines. This should be the
|
||||||
|
same directory on all machines.
|
||||||
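One illustrative way to do this, assuming rsync is available and the
<<<etc/hadoop/slaves>>> file described above is in place, is:

----
$ for host in $(cat etc/hadoop/slaves); do rsync -a etc/hadoop/ "${host}:${HADOOP_CONF_DIR}/"; done
----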
|
|
||||||
|
In general, it is recommended that HDFS and YARN run as separate users.
|
||||||
|
In the majority of installations, HDFS processes execute as 'hdfs'. YARN
|
||||||
|
typically uses the 'yarn' account.
|
||||||
|
|
||||||
** Hadoop Startup
|
** Hadoop Startup
|
||||||
|
|
||||||
To start a Hadoop cluster you will need to start both the HDFS and YARN
|
To start a Hadoop cluster you will need to start both the HDFS and YARN
|
||||||
cluster.
|
cluster.
|
||||||
|
|
||||||
Format a new distributed filesystem:
|
The first time you bring up HDFS, it must be formatted. Format a new
|
||||||
|
distributed filesystem as <hdfs>:
|
||||||
----
|
|
||||||
$ $HADOOP_PREFIX/bin/hdfs namenode -format <cluster_name>
|
|
||||||
----
|
|
||||||
|
|
||||||
Start the HDFS with the following command, run on the designated NameNode:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode
|
|
||||||
----
|
|
||||||
|
|
||||||
Run a script to start DataNodes on all slaves:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
|
|
||||||
----
|
|
||||||
|
|
||||||
Start the YARN with the following command, run on the designated
|
|
||||||
ResourceManager:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
|
|
||||||
----
|
|
||||||
|
|
||||||
Run a script to start NodeManagers on all slaves:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
|
|
||||||
----
|
|
||||||
|
|
||||||
Start a standalone WebAppProxy server. If multiple servers
|
|
||||||
are used with load balancing it should be run on each of them:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh start proxyserver --config $HADOOP_CONF_DIR
|
|
||||||
----
|
|
||||||
|
|
||||||
Start the MapReduce JobHistory Server with the following command, run on the
|
|
||||||
designated server:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR
|
|
||||||
----
|
|
||||||
|
|
||||||
** Hadoop Shutdown
|
|
||||||
|
|
||||||
Stop the NameNode with the following command, run on the designated
|
|
||||||
NameNode:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop namenode
|
|
||||||
----
|
|
||||||
|
|
||||||
Run a script to stop DataNodes on all slaves:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop datanode
|
|
||||||
----
|
|
||||||
|
|
||||||
Stop the ResourceManager with the following command, run on the designated
|
|
||||||
ResourceManager:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop resourcemanager
|
|
||||||
----
|
|
||||||
|
|
||||||
Run a script to stop NodeManagers on all slaves:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop nodemanager
|
|
||||||
----
|
|
||||||
|
|
||||||
Stop the WebAppProxy server. If multiple servers are used with load
|
|
||||||
balancing it should be run on each of them:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh stop proxyserver --config $HADOOP_CONF_DIR
|
|
||||||
----
|
|
||||||
|
|
||||||
|
|
||||||
Stop the MapReduce JobHistory Server with the following command, run on the
|
|
||||||
designated server:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOOP_CONF_DIR
|
|
||||||
----
|
|
||||||
|
|
||||||
|
|
||||||
* {Operating the Hadoop Cluster}
|
|
||||||
|
|
||||||
Once all the necessary configuration is complete, distribute the files to the
|
|
||||||
<<<HADOOP_CONF_DIR>>> directory on all the machines.
|
|
||||||
|
|
||||||
This section also describes the various Unix users who should be starting the
|
|
||||||
various components and uses the same Unix accounts and groups used previously:
|
|
||||||
|
|
||||||
** Hadoop Startup
|
|
||||||
|
|
||||||
To start a Hadoop cluster you will need to start both the HDFS and YARN
|
|
||||||
cluster.
|
|
||||||
|
|
||||||
Format a new distributed filesystem as <hdfs>:
|
|
||||||
|
|
||||||
----
|
----
|
||||||
[hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -format <cluster_name>
|
[hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -format <cluster_name>
|
||||||
----
|
----
|
||||||
|
|
||||||
Start the HDFS with the following command, run on the designated NameNode
|
Start the HDFS NameNode with the following command on the
|
||||||
as <hdfs>:
|
designated node as <hdfs>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[hdfs]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode
|
[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon start namenode
|
||||||
----
|
----
|
||||||
|
|
||||||
Run a script to start DataNodes on all slaves as <root> with a special
|
Start an HDFS DataNode with the following command on each
|
||||||
environment variable <<<HADOOP_SECURE_DN_USER>>> set to <hdfs>:
|
designated node as <hdfs>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[root]$ HADOOP_SECURE_DN_USER=hdfs $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
|
[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon start datanode
|
||||||
|
----
|
||||||
|
|
||||||
|
If <<<etc/hadoop/slaves>>> and ssh trusted access are configured
|
||||||
|
(see {{{./SingleCluster.html}Single Node Setup}}), all of the
|
||||||
|
HDFS processes can be started with a utility script. As <hdfs>:
|
||||||
|
|
||||||
|
----
|
||||||
|
[hdfs]$ $HADOOP_PREFIX/sbin/start-dfs.sh
|
||||||
----
|
----
|
||||||
|
|
||||||
Start the YARN with the following command, run on the designated
|
Start the YARN with the following command, run on the designated
|
||||||
ResourceManager as <yarn>:
|
ResourceManager as <yarn>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
|
[yarn]$ $HADOOP_PREFIX/bin/yarn --daemon start resourcemanager
|
||||||
----
|
----
|
||||||
|
|
||||||
Run a script to start NodeManagers on all slaves as <yarn>:
|
Run a script to start a NodeManager on each designated host as <yarn>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
|
[yarn]$ $HADOOP_PREFIX/bin/yarn --daemon start nodemanager
|
||||||
----
|
----
|
||||||
|
|
||||||
Start a standalone WebAppProxy server. Run on the WebAppProxy
|
Start a standalone WebAppProxy server. Run on the WebAppProxy
|
||||||
|
@ -631,14 +555,22 @@ $ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOO
|
||||||
it should be run on each of them:
|
it should be run on each of them:
|
||||||
|
|
||||||
----
|
----
|
||||||
[yarn]$ $HADOOP_YARN_HOME/bin/yarn start proxyserver --config $HADOOP_CONF_DIR
|
[yarn]$ $HADOOP_PREFIX/bin/yarn --daemon start proxyserver
|
||||||
----
|
----
|
||||||
|
|
||||||
Start the MapReduce JobHistory Server with the following command, run on the
|
If <<<etc/hadoop/slaves>>> and ssh trusted access are configured
|
||||||
designated server as <mapred>:
|
(see {{{./SingleCluster.html}Single Node Setup}}), all of the
|
||||||
|
YARN processes can be started with a utility script. As <yarn>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[mapred]$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR
|
[yarn]$ $HADOOP_PREFIX/sbin/start-yarn.sh
|
||||||
|
----
|
||||||
|
|
||||||
|
Start the MapReduce JobHistory Server with the following command, run
|
||||||
|
on the designated server as <mapred>:
|
||||||
|
|
||||||
|
----
|
||||||
|
[mapred]$ $HADOOP_PREFIX/bin/mapred --daemon start historyserver
|
||||||
----
|
----
|
||||||
|
|
||||||
** Hadoop Shutdown
|
** Hadoop Shutdown
|
||||||
|
@ -647,26 +579,42 @@ $ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOO
|
||||||
as <hdfs>:
|
as <hdfs>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[hdfs]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop namenode
|
[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon stop namenode
|
||||||
----
|
----
|
||||||
|
|
||||||
Run a script to stop DataNodes on all slaves as <root>:
|
Run a script to stop a DataNode as <hdfs>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[root]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop datanode
|
[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon stop datanode
|
||||||
|
----
|
||||||
|
|
||||||
|
If <<<etc/hadoop/slaves>>> and ssh trusted access are configured
|
||||||
|
(see {{{./SingleCluster.html}Single Node Setup}}), all of the
|
||||||
|
HDFS processes may be stopped with a utility script. As <hdfs>:
|
||||||
|
|
||||||
|
----
|
||||||
|
[hdfs]$ $HADOOP_PREFIX/sbin/stop-dfs.sh
|
||||||
----
|
----
|
||||||
|
|
||||||
Stop the ResourceManager with the following command, run on the designated
|
Stop the ResourceManager with the following command, run on the designated
|
||||||
ResourceManager as <yarn>:
|
ResourceManager as <yarn>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop resourcemanager
|
[yarn]$ $HADOOP_PREFIX/bin/yarn --daemon stop resourcemanager
|
||||||
----
|
----
|
||||||
|
|
||||||
Run a script to stop NodeManagers on all slaves as <yarn>:
|
Run a script to stop a NodeManager on a slave as <yarn>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop nodemanager
|
[yarn]$ $HADOOP_PREFIX/bin/yarn --daemon stop nodemanager
|
||||||
|
----
|
||||||
|
|
||||||
|
If <<<etc/hadoop/slaves>>> and ssh trusted access are configured
|
||||||
|
(see {{{./SingleCluster.html}Single Node Setup}}), all of the
|
||||||
|
YARN processes can be stopped with a utility script. As <yarn>:
|
||||||
|
|
||||||
|
----
|
||||||
|
[yarn]$ $HADOOP_PREFIX/sbin/stop-yarn.sh
|
||||||
----
|
----
|
||||||
|
|
||||||
Stop the WebAppProxy server. Run on the WebAppProxy server as
|
Stop the WebAppProxy server. Run on the WebAppProxy server as
|
||||||
|
@ -674,14 +622,14 @@ $ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOO
|
||||||
should be run on each of them:
|
should be run on each of them:
|
||||||
|
|
||||||
----
|
----
|
||||||
[yarn]$ $HADOOP_YARN_HOME/bin/yarn stop proxyserver --config $HADOOP_CONF_DIR
|
[yarn]$ $HADOOP_PREFIX/bin/yarn --daemon stop proxyserver
|
||||||
----
|
----
|
||||||
|
|
||||||
Stop the MapReduce JobHistory Server with the following command, run on the
|
Stop the MapReduce JobHistory Server with the following command, run on the
|
||||||
designated server as <mapred>:
|
designated server as <mapred>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[mapred]$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOOP_CONF_DIR
|
[mapred]$ $HADOOP_PREFIX/bin/mapred --daemon stop historyserver
|
||||||
----
|
----
|
||||||
|
|
||||||
* {Web Interfaces}
|
* {Web Interfaces}
|
||||||
|
|
|
@ -21,102 +21,161 @@
|
||||||
|
|
||||||
%{toc}
|
%{toc}
|
||||||
|
|
||||||
Overview
|
Hadoop Commands Guide
|
||||||
|
|
||||||
All hadoop commands are invoked by the <<<bin/hadoop>>> script. Running the
|
* Overview
|
||||||
hadoop script without any arguments prints the description for all
|
|
||||||
commands.
|
|
||||||
|
|
||||||
Usage: <<<hadoop [--config confdir] [--loglevel loglevel] [COMMAND]
|
All of the Hadoop commands and subprojects follow the same basic structure:
|
||||||
[GENERIC_OPTIONS] [COMMAND_OPTIONS]>>>
|
|
||||||
|
|
||||||
Hadoop has an option parsing framework that employs parsing generic
|
Usage: <<<shellcommand [SHELL_OPTIONS] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]>>>
|
||||||
options as well as running classes.
|
|
||||||
|
*--------+---------+
|
||||||
|
|| FIELD || Description
|
||||||
|
*-----------------------+---------------+
|
||||||
|
| shellcommand | The command of the project being invoked. For example,
|
||||||
|
| Hadoop common uses <<<hadoop>>>, HDFS uses <<<hdfs>>>,
|
||||||
|
| and YARN uses <<<yarn>>>.
|
||||||
|
*---------------+-------------------+
|
||||||
|
| SHELL_OPTIONS | Options that the shell processes prior to executing Java.
|
||||||
|
*-----------------------+---------------+
|
||||||
|
| COMMAND | Action to perform.
|
||||||
|
*-----------------------+---------------+
|
||||||
|
| GENERIC_OPTIONS | The common set of options supported by
|
||||||
|
| multiple commands.
|
||||||
|
*-----------------------+---------------+
|
||||||
|
| COMMAND_OPTIONS | Various commands with their options are
|
||||||
|
| described in this documentation for the
|
||||||
|
| Hadoop common sub-project. HDFS and YARN are
|
||||||
|
| covered in other documents.
|
||||||
|
*-----------------------+---------------+
|
||||||
|
|
||||||
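To make the structure concrete, here is one illustrative invocation (the log
level and path are arbitrary):

----
hadoop --loglevel WARN fs -ls /user
----

Here <<<hadoop>>> is the shellcommand, <<<--loglevel WARN>>> is a SHELL_OPTION,
<<<fs>>> is the COMMAND, and <<<-ls /user>>> are the COMMAND_OPTIONS.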
|
** {Shell Options}
|
||||||
|
|
||||||
|
All of the shell commands will accept a common set of options. For some commands,
|
||||||
|
these options are ignored. For example, passing <<<--hostnames>>> to a
|
||||||
|
command that only executes on a single host will be ignored.
|
||||||
|
|
||||||
*-----------------------+---------------+
|
*-----------------------+---------------+
|
||||||
|| COMMAND_OPTION || Description
|
|| SHELL_OPTION || Description
|
||||||
*-----------------------+---------------+
|
*-----------------------+---------------+
|
||||||
| <<<--config confdir>>>| Overwrites the default Configuration directory. Default is <<<${HADOOP_HOME}/conf>>>.
|
| <<<--buildpaths>>> | Enables developer versions of jars.
|
||||||
*-----------------------+---------------+
|
*-----------------------+---------------+
|
||||||
| <<<--loglevel loglevel>>>| Overwrites the log level. Valid log levels are
|
| <<<--config confdir>>> | Overwrites the default Configuration
|
||||||
|
| directory. Default is <<<${HADOOP_PREFIX}/conf>>>.
|
||||||
|
*-----------------------+----------------+
|
||||||
|
| <<<--daemon mode>>> | If the command supports daemonization (e.g.,
|
||||||
|
| <<<hdfs namenode>>>), execute in the appropriate
|
||||||
|
| mode. Supported modes are <<<start>>> to start the
|
||||||
|
| process in daemon mode, <<<stop>>> to stop the
|
||||||
|
| process, and <<<status>>> to determine the active
|
||||||
|
| status of the process. <<<status>>> will return
|
||||||
|
| an {{{http://refspecs.linuxbase.org/LSB_3.0.0/LSB-generic/LSB-generic/iniscrptact.html}LSB-compliant}} result code.
|
||||||
|
| If no option is provided, commands that support
|
||||||
|
| daemonization will run in the foreground.
|
||||||
|
*-----------------------+---------------+
|
||||||
|
| <<<--debug>>> | Enables shell level configuration debugging information
|
||||||
|
*-----------------------+---------------+
|
||||||
|
| <<<--help>>> | Shell script usage information.
|
||||||
|
*-----------------------+---------------+
|
||||||
|
| <<<--hostnames>>> | A space-delimited list of hostnames on which to execute
|
||||||
|
| a multi-host subcommand. By default, the content of
|
||||||
|
| the <<<slaves>>> file is used.
|
||||||
|
*-----------------------+----------------+
|
||||||
|
| <<<--hosts>>> | A file that contains a list of hostnames on which to execute
|
||||||
|
| a multi-host subcommand. By default, the content of the
|
||||||
|
| <<<slaves>>> file is used.
|
||||||
|
*-----------------------+----------------+
|
||||||
|
| <<<--loglevel loglevel>>> | Overrides the log level. Valid log levels are
|
||||||
| | FATAL, ERROR, WARN, INFO, DEBUG, and TRACE.
|
| | FATAL, ERROR, WARN, INFO, DEBUG, and TRACE.
|
||||||
| | Default is INFO.
|
| | Default is INFO.
|
||||||
*-----------------------+---------------+
|
*-----------------------+---------------+
|
||||||
| GENERIC_OPTIONS | The common set of options supported by multiple commands.
|
|
||||||
| COMMAND_OPTIONS | Various commands with their options are described in the following sections. The commands have been grouped into User Commands and Administration Commands.
|
|
||||||
*-----------------------+---------------+
|
|
||||||
|
|
||||||
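For illustration, starting a daemon and then querying its LSB-style status
might look like this (using the HDFS NameNode as the example command):

----
hdfs --daemon start namenode
hdfs --daemon status namenode
echo $?
----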
Generic Options
|
** {Generic Options}
|
||||||
|
|
||||||
The following options are supported by {{dfsadmin}}, {{fs}}, {{fsck}},
|
Many subcommands honor a common set of configuration options to alter their behavior:
|
||||||
{{job}} and {{fetchdt}}. Applications should implement
|
|
||||||
{{{../../api/org/apache/hadoop/util/Tool.html}Tool}} to support
|
|
||||||
GenericOptions.
|
|
||||||
|
|
||||||
*------------------------------------------------+-----------------------------+
|
*------------------------------------------------+-----------------------------+
|
||||||
|| GENERIC_OPTION || Description
|
|| GENERIC_OPTION || Description
|
||||||
*------------------------------------------------+-----------------------------+
|
*------------------------------------------------+-----------------------------+
|
||||||
|<<<-conf \<configuration file\> >>> | Specify an application
|
|
||||||
| configuration file.
|
|
||||||
*------------------------------------------------+-----------------------------+
|
|
||||||
|<<<-D \<property\>=\<value\> >>> | Use value for given property.
|
|
||||||
*------------------------------------------------+-----------------------------+
|
|
||||||
|<<<-jt \<local\> or \<resourcemanager:port\>>>> | Specify a ResourceManager.
|
|
||||||
| Applies only to job.
|
|
||||||
*------------------------------------------------+-----------------------------+
|
|
||||||
|<<<-files \<comma separated list of files\> >>> | Specify comma separated files
|
|
||||||
| to be copied to the map
|
|
||||||
| reduce cluster. Applies only
|
|
||||||
| to job.
|
|
||||||
*------------------------------------------------+-----------------------------+
|
|
||||||
|<<<-libjars \<comma seperated list of jars\> >>>| Specify comma separated jar
|
|
||||||
| files to include in the
|
|
||||||
| classpath. Applies only to
|
|
||||||
| job.
|
|
||||||
*------------------------------------------------+-----------------------------+
|
|
||||||
|<<<-archives \<comma separated list of archives\> >>> | Specify comma separated
|
|<<<-archives \<comma separated list of archives\> >>> | Specify comma separated
|
||||||
| archives to be unarchived on
|
| archives to be unarchived on
|
||||||
| the compute machines. Applies
|
| the compute machines. Applies
|
||||||
| only to job.
|
| only to job.
|
||||||
*------------------------------------------------+-----------------------------+
|
*------------------------------------------------+-----------------------------+
|
||||||
|
|<<<-conf \<configuration file\> >>> | Specify an application
|
||||||
|
| configuration file.
|
||||||
|
*------------------------------------------------+-----------------------------+
|
||||||
|
|<<<-D \<property\>=\<value\> >>> | Use value for given property.
|
||||||
|
*------------------------------------------------+-----------------------------+
|
||||||
|
|<<<-files \<comma separated list of files\> >>> | Specify comma separated files
|
||||||
|
| to be copied to the map
|
||||||
|
| reduce cluster. Applies only
|
||||||
|
| to job.
|
||||||
|
*------------------------------------------------+-----------------------------+
|
||||||
|
|<<<-jt \<local\> or \<resourcemanager:port\>>>> | Specify a ResourceManager.
|
||||||
|
| Applies only to job.
|
||||||
|
*------------------------------------------------+-----------------------------+
|
||||||
|
|<<<-libjars \<comma separated list of jars\> >>>| Specify comma separated jar
|
||||||
|
| files to include in the
|
||||||
|
| classpath. Applies only to
|
||||||
|
| job.
|
||||||
|
*------------------------------------------------+-----------------------------+
|
||||||
|
|
||||||
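For example, a generic option can override a configuration property for a
single invocation (the NameNode address shown is illustrative):

----
hadoop fs -D fs.defaultFS=hdfs://nn.example.com:8020 -ls /
----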
User Commands
|
Hadoop Common Commands
|
||||||
|
|
||||||
|
All of these commands are executed from the <<<hadoop>>> shell command. They
|
||||||
|
have been broken up into {{User Commands}} and
|
||||||
|
{{Administration Commands}}.
|
||||||
|
|
||||||
|
* User Commands
|
||||||
|
|
||||||
Commands useful for users of a hadoop cluster.
|
Commands useful for users of a hadoop cluster.
|
||||||
|
|
||||||
* <<<archive>>>
|
** <<<archive>>>
|
||||||
|
|
||||||
Creates a hadoop archive. More information can be found at
|
Creates a hadoop archive. More information can be found at
|
||||||
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/HadoopArchives.html}
|
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/HadoopArchives.html}
|
||||||
Hadoop Archives Guide}}.
|
Hadoop Archives Guide}}.
|
||||||
|
|
||||||
* <<<credential>>>
|
** <<<checknative>>>
|
||||||
|
|
||||||
Command to manage credentials, passwords and secrets within credential providers.
|
Usage: <<<hadoop checknative [-a] [-h] >>>
|
||||||
|
|
||||||
The CredentialProvider API in Hadoop allows for the separation of applications
|
*-----------------+-----------------------------------------------------------+
|
||||||
and how they store their required passwords/secrets. In order to indicate
|
|| COMMAND_OPTION || Description
|
||||||
a particular provider type and location, the user must provide the
|
*-----------------+-----------------------------------------------------------+
|
||||||
<hadoop.security.credential.provider.path> configuration element in core-site.xml
|
| -a | Check all libraries are available.
|
||||||
or use the command line option <<<-provider>>> on each of the following commands.
|
*-----------------+-----------------------------------------------------------+
|
||||||
This provider path is a comma-separated list of URLs that indicates the type and
|
| -h | print help
|
||||||
location of a list of providers that should be consulted.
|
*-----------------+-----------------------------------------------------------+
|
||||||
For example, the following path:
|
|
||||||
|
|
||||||
<<<user:///,jceks://file/tmp/test.jceks,jceks://hdfs@nn1.example.com/my/path/test.jceks>>>
|
This command checks the availability of the Hadoop native code. See
|
||||||
|
{{{NativeLibraries.html}}} for more information. By default, this command
|
||||||
|
only checks the availability of libhadoop.
|
||||||
|
|
||||||
indicates that the current user's credentials file should be consulted through
|
** <<<classpath>>>
|
||||||
the User Provider, that the local file located at <<</tmp/test.jceks>>> is a Java Keystore
|
|
||||||
Provider and that the file located within HDFS at <<<nn1.example.com/my/path/test.jceks>>>
|
|
||||||
is also a store for a Java Keystore Provider.
|
|
||||||
|
|
||||||
When utilizing the credential command it will often be for provisioning a password
|
Usage: <<<hadoop classpath [--glob|--jar <path>|-h|--help]>>>
|
||||||
or secret to a particular credential store provider. In order to explicitly
|
|
||||||
indicate which provider store to use the <<<-provider>>> option should be used. Otherwise,
|
|
||||||
given a path of multiple providers, the first non-transient provider will be used.
|
|
||||||
This may or may not be the one that you intended.
|
|
||||||
|
|
||||||
Example: <<<-provider jceks://file/tmp/test.jceks>>>
|
*-----------------+-----------------------------------------------------------+
|
||||||
|
|| COMMAND_OPTION || Description
|
||||||
|
*-----------------+-----------------------------------------------------------+
|
||||||
|
| --glob | expand wildcards
|
||||||
|
*-----------------+-----------------------------------------------------------+
|
||||||
|
| --jar <path> | write classpath as manifest in jar named <path>
|
||||||
|
*-----------------+-----------------------------------------------------------+
|
||||||
|
| -h, --help | print help
|
||||||
|
*-----------------+-----------------------------------------------------------+
|
||||||
|
|
||||||
|
Prints the class path needed to get the Hadoop jar and the required
|
||||||
|
libraries. If called without arguments, then prints the classpath set up by
|
||||||
|
the command scripts, which is likely to contain wildcards in the classpath
|
||||||
|
entries. Additional options print the classpath after wildcard expansion or
|
||||||
|
write the classpath into the manifest of a jar file. The latter is useful in
|
||||||
|
environments where wildcards cannot be used and the expanded classpath exceeds
|
||||||
|
the maximum supported command line length.
|
||||||
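For example (the jar path below is arbitrary):

----
hadoop classpath --glob
hadoop classpath --jar /tmp/hadoop-classpath.jar
----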
|
|
||||||
|
** <<<credential>>>
|
||||||
|
|
||||||
Usage: <<<hadoop credential <subcommand> [options]>>>
|
Usage: <<<hadoop credential <subcommand> [options]>>>
|
||||||
|
|
||||||
|
@ -143,109 +202,96 @@ User Commands
|
||||||
| indicated.
|
| indicated.
|
||||||
*-------------------+-------------------------------------------------------+
|
*-------------------+-------------------------------------------------------+
|
||||||
|
|
||||||
* <<<distcp>>>
|
Command to manage credentials, passwords and secrets within credential providers.
|
||||||
|
|
||||||
|
The CredentialProvider API in Hadoop allows for the separation of applications
|
||||||
|
and how they store their required passwords/secrets. In order to indicate
|
||||||
|
a particular provider type and location, the user must provide the
|
||||||
|
<hadoop.security.credential.provider.path> configuration element in core-site.xml
|
||||||
|
or use the command line option <<<-provider>>> on each of the following commands.
|
||||||
|
This provider path is a comma-separated list of URLs that indicates the type and
|
||||||
|
location of a list of providers that should be consulted. For example, the following path:
|
||||||
|
<<<user:///,jceks://file/tmp/test.jceks,jceks://hdfs@nn1.example.com/my/path/test.jceks>>>
|
||||||
|
|
||||||
|
indicates that the current user's credentials file should be consulted through
|
||||||
|
the User Provider, that the local file located at <<</tmp/test.jceks>>> is a Java Keystore
|
||||||
|
Provider and that the file located within HDFS at <<<nn1.example.com/my/path/test.jceks>>>
|
||||||
|
is also a store for a Java Keystore Provider.
|
||||||
|
|
||||||
|
When utilizing the credential command it will often be for provisioning a password
|
||||||
|
or secret to a particular credential store provider. In order to explicitly
|
||||||
|
indicate which provider store to use the <<<-provider>>> option should be used. Otherwise,
|
||||||
|
given a path of multiple providers, the first non-transient provider will be used.
|
||||||
|
This may or may not be the one that you intended.
|
||||||
|
|
||||||
|
Example: <<<-provider jceks://file/tmp/test.jceks>>>
|
||||||
|
|
||||||
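For instance, provisioning and then listing a credential in the example
keystore above might look like this (the alias name is only an example):

----
hadoop credential create ssl.server.keystore.password -provider jceks://file/tmp/test.jceks
hadoop credential list -provider jceks://file/tmp/test.jceks
----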
|
** <<<distch>>>
|
||||||
|
|
||||||
|
Usage: <<<hadoop distch [-f urilist_url] [-i] [-log logdir] path:owner:group:permissions>>>
|
||||||
|
|
||||||
|
*-------------------+-------------------------------------------------------+
|
||||||
|
||COMMAND_OPTION || Description
|
||||||
|
*-------------------+-------------------------------------------------------+
|
||||||
|
| -f | List of objects to change
|
||||||
|
*-------------------+-------------------------------------------------------+
|
||||||
|
| -i | Ignore failures
|
||||||
|
*-------------------+-------------------------------------------------------+
|
||||||
|
| -log | Directory to log output
|
||||||
|
*-------------------+-------------------------------------------------------+
|
||||||
|
|
||||||
|
Change the ownership and permissions on many files at once.
|
||||||
|
|
||||||
|
** <<<distcp>>>
|
||||||
|
|
||||||
Copy files or directories recursively. More information can be found at
|
Copy files or directories recursively. More information can be found at
|
||||||
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistCp.html}
|
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistCp.html}
|
||||||
Hadoop DistCp Guide}}.
|
Hadoop DistCp Guide}}.
|
||||||
|
|
||||||
* <<<fs>>>
|
** <<<fs>>>
|
||||||
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#dfs}<<<hdfs dfs>>>}}
|
This command is documented in the {{{./FileSystemShell.html}File System Shell Guide}}. It is a synonym for <<<hdfs dfs>>> when HDFS is in use.
|
||||||
instead.
|
|
||||||
|
|
||||||
* <<<fsck>>>
|
** <<<jar>>>
|
||||||
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#fsck}<<<hdfs fsck>>>}}
|
|
||||||
instead.
|
|
||||||
|
|
||||||
* <<<fetchdt>>>
|
|
||||||
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#fetchdt}
|
|
||||||
<<<hdfs fetchdt>>>}} instead.
|
|
||||||
|
|
||||||
* <<<jar>>>
|
|
||||||
|
|
||||||
Runs a jar file. Users can bundle their Map Reduce code in a jar file and
|
|
||||||
execute it using this command.
|
|
||||||
|
|
||||||
Usage: <<<hadoop jar <jar> [mainClass] args...>>>
|
Usage: <<<hadoop jar <jar> [mainClass] args...>>>
|
||||||
|
|
||||||
The streaming jobs are run via this command. Examples can be referred from
|
Runs a jar file.
|
||||||
Streaming examples
|
|
||||||
|
|
||||||
Word count example is also run using jar command. It can be referred from
|
|
||||||
Wordcount example
|
|
||||||
|
|
||||||
Use {{{../../hadoop-yarn/hadoop-yarn-site/YarnCommands.html#jar}<<<yarn jar>>>}}
|
Use {{{../../hadoop-yarn/hadoop-yarn-site/YarnCommands.html#jar}<<<yarn jar>>>}}
|
||||||
to launch YARN applications instead.
|
to launch YARN applications instead.
|
||||||
|
|
||||||
* <<<job>>>
|
** <<<jnipath>>>
|
||||||
|
|
||||||
Deprecated. Use
|
Usage: <<<hadoop jnipath>>>
|
||||||
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredCommands.html#job}
|
|
||||||
<<<mapred job>>>}} instead.
|
|
||||||
|
|
||||||
* <<<pipes>>>
|
Print the computed java.library.path.
|
||||||
|
|
||||||
Deprecated. Use
|
** <<<key>>>
|
||||||
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredCommands.html#pipes}
|
|
||||||
<<<mapred pipes>>>}} instead.
|
|
||||||
|
|
||||||
* <<<queue>>>
|
Manage keys via the KeyProvider.
|
||||||
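For example, assuming a KeyProvider has been configured, the known key names
can be listed with:

----
hadoop key list
----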
|
|
||||||
Deprecated. Use
|
** <<<trace>>>
|
||||||
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredCommands.html#queue}
|
|
||||||
<<<mapred queue>>>}} instead.
|
|
||||||
|
|
||||||
* <<<version>>>
|
View and modify Hadoop tracing settings. See the {{{./Tracing.html}Tracing Guide}}.
|
||||||
|
|
||||||
Prints the version.
|
** <<<version>>>
|
||||||
|
|
||||||
Usage: <<<hadoop version>>>
|
Usage: <<<hadoop version>>>
|
||||||
|
|
||||||
* <<<CLASSNAME>>>
|
Prints the version.
|
||||||
|
|
||||||
hadoop script can be used to invoke any class.
|
** <<<CLASSNAME>>>
|
||||||
|
|
||||||
Usage: <<<hadoop CLASSNAME>>>
|
Usage: <<<hadoop CLASSNAME>>>
|
||||||
|
|
||||||
Runs the class named <<<CLASSNAME>>>.
|
Runs the class named <<<CLASSNAME>>>. The class must be part of a package.
|
||||||
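For example, a class that is already on the Hadoop classpath can be invoked
directly (the class shown is just an illustration):

----
hadoop org.apache.hadoop.util.VersionInfo
----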
|
|
||||||
* <<<classpath>>>
|
* {Administration Commands}
|
||||||
|
|
||||||
Prints the class path needed to get the Hadoop jar and the required
|
|
||||||
libraries. If called without arguments, then prints the classpath set up by
|
|
||||||
the command scripts, which is likely to contain wildcards in the classpath
|
|
||||||
entries. Additional options print the classpath after wildcard expansion or
|
|
||||||
write the classpath into the manifest of a jar file. The latter is useful in
|
|
||||||
environments where wildcards cannot be used and the expanded classpath exceeds
|
|
||||||
the maximum supported command line length.
|
|
||||||
|
|
||||||
Usage: <<<hadoop classpath [--glob|--jar <path>|-h|--help]>>>
|
|
||||||
|
|
||||||
*-----------------+-----------------------------------------------------------+
|
|
||||||
|| COMMAND_OPTION || Description
|
|
||||||
*-----------------+-----------------------------------------------------------+
|
|
||||||
| --glob | expand wildcards
|
|
||||||
*-----------------+-----------------------------------------------------------+
|
|
||||||
| --jar <path> | write classpath as manifest in jar named <path>
|
|
||||||
*-----------------+-----------------------------------------------------------+
|
|
||||||
| -h, --help | print help
|
|
||||||
*-----------------+-----------------------------------------------------------+
|
|
||||||
|
|
||||||
Administration Commands
|
|
||||||
|
|
||||||
Commands useful for administrators of a hadoop cluster.
|
Commands useful for administrators of a hadoop cluster.
|
||||||
|
|
||||||
* <<<balancer>>>
|
** <<<daemonlog>>>
|
||||||
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#balancer}
|
|
||||||
<<<hdfs balancer>>>}} instead.
|
|
||||||
|
|
||||||
* <<<daemonlog>>>
|
|
||||||
|
|
||||||
Get/Set the log level for each daemon.
|
|
||||||
|
|
||||||
Usage: <<<hadoop daemonlog -getlevel <host:port> <name> >>>
|
Usage: <<<hadoop daemonlog -getlevel <host:port> <name> >>>
|
||||||
Usage: <<<hadoop daemonlog -setlevel <host:port> <name> <level> >>>
|
Usage: <<<hadoop daemonlog -setlevel <host:port> <name> <level> >>>
|
||||||
|
@ -262,22 +308,20 @@ Administration Commands
|
||||||
| connects to http://<host:port>/logLevel?log=<name>
|
| connects to http://<host:port>/logLevel?log=<name>
|
||||||
*------------------------------+-----------------------------------------------------------+
|
*------------------------------+-----------------------------------------------------------+
|
||||||
|
|
||||||
* <<<datanode>>>
|
Get/Set the log level for each daemon.
|
||||||
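For example (the host, port, and logger name below are illustrative):

----
hadoop daemonlog -getlevel nn.example.com:50070 org.apache.hadoop.hdfs.server.namenode.NameNode
hadoop daemonlog -setlevel nn.example.com:50070 org.apache.hadoop.hdfs.server.namenode.NameNode DEBUG
----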
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#datanode}
|
* Files
|
||||||
<<<hdfs datanode>>>}} instead.
|
|
||||||
|
|
||||||
* <<<dfsadmin>>>
|
** <<etc/hadoop/hadoop-env.sh>>
|
||||||
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#dfsadmin}
|
This file stores the global settings used by all Hadoop shell commands.
|
||||||
<<<hdfs dfsadmin>>>}} instead.
|
|
||||||
|
|
||||||
* <<<namenode>>>
|
** <<etc/hadoop/hadoop-user-functions.sh>>
|
||||||
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#namenode}
|
This file allows advanced users to override some shell functionality.
|
||||||
<<<hdfs namenode>>>}} instead.
|
|
||||||
|
|
||||||
* <<<secondarynamenode>>>
|
** <<~/.hadooprc>>
|
||||||
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#secondarynamenode}
|
This stores the personal environment for an individual user. It is
|
||||||
<<<hdfs secondarynamenode>>>}} instead.
|
processed after the hadoop-env.sh and hadoop-user-functions.sh files
|
||||||
|
and can contain the same settings.
|
||||||
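A minimal, purely illustrative <<<~/.hadooprc>>> might contain nothing more
than a couple of personal defaults:

----
# hypothetical personal overrides; same syntax as hadoop-env.sh
export HADOOP_CLIENT_OPTS="-Xmx2g"
export HADOOP_LOG_DIR="${HOME}/hadoop-logs"
----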
|
|
|
@ -45,46 +45,62 @@ bin/hadoop fs <args>
|
||||||
Differences are described with each of the commands. Error information is
|
Differences are described with each of the commands. Error information is
|
||||||
sent to stderr and the output is sent to stdout.
|
sent to stderr and the output is sent to stdout.
|
||||||
|
|
||||||
appendToFile
|
If HDFS is being used, <<<hdfs dfs>>> is a synonym.
|
||||||
|
|
||||||
Usage: <<<hdfs dfs -appendToFile <localsrc> ... <dst> >>>
|
See the {{{./CommandsManual.html}Commands Manual}} for generic shell options.
|
||||||
|
|
||||||
|
* appendToFile
|
||||||
|
|
||||||
|
Usage: <<<hadoop fs -appendToFile <localsrc> ... <dst> >>>
|
||||||
|
|
||||||
Append single src, or multiple srcs from local file system to the
|
Append single src, or multiple srcs from local file system to the
|
||||||
destination file system. Also reads input from stdin and appends to
|
destination file system. Also reads input from stdin and appends to
|
||||||
destination file system.
|
destination file system.
|
||||||
|
|
||||||
* <<<hdfs dfs -appendToFile localfile /user/hadoop/hadoopfile>>>
|
* <<<hadoop fs -appendToFile localfile /user/hadoop/hadoopfile>>>
|
||||||
|
|
||||||
* <<<hdfs dfs -appendToFile localfile1 localfile2 /user/hadoop/hadoopfile>>>
|
* <<<hadoop fs -appendToFile localfile1 localfile2 /user/hadoop/hadoopfile>>>
|
||||||
|
|
||||||
* <<<hdfs dfs -appendToFile localfile hdfs://nn.example.com/hadoop/hadoopfile>>>
|
* <<<hadoop fs -appendToFile localfile hdfs://nn.example.com/hadoop/hadoopfile>>>
|
||||||
|
|
||||||
* <<<hdfs dfs -appendToFile - hdfs://nn.example.com/hadoop/hadoopfile>>>
|
* <<<hadoop fs -appendToFile - hdfs://nn.example.com/hadoop/hadoopfile>>>
|
||||||
Reads the input from stdin.
|
Reads the input from stdin.
|
||||||
|
|
||||||
Exit Code:
|
Exit Code:
|
||||||
|
|
||||||
Returns 0 on success and 1 on error.
|
Returns 0 on success and 1 on error.
|
||||||
|
|
||||||
cat
|
* cat
|
||||||
|
|
||||||
Usage: <<<hdfs dfs -cat URI [URI ...]>>>
|
Usage: <<<hadoop fs -cat URI [URI ...]>>>
|
||||||
|
|
||||||
Copies source paths to stdout.
|
Copies source paths to stdout.
|
||||||
|
|
||||||
Example:
|
Example:
|
||||||
|
|
||||||
* <<<hdfs dfs -cat hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2>>>
|
* <<<hadoop fs -cat hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2>>>
|
||||||
|
|
||||||
* <<<hdfs dfs -cat file:///file3 /user/hadoop/file4>>>
|
* <<<hadoop fs -cat file:///file3 /user/hadoop/file4>>>
|
||||||
|
|
||||||
Exit Code:
|
Exit Code:
|
||||||
|
|
||||||
Returns 0 on success and -1 on error.
|
Returns 0 on success and -1 on error.
|
||||||
|
|
||||||
chgrp
|
* checksum
|
||||||
|
|
||||||
Usage: <<<hdfs dfs -chgrp [-R] GROUP URI [URI ...]>>>
|
Usage: <<<hadoop fs -checksum URI>>>
|
||||||
|
|
||||||
|
Returns the checksum information of a file.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
* <<<hadoop fs -checksum hdfs://nn1.example.com/file1>>>
|
||||||
|
|
||||||
|
* <<<hadoop fs -checksum file:///etc/hosts>>>
|
||||||
|
|
||||||
|
* chgrp
|
||||||
|
|
||||||
|
Usage: <<<hadoop fs -chgrp [-R] GROUP URI [URI ...]>>>
|
||||||
|
|
||||||
Change group association of files. The user must be the owner of files, or
|
Change group association of files. The user must be the owner of files, or
|
||||||
else a super-user. Additional information is in the
|
else a super-user. Additional information is in the
|
||||||
|
@ -94,9 +110,9 @@ chgrp
|
||||||
|
|
||||||
* The -R option will make the change recursively through the directory structure.
|
* The -R option will make the change recursively through the directory structure.
|
||||||
|
|
||||||
chmod

* chmod

   Usage: <<<hdfs dfs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI [URI ...]>>>
   Usage: <<<hadoop fs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI [URI ...]>>>

   Change the permissions of files. With -R, make the change recursively
   through the directory structure. The user must be the owner of the file, or

@ -107,9 +123,9 @@ chmod

     * The -R option will make the change recursively through the directory structure.
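   For illustration only, two possible chmod invocations (the modes and paths
   are hypothetical):

     * <<<hadoop fs -chmod 644 /user/hadoop/file1>>>

     * <<<hadoop fs -chmod -R u+rwx,g+rx,o-rwx /user/hadoop/dir1>>>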
chown

* chown

   Usage: <<<hdfs dfs -chown [-R] [OWNER][:[GROUP]] URI [URI ]>>>
   Usage: <<<hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI ]>>>

   Change the owner of files. The user must be a super-user. Additional information
   is in the {{{../hadoop-hdfs/HdfsPermissionsGuide.html}Permissions Guide}}.

@ -118,9 +134,9 @@ chown

     * The -R option will make the change recursively through the directory structure.
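   For illustration only, a possible chown invocation (the owner, group and
   path are hypothetical):

     * <<<hadoop fs -chown -R hduser:hadoop /user/hadoop/dir1>>>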
copyFromLocal

* copyFromLocal

   Usage: <<<hdfs dfs -copyFromLocal <localsrc> URI>>>
   Usage: <<<hadoop fs -copyFromLocal <localsrc> URI>>>

   Similar to put command, except that the source is restricted to a local
   file reference.

@ -129,16 +145,16 @@ copyFromLocal

     * The -f option will overwrite the destination if it already exists.
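   For illustration only, a possible invocation (the local and remote paths
   are hypothetical):

     * <<<hadoop fs -copyFromLocal -f localfile.txt /user/hadoop/localfile.txt>>>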
copyToLocal

* copyToLocal

   Usage: <<<hdfs dfs -copyToLocal [-ignorecrc] [-crc] URI <localdst> >>>
   Usage: <<<hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst> >>>

   Similar to get command, except that the destination is restricted to a
   local file reference.
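   For illustration only, a possible invocation (the paths are hypothetical):

     * <<<hadoop fs -copyToLocal /user/hadoop/file1 ./file1>>>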
count

* count

   Usage: <<<hdfs dfs -count [-q] [-h] <paths> >>>
   Usage: <<<hadoop fs -count [-q] [-h] <paths> >>>

   Count the number of directories, files and bytes under the paths that match
   the specified file pattern. The output columns with -count are: DIR_COUNT,

@ -151,19 +167,19 @@ count

   Example:

     * <<<hdfs dfs -count hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2>>>
     * <<<hadoop fs -count hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2>>>

     * <<<hdfs dfs -count -q hdfs://nn1.example.com/file1>>>
     * <<<hadoop fs -count -q hdfs://nn1.example.com/file1>>>

     * <<<hdfs dfs -count -q -h hdfs://nn1.example.com/file1>>>
     * <<<hadoop fs -count -q -h hdfs://nn1.example.com/file1>>>

   Exit Code:

   Returns 0 on success and -1 on error.

cp

* cp

   Usage: <<<hdfs dfs -cp [-f] [-p | -p[topax]] URI [URI ...] <dest> >>>
   Usage: <<<hadoop fs -cp [-f] [-p | -p[topax]] URI [URI ...] <dest> >>>

   Copy files from source to destination. This command allows multiple sources
   as well in which case the destination must be a directory.

@ -187,17 +203,41 @@ cp

   Example:

     * <<<hdfs dfs -cp /user/hadoop/file1 /user/hadoop/file2>>>
     * <<<hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2>>>

     * <<<hdfs dfs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir>>>
     * <<<hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir>>>

   Exit Code:

   Returns 0 on success and -1 on error.

du

   Usage: <<<hdfs dfs -du [-s] [-h] URI [URI ...]>>>

* createSnapshot

   See {{{../hadoop-hdfs/HdfsSnapshots.html}HDFS Snapshots Guide}}.

* deleteSnapshot

   See {{{../hadoop-hdfs/HdfsSnapshots.html}HDFS Snapshots Guide}}.

* df

   Usage: <<<hadoop fs -df [-h] URI [URI ...]>>>

   Displays free space.

   Options:

     * The -h option will format file sizes in a "human-readable" fashion (e.g
       64.0m instead of 67108864)

   Example:

     * <<<hadoop fs -df /user/hadoop/dir1>>>

* du

   Usage: <<<hadoop fs -du [-s] [-h] URI [URI ...]>>>

   Displays sizes of files and directories contained in the given directory or
   the length of a file in case it's just a file.

@ -212,29 +252,29 @@ du

   Example:

     * hdfs dfs -du /user/hadoop/dir1 /user/hadoop/file1 hdfs://nn.example.com/user/hadoop/dir1
     * <<<hadoop fs -du /user/hadoop/dir1 /user/hadoop/file1 hdfs://nn.example.com/user/hadoop/dir1>>>

   Exit Code:
   Returns 0 on success and -1 on error.

dus

* dus

   Usage: <<<hdfs dfs -dus <args> >>>
   Usage: <<<hadoop fs -dus <args> >>>

   Displays a summary of file lengths.

   <<Note:>> This command is deprecated. Instead use <<<hdfs dfs -du -s>>>.
   <<Note:>> This command is deprecated. Instead use <<<hadoop fs -du -s>>>.

expunge

* expunge

   Usage: <<<hdfs dfs -expunge>>>
   Usage: <<<hadoop fs -expunge>>>

   Empty the Trash. Refer to the {{{../hadoop-hdfs/HdfsDesign.html}
   HDFS Architecture Guide}} for more information on the Trash feature.

find

* find

   Usage: <<<hdfs dfs -find <path> ... <expression> ... >>>
   Usage: <<<hadoop fs -find <path> ... <expression> ... >>>

   Finds all files that match the specified expression and applies selected
   actions to them. If no <path> is specified then defaults to the current

@ -269,15 +309,15 @@ find

   Example:

   <<<hdfs dfs -find / -name test -print>>>
   <<<hadoop fs -find / -name test -print>>>

   Exit Code:

   Returns 0 on success and -1 on error.

get

* get

   Usage: <<<hdfs dfs -get [-ignorecrc] [-crc] <src> <localdst> >>>
   Usage: <<<hadoop fs -get [-ignorecrc] [-crc] <src> <localdst> >>>

   Copy files to the local file system. Files that fail the CRC check may be
   copied with the -ignorecrc option. Files and CRCs may be copied using the

@ -285,17 +325,17 @@ get

   Example:

     * <<<hdfs dfs -get /user/hadoop/file localfile>>>
     * <<<hadoop fs -get /user/hadoop/file localfile>>>

     * <<<hdfs dfs -get hdfs://nn.example.com/user/hadoop/file localfile>>>
     * <<<hadoop fs -get hdfs://nn.example.com/user/hadoop/file localfile>>>

   Exit Code:

   Returns 0 on success and -1 on error.

getfacl

* getfacl

   Usage: <<<hdfs dfs -getfacl [-R] <path> >>>
   Usage: <<<hadoop fs -getfacl [-R] <path> >>>

   Displays the Access Control Lists (ACLs) of files and directories. If a
   directory has a default ACL, then getfacl also displays the default ACL.

@ -308,17 +348,17 @@ getfacl

   Examples:

     * <<<hdfs dfs -getfacl /file>>>
     * <<<hadoop fs -getfacl /file>>>

     * <<<hdfs dfs -getfacl -R /dir>>>
     * <<<hadoop fs -getfacl -R /dir>>>

   Exit Code:

   Returns 0 on success and non-zero on error.

getfattr

* getfattr

   Usage: <<<hdfs dfs -getfattr [-R] {-n name | -d} [-e en] <path> >>>
   Usage: <<<hadoop fs -getfattr [-R] {-n name | -d} [-e en] <path> >>>

   Displays the extended attribute names and values (if any) for a file or
   directory.

@ -337,26 +377,32 @@ getfattr

   Examples:

     * <<<hdfs dfs -getfattr -d /file>>>
     * <<<hadoop fs -getfattr -d /file>>>

     * <<<hdfs dfs -getfattr -R -n user.myAttr /dir>>>
     * <<<hadoop fs -getfattr -R -n user.myAttr /dir>>>

   Exit Code:

   Returns 0 on success and non-zero on error.
getmerge

* getmerge

   Usage: <<<hdfs dfs -getmerge <src> <localdst> [addnl]>>>
   Usage: <<<hadoop fs -getmerge <src> <localdst> [addnl]>>>

   Takes a source directory and a destination file as input and concatenates
   files in src into the destination local file. Optionally addnl can be set to
   enable adding a newline character at the
   end of each file.
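   For illustration only, a possible invocation that merges the files under a
   directory into one local file (the paths are hypothetical):

     * <<<hadoop fs -getmerge /user/hadoop/output ./output.txt>>>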
ls

   Usage: <<<hdfs dfs -ls [-R] <args> >>>

* help

   Usage: <<<hadoop fs -help>>>

   Return usage output.
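   For illustration only, help can also be requested for a single command
   (the command name below is just an example):

     * <<<hadoop fs -help ls>>>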
* ls

   Usage: <<<hadoop fs -ls [-R] <args> >>>

   Options:

@ -377,23 +423,23 @@ permissions userid groupid modification_date modification_time dirname

   Example:

     * <<<hdfs dfs -ls /user/hadoop/file1>>>
     * <<<hadoop fs -ls /user/hadoop/file1>>>

   Exit Code:

   Returns 0 on success and -1 on error.

lsr

* lsr

   Usage: <<<hdfs dfs -lsr <args> >>>
   Usage: <<<hadoop fs -lsr <args> >>>

   Recursive version of ls.

   <<Note:>> This command is deprecated. Instead use <<<hdfs dfs -ls -R>>>
   <<Note:>> This command is deprecated. Instead use <<<hadoop fs -ls -R>>>

mkdir

* mkdir

   Usage: <<<hdfs dfs -mkdir [-p] <paths> >>>
   Usage: <<<hadoop fs -mkdir [-p] <paths> >>>

   Takes path uri's as argument and creates directories.

@ -403,30 +449,30 @@ mkdir

   Example:

     * <<<hdfs dfs -mkdir /user/hadoop/dir1 /user/hadoop/dir2>>>
     * <<<hadoop fs -mkdir /user/hadoop/dir1 /user/hadoop/dir2>>>

     * <<<hdfs dfs -mkdir hdfs://nn1.example.com/user/hadoop/dir hdfs://nn2.example.com/user/hadoop/dir>>>
     * <<<hadoop fs -mkdir hdfs://nn1.example.com/user/hadoop/dir hdfs://nn2.example.com/user/hadoop/dir>>>

   Exit Code:

   Returns 0 on success and -1 on error.

moveFromLocal

* moveFromLocal

   Usage: <<<hdfs dfs -moveFromLocal <localsrc> <dst> >>>
   Usage: <<<hadoop fs -moveFromLocal <localsrc> <dst> >>>

   Similar to put command, except that the source localsrc is deleted after
   it's copied.
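   For illustration only, a possible invocation (the paths are hypothetical):

     * <<<hadoop fs -moveFromLocal localfile.txt /user/hadoop/localfile.txt>>>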
moveToLocal

* moveToLocal

   Usage: <<<hdfs dfs -moveToLocal [-crc] <src> <dst> >>>
   Usage: <<<hadoop fs -moveToLocal [-crc] <src> <dst> >>>

   Displays a "Not implemented yet" message.

mv

* mv

   Usage: <<<hdfs dfs -mv URI [URI ...] <dest> >>>
   Usage: <<<hadoop fs -mv URI [URI ...] <dest> >>>

   Moves files from source to destination. This command allows multiple sources
   as well in which case the destination needs to be a directory. Moving files

@ -434,38 +480,42 @@ mv

   Example:

     * <<<hdfs dfs -mv /user/hadoop/file1 /user/hadoop/file2>>>
     * <<<hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2>>>

     * <<<hdfs dfs -mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2 hdfs://nn.example.com/file3 hdfs://nn.example.com/dir1>>>
     * <<<hadoop fs -mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2 hdfs://nn.example.com/file3 hdfs://nn.example.com/dir1>>>

   Exit Code:

   Returns 0 on success and -1 on error.

put

* put

   Usage: <<<hdfs dfs -put <localsrc> ... <dst> >>>
   Usage: <<<hadoop fs -put <localsrc> ... <dst> >>>

   Copy single src, or multiple srcs from local file system to the destination
   file system. Also reads input from stdin and writes to destination file
   system.

     * <<<hdfs dfs -put localfile /user/hadoop/hadoopfile>>>
     * <<<hadoop fs -put localfile /user/hadoop/hadoopfile>>>

     * <<<hdfs dfs -put localfile1 localfile2 /user/hadoop/hadoopdir>>>
     * <<<hadoop fs -put localfile1 localfile2 /user/hadoop/hadoopdir>>>

     * <<<hdfs dfs -put localfile hdfs://nn.example.com/hadoop/hadoopfile>>>
     * <<<hadoop fs -put localfile hdfs://nn.example.com/hadoop/hadoopfile>>>

     * <<<hdfs dfs -put - hdfs://nn.example.com/hadoop/hadoopfile>>>
     * <<<hadoop fs -put - hdfs://nn.example.com/hadoop/hadoopfile>>>
       Reads the input from stdin.

   Exit Code:

   Returns 0 on success and -1 on error.

rm

   Usage: <<<hdfs dfs -rm [-f] [-r|-R] [-skipTrash] URI [URI ...]>>>

* renameSnapshot

   See {{{../hadoop-hdfs/HdfsSnapshots.html}HDFS Snapshots Guide}}.

* rm

   Usage: <<<hadoop fs -rm [-f] [-r|-R] [-skipTrash] URI [URI ...]>>>

   Delete files specified as args.

@ -484,23 +534,37 @@ rm

   Example:

     * <<<hdfs dfs -rm hdfs://nn.example.com/file /user/hadoop/emptydir>>>
     * <<<hadoop fs -rm hdfs://nn.example.com/file /user/hadoop/emptydir>>>

   Exit Code:

   Returns 0 on success and -1 on error.

rmr

   Usage: <<<hdfs dfs -rmr [-skipTrash] URI [URI ...]>>>

* rmdir

   Usage: <<<hadoop fs -rmdir [--ignore-fail-on-non-empty] URI [URI ...]>>>

   Delete a directory.

   Options:

     * --ignore-fail-on-non-empty: When using wildcards, do not fail if a directory still contains files.

   Example:

     * <<<hadoop fs -rmdir /user/hadoop/emptydir>>>

* rmr

   Usage: <<<hadoop fs -rmr [-skipTrash] URI [URI ...]>>>

   Recursive version of delete.

   <<Note:>> This command is deprecated. Instead use <<<hdfs dfs -rm -r>>>
   <<Note:>> This command is deprecated. Instead use <<<hadoop fs -rm -r>>>

setfacl

* setfacl

   Usage: <<<hdfs dfs -setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>] >>>
   Usage: <<<hadoop fs -setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>] >>>

   Sets Access Control Lists (ACLs) of files and directories.

@ -528,27 +592,27 @@ setfacl

   Examples:

     * <<<hdfs dfs -setfacl -m user:hadoop:rw- /file>>>
     * <<<hadoop fs -setfacl -m user:hadoop:rw- /file>>>

     * <<<hdfs dfs -setfacl -x user:hadoop /file>>>
     * <<<hadoop fs -setfacl -x user:hadoop /file>>>

     * <<<hdfs dfs -setfacl -b /file>>>
     * <<<hadoop fs -setfacl -b /file>>>

     * <<<hdfs dfs -setfacl -k /dir>>>
     * <<<hadoop fs -setfacl -k /dir>>>

     * <<<hdfs dfs -setfacl --set user::rw-,user:hadoop:rw-,group::r--,other::r-- /file>>>
     * <<<hadoop fs -setfacl --set user::rw-,user:hadoop:rw-,group::r--,other::r-- /file>>>

     * <<<hdfs dfs -setfacl -R -m user:hadoop:r-x /dir>>>
     * <<<hadoop fs -setfacl -R -m user:hadoop:r-x /dir>>>

     * <<<hdfs dfs -setfacl -m default:user:hadoop:r-x /dir>>>
     * <<<hadoop fs -setfacl -m default:user:hadoop:r-x /dir>>>

   Exit Code:

   Returns 0 on success and non-zero on error.

setfattr

* setfattr

   Usage: <<<hdfs dfs -setfattr {-n name [-v value] | -x name} <path> >>>
   Usage: <<<hadoop fs -setfattr {-n name [-v value] | -x name} <path> >>>

   Sets an extended attribute name and value for a file or directory.

@ -566,19 +630,19 @@ setfattr

   Examples:

     * <<<hdfs dfs -setfattr -n user.myAttr -v myValue /file>>>
     * <<<hadoop fs -setfattr -n user.myAttr -v myValue /file>>>

     * <<<hdfs dfs -setfattr -n user.noValue /file>>>
     * <<<hadoop fs -setfattr -n user.noValue /file>>>

     * <<<hdfs dfs -setfattr -x user.myAttr /file>>>
     * <<<hadoop fs -setfattr -x user.myAttr /file>>>

   Exit Code:

   Returns 0 on success and non-zero on error.

setrep

* setrep

   Usage: <<<hdfs dfs -setrep [-R] [-w] <numReplicas> <path> >>>
   Usage: <<<hadoop fs -setrep [-R] [-w] <numReplicas> <path> >>>

   Changes the replication factor of a file. If <path> is a directory then
   the command recursively changes the replication factor of all files under

@ -593,28 +657,28 @@ setrep

   Example:

     * <<<hdfs dfs -setrep -w 3 /user/hadoop/dir1>>>
     * <<<hadoop fs -setrep -w 3 /user/hadoop/dir1>>>

   Exit Code:

   Returns 0 on success and -1 on error.

stat

* stat

   Usage: <<<hdfs dfs -stat URI [URI ...]>>>
   Usage: <<<hadoop fs -stat URI [URI ...]>>>

   Returns the stat information on the path.

   Example:

     * <<<hdfs dfs -stat path>>>
     * <<<hadoop fs -stat path>>>

   Exit Code:
   Returns 0 on success and -1 on error.

tail

* tail

   Usage: <<<hdfs dfs -tail [-f] URI>>>
   Usage: <<<hadoop fs -tail [-f] URI>>>

   Displays last kilobyte of the file to stdout.

@ -624,43 +688,54 @@ tail

   Example:

     * <<<hdfs dfs -tail pathname>>>
     * <<<hadoop fs -tail pathname>>>

   Exit Code:
   Returns 0 on success and -1 on error.

test

* test

   Usage: <<<hdfs dfs -test -[ezd] URI>>>
   Usage: <<<hadoop fs -test -[defsz] URI>>>

   Options:

     * The -e option will check to see if the file exists, returning 0 if true.
     * -d: if the path is a directory, return 0.

     * The -z option will check to see if the file is zero length, returning 0 if true.
     * -e: if the path exists, return 0.

     * The -d option will check to see if the path is directory, returning 0 if true.
     * -f: if the path is a file, return 0.

     * -s: if the path is not empty, return 0.

     * -z: if the file is zero length, return 0.

   Example:

     * <<<hdfs dfs -test -e filename>>>
     * <<<hadoop fs -test -e filename>>>
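   Because test reports only through its exit code, it is usually combined
   with shell logic. A minimal sketch (the path is hypothetical):

----
  $ hadoop fs -test -d /user/hadoop/dir1 && echo "dir1 is a directory"
----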
text

* text

   Usage: <<<hdfs dfs -text <src> >>>
   Usage: <<<hadoop fs -text <src> >>>

   Takes a source file and outputs the file in text format. The allowed formats
   are zip and TextRecordInputStream.
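   For illustration only, a possible invocation (the path is hypothetical):

     * <<<hadoop fs -text /user/hadoop/data.seq>>>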
touchz

* touchz

   Usage: <<<hdfs dfs -touchz URI [URI ...]>>>
   Usage: <<<hadoop fs -touchz URI [URI ...]>>>

   Create a file of zero length.

   Example:

     * <<<hdfs dfs -touchz pathname>>>
     * <<<hadoop fs -touchz pathname>>>

   Exit Code:
   Returns 0 on success and -1 on error.

* usage

   Usage: <<<hadoop fs -usage command>>>

   Return the help for an individual command.
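   For illustration only, a possible invocation (any shell subcommand name may
   be substituted):

     * <<<hadoop fs -usage ls>>>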
@ -11,12 +11,12 @@

   ~~ limitations under the License. See accompanying LICENSE file.

  ---
  Hadoop MapReduce Next Generation ${project.version} - Setting up a Single Node Cluster.
  Hadoop ${project.version} - Setting up a Single Node Cluster.
  ---
  ---
  ${maven.build.timestamp}

Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.
Hadoop - Setting up a Single Node Cluster.

%{toc|section=1|fromDepth=0}

@ -46,7 +46,9 @@ Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

   HadoopJavaVersions}}.

   [[2]] ssh must be installed and sshd must be running to use the Hadoop
         scripts that manage remote Hadoop daemons.
         scripts that manage remote Hadoop daemons if the optional start
         and stop scripts are to be used. Additionally, it is recommended that
         pdsh also be installed for better ssh resource management.

** Installing Software

@ -57,7 +59,7 @@ Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

----
  $ sudo apt-get install ssh
  $ sudo apt-get install rsync
  $ sudo apt-get install pdsh
----

* Download

@ -75,9 +77,6 @@ Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

----
  # set to the root of your Java installation
  export JAVA_HOME=/usr/java/latest

  # Assuming your installation directory is /usr/local/hadoop
  export HADOOP_PREFIX=/usr/local/hadoop
----

   Try the following command:

@ -158,6 +157,7 @@ Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

----
  $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
  $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
  $ chmod 0700 ~/.ssh/authorized_keys
----
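   After the keys are in place, a quick check that passwordless ssh works (a
   sketch; output varies by system):

----
  $ ssh localhost
----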
** Execution

@ -228,7 +228,7 @@ Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

  $ sbin/stop-dfs.sh
----

** YARN on Single Node
** YARN on a Single Node

   You can run a MapReduce job on YARN in a pseudo-distributed mode by setting
   a few parameters and running ResourceManager daemon and NodeManager daemon

@ -239,7 +239,7 @@ Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

   [[1]] Configure parameters as follows:

   etc/hadoop/mapred-site.xml:
   <<<etc/hadoop/mapred-site.xml>>>:

+---+
<configuration>

@ -250,7 +250,7 @@ Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

</configuration>
+---+

   etc/hadoop/yarn-site.xml:
   <<<etc/hadoop/yarn-site.xml>>>:

+---+
<configuration>