HADOOP-10908. Common needs updates for shell rewrite (aw)
parent 41d72cbd48
commit 94d342e607
|
@ -344,6 +344,8 @@ Trunk (Unreleased)
|
||||||
|
|
||||||
HADOOP-11397. Can't override HADOOP_IDENT_STRING (Kengo Seki via aw)
|
HADOOP-11397. Can't override HADOOP_IDENT_STRING (Kengo Seki via aw)
|
||||||
|
|
||||||
|
HADOOP-10908. Common needs updates for shell rewrite (aw)
|
||||||
|
|
||||||
OPTIMIZATIONS
|
OPTIMIZATIONS
|
||||||
|
|
||||||
HADOOP-7761. Improve the performance of raw comparisons. (todd)
|
HADOOP-7761. Improve the performance of raw comparisons. (todd)
|
||||||
|
|
|
@ -11,83 +11,81 @@
|
||||||
~~ limitations under the License. See accompanying LICENSE file.
|
~~ limitations under the License. See accompanying LICENSE file.
|
||||||
|
|
||||||
---
|
---
|
||||||
Hadoop Map Reduce Next Generation-${project.version} - Cluster Setup
|
Hadoop ${project.version} - Cluster Setup
|
||||||
---
|
---
|
||||||
---
|
---
|
||||||
${maven.build.timestamp}
|
${maven.build.timestamp}
|
||||||
|
|
||||||
%{toc|section=1|fromDepth=0}
|
%{toc|section=1|fromDepth=0}
|
||||||
|
|
||||||
Hadoop MapReduce Next Generation - Cluster Setup
|
Hadoop Cluster Setup
|
||||||
|
|
||||||
* {Purpose}
|
* {Purpose}
|
||||||
|
|
||||||
This document describes how to install, configure and manage non-trivial
|
This document describes how to install and configure
|
||||||
Hadoop clusters ranging from a few nodes to extremely large clusters
|
Hadoop clusters ranging from a few nodes to extremely large clusters
|
||||||
with thousands of nodes.
|
with thousands of nodes. To play with Hadoop, you may first want to
|
||||||
|
install it on a single machine (see {{{./SingleCluster.html}Single Node Setup}}).
|
||||||
|
|
||||||
To play with Hadoop, you may first want to install it on a single
|
This document does not cover advanced topics such as {{{./SecureMode.html}Security}} or
|
||||||
machine (see {{{./SingleCluster.html}Single Node Setup}}).
|
High Availability.
|
||||||
|
|
||||||
* {Prerequisites}
|
* {Prerequisites}
|
||||||
|
|
||||||
Download a stable version of Hadoop from Apache mirrors.
|
* Install Java. See the {{{http://wiki.apache.org/hadoop/HadoopJavaVersions}Hadoop Wiki}} for known good versions.
|
||||||
|
* Download a stable version of Hadoop from Apache mirrors.
|
||||||
|
|
||||||
* {Installation}
|
* {Installation}
|
||||||
|
|
||||||
Installing a Hadoop cluster typically involves unpacking the software on all
|
Installing a Hadoop cluster typically involves unpacking the software on all
|
||||||
the machines in the cluster or installing RPMs.
|
the machines in the cluster or installing it via a packaging system as
|
||||||
|
appropriate for your operating system. It is important to divide up the hardware
|
||||||
|
into functions.
|
||||||
|
|
||||||
Typically one machine in the cluster is designated as the NameNode and
|
Typically one machine in the cluster is designated as the NameNode and
|
||||||
another machine as the ResourceManager, exclusively. These are the masters.
|
another machine as the ResourceManager, exclusively. These are the masters. Other
|
||||||
|
services (such as Web App Proxy Server and MapReduce Job History server) are usually
|
||||||
|
run either on dedicated hardware or on shared infrastructure, depending upon the load.
|
||||||
|
|
||||||
The rest of the machines in the cluster act as both DataNode and NodeManager.
|
The rest of the machines in the cluster act as both DataNode and NodeManager.
|
||||||
These are the slaves.
|
These are the slaves.
|
||||||
|
|
||||||
* {Running Hadoop in Non-Secure Mode}
|
* {Configuring Hadoop in Non-Secure Mode}
|
||||||
|
|
||||||
The following sections describe how to configure a Hadoop cluster.
|
Hadoop's Java configuration is driven by two types of important configuration files:
|
||||||
|
|
||||||
{Configuration Files}
|
|
||||||
|
|
||||||
Hadoop configuration is driven by two types of important configuration files:
|
|
||||||
|
|
||||||
* Read-only default configuration - <<<core-default.xml>>>,
|
* Read-only default configuration - <<<core-default.xml>>>,
|
||||||
<<<hdfs-default.xml>>>, <<<yarn-default.xml>>> and
|
<<<hdfs-default.xml>>>, <<<yarn-default.xml>>> and
|
||||||
<<<mapred-default.xml>>>.
|
<<<mapred-default.xml>>>.
|
||||||
|
|
||||||
* Site-specific configuration - <<conf/core-site.xml>>,
|
* Site-specific configuration - <<<etc/hadoop/core-site.xml>>>,
|
||||||
<<conf/hdfs-site.xml>>, <<conf/yarn-site.xml>> and
|
<<<etc/hadoop/hdfs-site.xml>>>, <<<etc/hadoop/yarn-site.xml>>> and
|
||||||
<<conf/mapred-site.xml>>.
|
<<<etc/hadoop/mapred-site.xml>>>.
|
||||||
|
|
||||||
|
|
||||||
Additionally, you can control the Hadoop scripts found in the bin/
|
Additionally, you can control the Hadoop scripts found in the bin/
|
||||||
directory of the distribution, by setting site-specific values via the
|
directory of the distribution, by setting site-specific values via the
|
||||||
<<conf/hadoop-env.sh>> and <<yarn-env.sh>>.
|
<<<etc/hadoop/hadoop-env.sh>>> and <<<etc/hadoop/yarn-env.sh>>>.
|
||||||
|
|
||||||
{Site Configuration}
|
|
||||||
|
|
||||||
To configure the Hadoop cluster you will need to configure the
|
To configure the Hadoop cluster you will need to configure the
|
||||||
<<<environment>>> in which the Hadoop daemons execute as well as the
|
<<<environment>>> in which the Hadoop daemons execute as well as the
|
||||||
<<<configuration parameters>>> for the Hadoop daemons.
|
<<<configuration parameters>>> for the Hadoop daemons.
|
||||||
|
|
||||||
The Hadoop daemons are NameNode/DataNode and ResourceManager/NodeManager.
|
HDFS daemons are NameNode, SecondaryNameNode, and DataNode. YARN daemons
|
||||||
|
are ResourceManager, NodeManager, and WebAppProxy. If MapReduce is to be
|
||||||
|
used, then the MapReduce Job History Server will also be running. For
|
||||||
|
large installations, these are generally running on separate hosts.
|
||||||
|
|
||||||
|
|
||||||
** {Configuring Environment of Hadoop Daemons}
|
** {Configuring Environment of Hadoop Daemons}
|
||||||
|
|
||||||
Administrators should use the <<conf/hadoop-env.sh>> and
|
Administrators should use the <<<etc/hadoop/hadoop-env.sh>>> and optionally the
|
||||||
<<conf/yarn-env.sh>> script to do site-specific customization of the
|
<<<etc/hadoop/mapred-env.sh>>> and <<<etc/hadoop/yarn-env.sh>>> scripts to do
|
||||||
Hadoop daemons' process environment.
|
site-specific customization of the Hadoop daemons' process environment.
|
||||||
|
|
||||||
At the very least you should specify the <<<JAVA_HOME>>> so that it is
|
At the very least, you must specify the <<<JAVA_HOME>>> so that it is
|
||||||
correctly defined on each remote node.
|
correctly defined on each remote node.
|
||||||
|
|
||||||
In most cases you should also specify <<<HADOOP_PID_DIR>>> and
|
|
||||||
<<<HADOOP_SECURE_DN_PID_DIR>>> to point to directories that can only be
|
|
||||||
written to by the users that are going to run the hadoop daemons.
|
|
||||||
Otherwise there is the potential for a symlink attack.
|
|
||||||
|
|
||||||
Administrators can configure individual daemons using the configuration
|
Administrators can configure individual daemons using the configuration
|
||||||
options shown below in the table:
|
options shown below in the table:
|
||||||
|
|
||||||
|
@ -114,20 +112,42 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
statement should be added to hadoop-env.sh:
|
statement should be added to hadoop-env.sh:
|
||||||
|
|
||||||
----
|
----
|
||||||
export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC ${HADOOP_NAMENODE_OPTS}"
|
export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC"
|
||||||
----
|
----
|
||||||
|
|
||||||
|
See <<<etc/hadoop/hadoop-env.sh>>> for other examples.
|
||||||
|
|
||||||
Other useful configuration parameters that you can customize include:
|
Other useful configuration parameters that you can customize include:
|
||||||
|
|
||||||
* <<<HADOOP_LOG_DIR>>> / <<<YARN_LOG_DIR>>> - The directory where the
|
* <<<HADOOP_PID_DIR>>> - The directory where the
|
||||||
daemons' log files are stored. They are automatically created if they
|
daemons' process id files are stored.
|
||||||
don't exist.
|
|
||||||
|
|
||||||
* <<<HADOOP_HEAPSIZE>>> / <<<YARN_HEAPSIZE>>> - The maximum amount of
|
* <<<HADOOP_LOG_DIR>>> - The directory where the
|
||||||
heapsize to use, in MB e.g. if the varibale is set to 1000 the heap
|
daemons' log files are stored. Log files are automatically created
|
||||||
will be set to 1000MB. This is used to configure the heap
|
if they don't exist.
|
||||||
size for the daemon. By default, the value is 1000. If you want to
|
|
||||||
configure the values separately for each deamon you can use.
|
* <<<HADOOP_HEAPSIZE_MAX>>> - The maximum amount of
|
||||||
|
memory to use for the Java heapsize. Units supported by the JVM
|
||||||
|
are also supported here. If no unit is present, it will be assumed
|
||||||
|
the number is in megabytes. By default, Hadoop will let the JVM
|
||||||
|
determine how much to use. This value can be overridden on
|
||||||
|
a per-daemon basis using the appropriate <<<_OPTS>>> variable listed above.
|
||||||
|
For example, setting <<<HADOOP_HEAPSIZE_MAX=1g>>> and
|
||||||
|
<<<HADOOP_NAMENODE_OPTS="-Xmx5g">>> will configure the NameNode with 5GB heap.
|
||||||
|
|
||||||
|
In most cases, you should specify the <<<HADOOP_PID_DIR>>> and
|
||||||
|
<<<HADOOP_LOG_DIR>>> directories such that they can only be
|
||||||
|
written to by the users that are going to run the hadoop daemons.
|
||||||
|
Otherwise there is the potential for a symlink attack.
|
||||||
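As a minimal illustration (the paths and heap size below are examples, not
defaults), such settings could be placed in <<<etc/hadoop/hadoop-env.sh>>>:

----
# illustrative values only; choose paths owned by the accounts running the daemons
export JAVA_HOME=/usr/java/latest
export HADOOP_HEAPSIZE_MAX=1g
export HADOOP_PID_DIR=/var/run/hadoop
export HADOOP_LOG_DIR=/var/log/hadoop
----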
|
|
||||||
|
It is also traditional to configure <<<HADOOP_PREFIX>>> in the system-wide
|
||||||
|
shell environment configuration. For example, a simple script inside
|
||||||
|
<<</etc/profile.d>>>:
|
||||||
|
|
||||||
|
---
|
||||||
|
HADOOP_PREFIX=/path/to/hadoop
|
||||||
|
export HADOOP_PREFIX
|
||||||
|
---
|
||||||
|
|
||||||
*--------------------------------------+--------------------------------------+
|
*--------------------------------------+--------------------------------------+
|
||||||
|| Daemon || Environment Variable |
|
|| Daemon || Environment Variable |
|
||||||
|
@ -141,12 +161,12 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
| Map Reduce Job History Server | HADOOP_JOB_HISTORYSERVER_HEAPSIZE |
|
| Map Reduce Job History Server | HADOOP_JOB_HISTORYSERVER_HEAPSIZE |
|
||||||
*--------------------------------------+--------------------------------------+
|
*--------------------------------------+--------------------------------------+
|
||||||
|
|
||||||
** {Configuring the Hadoop Daemons in Non-Secure Mode}
|
** {Configuring the Hadoop Daemons}
|
||||||
|
|
||||||
This section deals with important parameters to be specified in
|
This section deals with important parameters to be specified in
|
||||||
the given configuration files:
|
the given configuration files:
|
||||||
|
|
||||||
* <<<conf/core-site.xml>>>
|
* <<<etc/hadoop/core-site.xml>>>
|
||||||
|
|
||||||
*-------------------------+-------------------------+------------------------+
|
*-------------------------+-------------------------+------------------------+
|
||||||
|| Parameter || Value || Notes |
|
|| Parameter || Value || Notes |
|
||||||
|
@ -157,7 +177,7 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
| | | Size of read/write buffer used in SequenceFiles. |
|
| | | Size of read/write buffer used in SequenceFiles. |
|
||||||
*-------------------------+-------------------------+------------------------+
|
*-------------------------+-------------------------+------------------------+
|
||||||
|
|
||||||
* <<<conf/hdfs-site.xml>>>
|
* <<<etc/hadoop/hdfs-site.xml>>>
|
||||||
|
|
||||||
* Configurations for NameNode:
|
* Configurations for NameNode:
|
||||||
|
|
||||||
|
@ -195,7 +215,7 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
| | | stored in all named directories, typically on different devices. |
|
| | | stored in all named directories, typically on different devices. |
|
||||||
*-------------------------+-------------------------+------------------------+
|
*-------------------------+-------------------------+------------------------+
|
||||||
|
|
||||||
* <<<conf/yarn-site.xml>>>
|
* <<<etc/hadoop/yarn-site.xml>>>
|
||||||
|
|
||||||
* Configurations for ResourceManager and NodeManager:
|
* Configurations for ResourceManager and NodeManager:
|
||||||
|
|
||||||
|
@ -341,9 +361,7 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
| | | Be careful, set this too small and you will spam the name node. |
|
| | | Be careful, set this too small and you will spam the name node. |
|
||||||
*-------------------------+-------------------------+------------------------+
|
*-------------------------+-------------------------+------------------------+
|
||||||
|
|
||||||
|
* <<<etc/hadoop/mapred-site.xml>>>
|
||||||
|
|
||||||
* <<<conf/mapred-site.xml>>>
|
|
||||||
|
|
||||||
* Configurations for MapReduce Applications:
|
* Configurations for MapReduce Applications:
|
||||||
|
|
||||||
|
@ -395,22 +413,6 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
| | | Directory where history files are managed by the MR JobHistory Server. |
|
| | | Directory where history files are managed by the MR JobHistory Server. |
|
||||||
*-------------------------+-------------------------+------------------------+
|
*-------------------------+-------------------------+------------------------+
|
||||||
|
|
||||||
* {Hadoop Rack Awareness}
|
|
||||||
|
|
||||||
The HDFS and the YARN components are rack-aware.
|
|
||||||
|
|
||||||
The NameNode and the ResourceManager obtains the rack information of the
|
|
||||||
slaves in the cluster by invoking an API <resolve> in an administrator
|
|
||||||
configured module.
|
|
||||||
|
|
||||||
The API resolves the DNS name (also IP address) to a rack id.
|
|
||||||
|
|
||||||
The site-specific module to use can be configured using the configuration
|
|
||||||
item <<<topology.node.switch.mapping.impl>>>. The default implementation
|
|
||||||
of the same runs a script/command configured using
|
|
||||||
<<<topology.script.file.name>>>. If <<<topology.script.file.name>>> is
|
|
||||||
not set, the rack id </default-rack> is returned for any passed IP address.
|
|
||||||
|
|
||||||
* {Monitoring Health of NodeManagers}
|
* {Monitoring Health of NodeManagers}
|
||||||
|
|
||||||
Hadoop provides a mechanism by which administrators can configure the
|
Hadoop provides a mechanism by which administrators can configure the
|
||||||
|
@ -433,7 +435,7 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
node was healthy is also displayed on the web interface.
|
node was healthy is also displayed on the web interface.
|
||||||
|
|
||||||
The following parameters can be used to control the node health
|
The following parameters can be used to control the node health
|
||||||
monitoring script in <<<conf/yarn-site.xml>>>.
|
monitoring script in <<<etc/hadoop/yarn-site.xml>>>.
|
||||||
|
|
||||||
*-------------------------+-------------------------+------------------------+
|
*-------------------------+-------------------------+------------------------+
|
||||||
|| Parameter || Value || Notes |
|
|| Parameter || Value || Notes |
|
||||||
|
@ -465,165 +467,87 @@ Hadoop MapReduce Next Generation - Cluster Setup
|
||||||
disk is either raided or a failure in the boot disk is identified by the
|
disk is either raided or a failure in the boot disk is identified by the
|
||||||
health checker script.
|
health checker script.
|
||||||
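As a purely illustrative sketch (not shipped with Hadoop), a health script that
reports the node unhealthy when a local data directory is missing could look like:

----
#!/usr/bin/env bash
# hypothetical health script: the NodeManager marks the node unhealthy
# if any output line begins with the string ERROR
if [[ ! -d /hadoop/yarn/local ]]; then
  echo "ERROR: local dir /hadoop/yarn/local is missing"
fi
----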
|
|
||||||
* {Slaves file}
|
* {Slaves File}
|
||||||
|
|
||||||
Typically you choose one machine in the cluster to act as the NameNode and
|
List all slave hostnames or IP addresses in your <<<etc/hadoop/slaves>>>
|
||||||
one machine as to act as the ResourceManager, exclusively. The rest of the
|
file, one per line. Helper scripts (described below) will use the
|
||||||
machines act as both a DataNode and NodeManager and are referred to as
|
<<<etc/hadoop/slaves>>> file to run commands on many hosts at once. It is not
|
||||||
<slaves>.
|
used for any of the Java-based Hadoop configuration. In order
|
||||||
|
to use this functionality, ssh trusts (via either passphraseless ssh or
|
||||||
|
some other means, such as Kerberos) must be established for the accounts
|
||||||
|
used to run Hadoop.
|
||||||
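For illustration, a minimal <<<etc/hadoop/slaves>>> file simply lists one host
per line (the hostnames below are made up):

----
host01.example.com
host02.example.com
host03.example.com
----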
|
|
||||||
List all slave hostnames or IP addresses in your <<<conf/slaves>>> file,
|
* {Hadoop Rack Awareness}
|
||||||
one per line.
|
|
||||||
|
Many Hadoop components are rack-aware and take advantage of the
|
||||||
|
network topology for performance and safety. Hadoop daemons obtain the
|
||||||
|
rack information of the slaves in the cluster by invoking an administrator
|
||||||
|
configured module. See the {{{./RackAwareness.html}Rack Awareness}}
|
||||||
|
documentation for more specific information.
|
||||||
|
|
||||||
|
It is highly recommended that you configure rack awareness prior to starting HDFS.
|
||||||
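As a purely illustrative sketch (not part of the Hadoop distribution), a trivial
topology script that places every host in a single rack could look like:

----
#!/usr/bin/env bash
# hypothetical topology script: print one rack name per host/IP argument
for node in "$@"; do
  echo "/default-rack"
done
----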
|
|
||||||
* {Logging}
|
* {Logging}
|
||||||
|
|
||||||
Hadoop uses the Apache log4j via the Apache Commons Logging framework for
|
Hadoop uses {{{http://logging.apache.org/log4j/2.x/}Apache log4j}} via the Apache Commons Logging framework for
|
||||||
logging. Edit the <<<conf/log4j.properties>>> file to customize the
|
logging. Edit the <<<etc/hadoop/log4j.properties>>> file to customize the
|
||||||
Hadoop daemons' logging configuration (log-formats and so on).
|
Hadoop daemons' logging configuration (log-formats and so on).
|
||||||
|
|
||||||
* {Operating the Hadoop Cluster}
|
* {Operating the Hadoop Cluster}
|
||||||
|
|
||||||
Once all the necessary configuration is complete, distribute the files to the
|
Once all the necessary configuration is complete, distribute the files to the
|
||||||
<<<HADOOP_CONF_DIR>>> directory on all the machines.
|
<<<HADOOP_CONF_DIR>>> directory on all the machines. This should be the
|
||||||
|
same directory on all machines.
|
||||||
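One illustrative way to do this, assuming rsync is available and the
<<<etc/hadoop/slaves>>> file described above is in place, is:

----
$ for host in $(cat etc/hadoop/slaves); do rsync -a etc/hadoop/ "${host}:${HADOOP_CONF_DIR}/"; done
----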
|
|
||||||
|
In general, it is recommended that HDFS and YARN run as separate users.
|
||||||
|
In the majority of installations, HDFS processes execute as 'hdfs'. YARN
|
||||||
|
typically uses the 'yarn' account.
|
||||||
|
|
||||||
** Hadoop Startup
|
** Hadoop Startup
|
||||||
|
|
||||||
To start a Hadoop cluster you will need to start both the HDFS and YARN
|
To start a Hadoop cluster you will need to start both the HDFS and YARN
|
||||||
cluster.
|
cluster.
|
||||||
|
|
||||||
Format a new distributed filesystem:
|
The first time you bring up HDFS, it must be formatted. Format a new
|
||||||
|
distributed filesystem as <hdfs>:
|
||||||
----
|
|
||||||
$ $HADOOP_PREFIX/bin/hdfs namenode -format <cluster_name>
|
|
||||||
----
|
|
||||||
|
|
||||||
Start the HDFS with the following command, run on the designated NameNode:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode
|
|
||||||
----
|
|
||||||
|
|
||||||
Run a script to start DataNodes on all slaves:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
|
|
||||||
----
|
|
||||||
|
|
||||||
Start the YARN with the following command, run on the designated
|
|
||||||
ResourceManager:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
|
|
||||||
----
|
|
||||||
|
|
||||||
Run a script to start NodeManagers on all slaves:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
|
|
||||||
----
|
|
||||||
|
|
||||||
Start a standalone WebAppProxy server. If multiple servers
|
|
||||||
are used with load balancing it should be run on each of them:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh start proxyserver --config $HADOOP_CONF_DIR
|
|
||||||
----
|
|
||||||
|
|
||||||
Start the MapReduce JobHistory Server with the following command, run on the
|
|
||||||
designated server:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR
|
|
||||||
----
|
|
||||||
|
|
||||||
** Hadoop Shutdown
|
|
||||||
|
|
||||||
Stop the NameNode with the following command, run on the designated
|
|
||||||
NameNode:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop namenode
|
|
||||||
----
|
|
||||||
|
|
||||||
Run a script to stop DataNodes on all slaves:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop datanode
|
|
||||||
----
|
|
||||||
|
|
||||||
Stop the ResourceManager with the following command, run on the designated
|
|
||||||
ResourceManager:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop resourcemanager
|
|
||||||
----
|
|
||||||
|
|
||||||
Run a script to stop NodeManagers on all slaves:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop nodemanager
|
|
||||||
----
|
|
||||||
|
|
||||||
Stop the WebAppProxy server. If multiple servers are used with load
|
|
||||||
balancing it should be run on each of them:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh stop proxyserver --config $HADOOP_CONF_DIR
|
|
||||||
----
|
|
||||||
|
|
||||||
|
|
||||||
Stop the MapReduce JobHistory Server with the following command, run on the
|
|
||||||
designated server:
|
|
||||||
|
|
||||||
----
|
|
||||||
$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOOP_CONF_DIR
|
|
||||||
----
|
|
||||||
|
|
||||||
|
|
||||||
* {Operating the Hadoop Cluster}
|
|
||||||
|
|
||||||
Once all the necessary configuration is complete, distribute the files to the
|
|
||||||
<<<HADOOP_CONF_DIR>>> directory on all the machines.
|
|
||||||
|
|
||||||
This section also describes the various Unix users who should be starting the
|
|
||||||
various components and uses the same Unix accounts and groups used previously:
|
|
||||||
|
|
||||||
** Hadoop Startup
|
|
||||||
|
|
||||||
To start a Hadoop cluster you will need to start both the HDFS and YARN
|
|
||||||
cluster.
|
|
||||||
|
|
||||||
Format a new distributed filesystem as <hdfs>:
|
|
||||||
|
|
||||||
----
|
----
|
||||||
[hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -format <cluster_name>
|
[hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -format <cluster_name>
|
||||||
----
|
----
|
||||||
|
|
||||||
Start the HDFS with the following command, run on the designated NameNode
|
Start the HDFS NameNode with the following command on the
|
||||||
as <hdfs>:
|
designated node as <hdfs>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[hdfs]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode
|
[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon start namenode
|
||||||
----
|
----
|
||||||
|
|
||||||
Run a script to start DataNodes on all slaves as <root> with a special
|
Start an HDFS DataNode with the following command on each
|
||||||
environment variable <<<HADOOP_SECURE_DN_USER>>> set to <hdfs>:
|
designated node as <hdfs>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[root]$ HADOOP_SECURE_DN_USER=hdfs $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
|
[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon start datanode
|
||||||
|
----
|
||||||
|
|
||||||
|
If <<<etc/hadoop/slaves>>> and ssh trusted access are configured
|
||||||
|
(see {{{./SingleCluster.html}Single Node Setup}}), all of the
|
||||||
|
HDFS processes can be started with a utility script. As <hdfs>:
|
||||||
|
|
||||||
|
----
|
||||||
|
[hdfs]$ $HADOOP_PREFIX/sbin/start-dfs.sh
|
||||||
----
|
----
|
||||||
|
|
||||||
Start the YARN with the following command, run on the designated
|
Start the YARN with the following command, run on the designated
|
||||||
ResourceManager as <yarn>:
|
ResourceManager as <yarn>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
|
[yarn]$ $HADOOP_PREFIX/bin/yarn --daemon start resourcemanager
|
||||||
----
|
----
|
||||||
|
|
||||||
Run a script to start NodeManagers on all slaves as <yarn>:
|
Run a script to start a NodeManager on each designated host as <yarn>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
|
[yarn]$ $HADOOP_PREFIX/bin/yarn --daemon start nodemanager
|
||||||
----
|
----
|
||||||
|
|
||||||
Start a standalone WebAppProxy server. Run on the WebAppProxy
|
Start a standalone WebAppProxy server. Run on the WebAppProxy
|
||||||
|
@ -631,14 +555,22 @@ $ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOO
|
||||||
it should be run on each of them:
|
it should be run on each of them:
|
||||||
|
|
||||||
----
|
----
|
||||||
[yarn]$ $HADOOP_YARN_HOME/bin/yarn start proxyserver --config $HADOOP_CONF_DIR
|
[yarn]$ $HADOOP_PREFIX/bin/yarn --daemon start proxyserver
|
||||||
----
|
----
|
||||||
|
|
||||||
Start the MapReduce JobHistory Server with the following command, run on the
|
If <<<etc/hadoop/slaves>>> and ssh trusted access are configured
|
||||||
designated server as <mapred>:
|
(see {{{./SingleCluster.html}Single Node Setup}}), all of the
|
||||||
|
YARN processes can be started with a utility script. As <yarn>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[mapred]$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR
|
[yarn]$ $HADOOP_PREFIX/sbin/start-yarn.sh
|
||||||
|
----
|
||||||
|
|
||||||
|
Start the MapReduce JobHistory Server with the following command, run
|
||||||
|
on the designated server as <mapred>:
|
||||||
|
|
||||||
|
----
|
||||||
|
[mapred]$ $HADOOP_PREFIX/bin/mapred --daemon start historyserver
|
||||||
----
|
----
|
||||||
|
|
||||||
** Hadoop Shutdown
|
** Hadoop Shutdown
|
||||||
|
@ -647,26 +579,42 @@ $ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOO
|
||||||
as <hdfs>:
|
as <hdfs>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[hdfs]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop namenode
|
[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon stop namenode
|
||||||
----
|
----
|
||||||
|
|
||||||
Run a script to stop DataNodes on all slaves as <root>:
|
Run a script to stop a DataNode as <hdfs>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[root]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop datanode
|
[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon stop datanode
|
||||||
|
----
|
||||||
|
|
||||||
|
If <<<etc/hadoop/slaves>>> and ssh trusted access are configured
|
||||||
|
(see {{{./SingleCluster.html}Single Node Setup}}), all of the
|
||||||
|
HDFS processes may be stopped with a utility script. As <hdfs>:
|
||||||
|
|
||||||
|
----
|
||||||
|
[hdfs]$ $HADOOP_PREFIX/sbin/stop-dfs.sh
|
||||||
----
|
----
|
||||||
|
|
||||||
Stop the ResourceManager with the following command, run on the designated
|
Stop the ResourceManager with the following command, run on the designated
|
||||||
ResourceManager as <yarn>:
|
ResourceManager as <yarn>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop resourcemanager
|
[yarn]$ $HADOOP_PREFIX/bin/yarn --daemon stop resourcemanager
|
||||||
----
|
----
|
||||||
|
|
||||||
Run a script to stop NodeManagers on all slaves as <yarn>:
|
Run a script to stop a NodeManager on a slave as <yarn>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop nodemanager
|
[yarn]$ $HADOOP_PREFIX/bin/yarn --daemon stop nodemanager
|
||||||
|
----
|
||||||
|
|
||||||
|
If <<<etc/hadoop/slaves>>> and ssh trusted access are configured
|
||||||
|
(see {{{./SingleCluster.html}Single Node Setup}}), all of the
|
||||||
|
YARN processes can be stopped with a utility script. As <yarn>:
|
||||||
|
|
||||||
|
----
|
||||||
|
[yarn]$ $HADOOP_PREFIX/sbin/stop-yarn.sh
|
||||||
----
|
----
|
||||||
|
|
||||||
Stop the WebAppProxy server. Run on the WebAppProxy server as
|
Stop the WebAppProxy server. Run on the WebAppProxy server as
|
||||||
|
@ -674,14 +622,14 @@ $ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOO
|
||||||
should be run on each of them:
|
should be run on each of them:
|
||||||
|
|
||||||
----
|
----
|
||||||
[yarn]$ $HADOOP_YARN_HOME/bin/yarn stop proxyserver --config $HADOOP_CONF_DIR
|
[yarn]$ $HADOOP_PREFIX/bin/yarn --daemon stop proxyserver
|
||||||
----
|
----
|
||||||
|
|
||||||
Stop the MapReduce JobHistory Server with the following command, run on the
|
Stop the MapReduce JobHistory Server with the following command, run on the
|
||||||
designated server as <mapred>:
|
designated server as <mapred>:
|
||||||
|
|
||||||
----
|
----
|
||||||
[mapred]$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOOP_CONF_DIR
|
[mapred]$ $HADOOP_PREFIX/bin/mapred --daemon stop historyserver
|
||||||
----
|
----
|
||||||
|
|
||||||
* {Web Interfaces}
|
* {Web Interfaces}
|
||||||
|
|
|
@ -21,102 +21,161 @@
|
||||||
|
|
||||||
%{toc}
|
%{toc}
|
||||||
|
|
||||||
Overview
|
Hadoop Commands Guide
|
||||||
|
|
||||||
All hadoop commands are invoked by the <<<bin/hadoop>>> script. Running the
|
* Overview
|
||||||
hadoop script without any arguments prints the description for all
|
|
||||||
commands.
|
|
||||||
|
|
||||||
Usage: <<<hadoop [--config confdir] [--loglevel loglevel] [COMMAND]
|
All of the Hadoop commands and subprojects follow the same basic structure:
|
||||||
[GENERIC_OPTIONS] [COMMAND_OPTIONS]>>>
|
|
||||||
|
|
||||||
Hadoop has an option parsing framework that employs parsing generic
|
Usage: <<<shellcommand [SHELL_OPTIONS] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]>>>
|
||||||
options as well as running classes.
|
|
||||||
|
*--------+---------+
|
||||||
|
|| FIELD || Description
|
||||||
|
*-----------------------+---------------+
|
||||||
|
| shellcommand | The command of the project being invoked. For example,
|
||||||
|
| Hadoop common uses <<<hadoop>>>, HDFS uses <<<hdfs>>>,
|
||||||
|
| and YARN uses <<<yarn>>>.
|
||||||
|
*---------------+-------------------+
|
||||||
|
| SHELL_OPTIONS | Options that the shell processes prior to executing Java.
|
||||||
|
*-----------------------+---------------+
|
||||||
|
| COMMAND | Action to perform.
|
||||||
|
*-----------------------+---------------+
|
||||||
|
| GENERIC_OPTIONS | The common set of options supported by
|
||||||
|
| multiple commands.
|
||||||
|
*-----------------------+---------------+
|
||||||
|
| COMMAND_OPTIONS | Various commands with their options are
|
||||||
|
| described in this documentation for the
|
||||||
|
| Hadoop common sub-project. HDFS and YARN are
|
||||||
|
| covered in other documents.
|
||||||
|
*-----------------------+---------------+
|
||||||
|
|
||||||
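To make the structure concrete, here is one illustrative invocation (the log
level and path are arbitrary):

----
hadoop --loglevel WARN fs -ls /user
----

Here <<<hadoop>>> is the shellcommand, <<<--loglevel WARN>>> is a SHELL_OPTION,
<<<fs>>> is the COMMAND, and <<<-ls /user>>> are the COMMAND_OPTIONS.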
|
** {Shell Options}
|
||||||
|
|
||||||
|
All of the shell commands will accept a common set of options. For some commands,
|
||||||
|
these options are ignored. For example, passing <<<--hostnames>>> to a
|
||||||
|
command that only executes on a single host will be ignored.
|
||||||
|
|
||||||
*-----------------------+---------------+
|
*-----------------------+---------------+
|
||||||
|| COMMAND_OPTION || Description
|
|| SHELL_OPTION || Description
|
||||||
*-----------------------+---------------+
|
*-----------------------+---------------+
|
||||||
| <<<--config confdir>>>| Overwrites the default Configuration directory. Default is <<<${HADOOP_HOME}/conf>>>.
|
| <<<--buildpaths>>> | Enables developer versions of jars.
|
||||||
*-----------------------+---------------+
|
*-----------------------+---------------+
|
||||||
| <<<--loglevel loglevel>>>| Overwrites the log level. Valid log levels are
|
| <<<--config confdir>>> | Overwrites the default Configuration
|
||||||
|
| directory. Default is <<<${HADOOP_PREFIX}/conf>>>.
|
||||||
|
*-----------------------+----------------+
|
||||||
|
| <<<--daemon mode>>> | If the command supports daemonization (e.g.,
|
||||||
|
| <<<hdfs namenode>>>), execute in the appropriate
|
||||||
|
| mode. Supported modes are <<<start>>> to start the
|
||||||
|
| process in daemon mode, <<<stop>>> to stop the
|
||||||
|
| process, and <<<status>>> to determine the active
|
||||||
|
| status of the process. <<<status>>> will return
|
||||||
|
| an {{{http://refspecs.linuxbase.org/LSB_3.0.0/LSB-generic/LSB-generic/iniscrptact.html}LSB-compliant}} result code.
|
||||||
|
| If no option is provided, commands that support
|
||||||
|
| daemonization will run in the foreground.
|
||||||
|
*-----------------------+---------------+
|
||||||
|
| <<<--debug>>> | Enables shell level configuration debugging information
|
||||||
|
*-----------------------+---------------+
|
||||||
|
| <<<--help>>> | Shell script usage information.
|
||||||
|
*-----------------------+---------------+
|
||||||
|
| <<<--hostnames>>> | A space-delimited list of hostnames on which to execute
|
||||||
|
| a multi-host subcommand. By default, the content of
|
||||||
|
| the <<<slaves>>> file is used.
|
||||||
|
*-----------------------+----------------+
|
||||||
|
| <<<--hosts>>> | A file that contains a list of hostnames on which to execute
|
||||||
|
| a multi-host subcommand. By default, the content of the
|
||||||
|
| <<<slaves>>> file is used.
|
||||||
|
*-----------------------+----------------+
|
||||||
|
| <<<--loglevel loglevel>>> | Overrides the log level. Valid log levels are
|
||||||
| | FATAL, ERROR, WARN, INFO, DEBUG, and TRACE.
|
| | FATAL, ERROR, WARN, INFO, DEBUG, and TRACE.
|
||||||
| | Default is INFO.
|
| | Default is INFO.
|
||||||
*-----------------------+---------------+
|
*-----------------------+---------------+
|
||||||
| GENERIC_OPTIONS | The common set of options supported by multiple commands.
|
|
||||||
| COMMAND_OPTIONS | Various commands with their options are described in the following sections. The commands have been grouped into User Commands and Administration Commands.
|
|
||||||
*-----------------------+---------------+
|
|
||||||
|
|
||||||
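For illustration, starting a daemon and then querying its LSB-style status
might look like this (using the HDFS NameNode as the example command):

----
hdfs --daemon start namenode
hdfs --daemon status namenode
echo $?
----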
Generic Options
|
** {Generic Options}
|
||||||
|
|
||||||
The following options are supported by {{dfsadmin}}, {{fs}}, {{fsck}},
|
Many subcommands honor a common set of configuration options to alter their behavior:
|
||||||
{{job}} and {{fetchdt}}. Applications should implement
|
|
||||||
{{{../../api/org/apache/hadoop/util/Tool.html}Tool}} to support
|
|
||||||
GenericOptions.
|
|
||||||
|
|
||||||
*------------------------------------------------+-----------------------------+
|
*------------------------------------------------+-----------------------------+
|
||||||
|| GENERIC_OPTION || Description
|
|| GENERIC_OPTION || Description
|
||||||
*------------------------------------------------+-----------------------------+
|
*------------------------------------------------+-----------------------------+
|
||||||
|<<<-conf \<configuration file\> >>> | Specify an application
|
|
||||||
| configuration file.
|
|
||||||
*------------------------------------------------+-----------------------------+
|
|
||||||
|<<<-D \<property\>=\<value\> >>> | Use value for given property.
|
|
||||||
*------------------------------------------------+-----------------------------+
|
|
||||||
|<<<-jt \<local\> or \<resourcemanager:port\>>>> | Specify a ResourceManager.
|
|
||||||
| Applies only to job.
|
|
||||||
*------------------------------------------------+-----------------------------+
|
|
||||||
|<<<-files \<comma separated list of files\> >>> | Specify comma separated files
|
|
||||||
| to be copied to the map
|
|
||||||
| reduce cluster. Applies only
|
|
||||||
| to job.
|
|
||||||
*------------------------------------------------+-----------------------------+
|
|
||||||
|<<<-libjars \<comma seperated list of jars\> >>>| Specify comma separated jar
|
|
||||||
| files to include in the
|
|
||||||
| classpath. Applies only to
|
|
||||||
| job.
|
|
||||||
*------------------------------------------------+-----------------------------+
|
|
||||||
|<<<-archives \<comma separated list of archives\> >>> | Specify comma separated
|
|<<<-archives \<comma separated list of archives\> >>> | Specify comma separated
|
||||||
| archives to be unarchived on
|
| archives to be unarchived on
|
||||||
| the compute machines. Applies
|
| the compute machines. Applies
|
||||||
| only to job.
|
| only to job.
|
||||||
*------------------------------------------------+-----------------------------+
|
*------------------------------------------------+-----------------------------+
|
||||||
|
|<<<-conf \<configuration file\> >>> | Specify an application
|
||||||
|
| configuration file.
|
||||||
|
*------------------------------------------------+-----------------------------+
|
||||||
|
|<<<-D \<property\>=\<value\> >>> | Use value for given property.
|
||||||
|
*------------------------------------------------+-----------------------------+
|
||||||
|
|<<<-files \<comma separated list of files\> >>> | Specify comma separated files
|
||||||
|
| to be copied to the map
|
||||||
|
| reduce cluster. Applies only
|
||||||
|
| to job.
|
||||||
|
*------------------------------------------------+-----------------------------+
|
||||||
|
|<<<-jt \<local\> or \<resourcemanager:port\>>>> | Specify a ResourceManager.
|
||||||
|
| Applies only to job.
|
||||||
|
*------------------------------------------------+-----------------------------+
|
||||||
|
|<<<-libjars \<comma separated list of jars\> >>>| Specify comma separated jar
|
||||||
|
| files to include in the
|
||||||
|
| classpath. Applies only to
|
||||||
|
| job.
|
||||||
|
*------------------------------------------------+-----------------------------+
|
||||||
|
|
||||||
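For example, a generic option can override a configuration property for a
single invocation (the NameNode address shown is illustrative):

----
hadoop fs -D fs.defaultFS=hdfs://nn.example.com:8020 -ls /
----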
User Commands
|
Hadoop Common Commands
|
||||||
|
|
||||||
|
All of these commands are executed from the <<<hadoop>>> shell command. They
|
||||||
|
have been broken up into {{User Commands}} and
|
||||||
|
{{Administration Commands}}.
|
||||||
|
|
||||||
|
* User Commands
|
||||||
|
|
||||||
Commands useful for users of a hadoop cluster.
|
Commands useful for users of a hadoop cluster.
|
||||||
|
|
||||||
* <<<archive>>>
|
** <<<archive>>>
|
||||||
|
|
||||||
Creates a hadoop archive. More information can be found at
|
Creates a hadoop archive. More information can be found at
|
||||||
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/HadoopArchives.html}
|
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/HadoopArchives.html}
|
||||||
Hadoop Archives Guide}}.
|
Hadoop Archives Guide}}.
|
||||||
|
|
||||||
* <<<credential>>>
|
** <<<checknative>>>
|
||||||
|
|
||||||
Command to manage credentials, passwords and secrets within credential providers.
|
Usage: <<<hadoop checknative [-a] [-h] >>>
|
||||||
|
|
||||||
The CredentialProvider API in Hadoop allows for the separation of applications
|
*-----------------+-----------------------------------------------------------+
|
||||||
and how they store their required passwords/secrets. In order to indicate
|
|| COMMAND_OPTION || Description
|
||||||
a particular provider type and location, the user must provide the
|
*-----------------+-----------------------------------------------------------+
|
||||||
<hadoop.security.credential.provider.path> configuration element in core-site.xml
|
| -a | Check all libraries are available.
|
||||||
or use the command line option <<<-provider>>> on each of the following commands.
|
*-----------------+-----------------------------------------------------------+
|
||||||
This provider path is a comma-separated list of URLs that indicates the type and
|
| -h | print help
|
||||||
location of a list of providers that should be consulted.
|
*-----------------+-----------------------------------------------------------+
|
||||||
For example, the following path:
|
|
||||||
|
|
||||||
<<<user:///,jceks://file/tmp/test.jceks,jceks://hdfs@nn1.example.com/my/path/test.jceks>>>
|
This command checks the availability of the Hadoop native code. See
|
||||||
|
{{{NativeLibraries.html}}} for more information. By default, this command
|
||||||
|
only checks the availability of libhadoop.
|
||||||
|
|
||||||
indicates that the current user's credentials file should be consulted through
|
** <<<classpath>>>
|
||||||
the User Provider, that the local file located at <<</tmp/test.jceks>>> is a Java Keystore
|
|
||||||
Provider and that the file located within HDFS at <<<nn1.example.com/my/path/test.jceks>>>
|
|
||||||
is also a store for a Java Keystore Provider.
|
|
||||||
|
|
||||||
When utilizing the credential command it will often be for provisioning a password
|
Usage: <<<hadoop classpath [--glob|--jar <path>|-h|--help]>>>
|
||||||
or secret to a particular credential store provider. In order to explicitly
|
|
||||||
indicate which provider store to use the <<<-provider>>> option should be used. Otherwise,
|
|
||||||
given a path of multiple providers, the first non-transient provider will be used.
|
|
||||||
This may or may not be the one that you intended.
|
|
||||||
|
|
||||||
Example: <<<-provider jceks://file/tmp/test.jceks>>>
|
*-----------------+-----------------------------------------------------------+
|
||||||
|
|| COMMAND_OPTION || Description
|
||||||
|
*-----------------+-----------------------------------------------------------+
|
||||||
|
| --glob | expand wildcards
|
||||||
|
*-----------------+-----------------------------------------------------------+
|
||||||
|
| --jar <path> | write classpath as manifest in jar named <path>
|
||||||
|
*-----------------+-----------------------------------------------------------+
|
||||||
|
| -h, --help | print help
|
||||||
|
*-----------------+-----------------------------------------------------------+
|
||||||
|
|
||||||
|
Prints the class path needed to get the Hadoop jar and the required
|
||||||
|
libraries. If called without arguments, then prints the classpath set up by
|
||||||
|
the command scripts, which is likely to contain wildcards in the classpath
|
||||||
|
entries. Additional options print the classpath after wildcard expansion or
|
||||||
|
write the classpath into the manifest of a jar file. The latter is useful in
|
||||||
|
environments where wildcards cannot be used and the expanded classpath exceeds
|
||||||
|
the maximum supported command line length.
|
||||||
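For example (the jar path below is arbitrary):

----
hadoop classpath --glob
hadoop classpath --jar /tmp/hadoop-classpath.jar
----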
|
|
||||||
|
** <<<credential>>>
|
||||||
|
|
||||||
Usage: <<<hadoop credential <subcommand> [options]>>>
|
Usage: <<<hadoop credential <subcommand> [options]>>>
|
||||||
|
|
||||||
|
@ -143,109 +202,96 @@ User Commands
|
||||||
| indicated.
|
| indicated.
|
||||||
*-------------------+-------------------------------------------------------+
|
*-------------------+-------------------------------------------------------+
|
||||||
|
|
||||||
* <<<distcp>>>
|
Command to manage credentials, passwords and secrets within credential providers.
|
||||||
|
|
||||||
|
The CredentialProvider API in Hadoop allows for the separation of applications
|
||||||
|
and how they store their required passwords/secrets. In order to indicate
|
||||||
|
a particular provider type and location, the user must provide the
|
||||||
|
<hadoop.security.credential.provider.path> configuration element in core-site.xml
|
||||||
|
or use the command line option <<<-provider>>> on each of the following commands.
|
||||||
|
This provider path is a comma-separated list of URLs that indicates the type and
|
||||||
|
location of a list of providers that should be consulted. For example, the following path:
|
||||||
|
<<<user:///,jceks://file/tmp/test.jceks,jceks://hdfs@nn1.example.com/my/path/test.jceks>>>
|
||||||
|
|
||||||
|
indicates that the current user's credentials file should be consulted through
|
||||||
|
the User Provider, that the local file located at <<</tmp/test.jceks>>> is a Java Keystore
|
||||||
|
Provider and that the file located within HDFS at <<<nn1.example.com/my/path/test.jceks>>>
|
||||||
|
is also a store for a Java Keystore Provider.
|
||||||
|
|
||||||
|
When utilizing the credential command it will often be for provisioning a password
|
||||||
|
or secret to a particular credential store provider. In order to explicitly
|
||||||
|
indicate which provider store to use the <<<-provider>>> option should be used. Otherwise,
|
||||||
|
given a path of multiple providers, the first non-transient provider will be used.
|
||||||
|
This may or may not be the one that you intended.
|
||||||
|
|
||||||
|
Example: <<<-provider jceks://file/tmp/test.jceks>>>
|
||||||
|
|
||||||
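For instance, provisioning and then listing a credential in the example
keystore above might look like this (the alias name is only an example):

----
hadoop credential create ssl.server.keystore.password -provider jceks://file/tmp/test.jceks
hadoop credential list -provider jceks://file/tmp/test.jceks
----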
|
** <<<distch>>>
|
||||||
|
|
||||||
|
Usage: <<<hadoop distch [-f urilist_url] [-i] [-log logdir] path:owner:group:permissions>>>
|
||||||
|
|
||||||
|
*-------------------+-------------------------------------------------------+
|
||||||
|
||COMMAND_OPTION || Description
|
||||||
|
*-------------------+-------------------------------------------------------+
|
||||||
|
| -f | List of objects to change
|
||||||
|
*-------------------+-------------------------------------------------------+
|
||||||
|
| -i | Ignore failures
|
||||||
|
*-------------------+-------------------------------------------------------+
|
||||||
|
| -log | Directory to log output
|
||||||
|
*-------------------+-------------------------------------------------------+
|
||||||
|
|
||||||
|
Change the ownership and permissions on many files at once.
|
||||||
|
|
||||||
|
** <<<distcp>>>
|
||||||
|
|
||||||
Copy files or directories recursively. More information can be found at
|
Copy files or directories recursively. More information can be found at
|
||||||
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistCp.html}
|
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistCp.html}
|
||||||
Hadoop DistCp Guide}}.
|
Hadoop DistCp Guide}}.
|
||||||
|
|
||||||
* <<<fs>>>
|
** <<<fs>>>
|
||||||
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#dfs}<<<hdfs dfs>>>}}
|
This command is documented in the {{{./FileSystemShell.html}File System Shell Guide}}. It is a synonym for <<<hdfs dfs>>> when HDFS is in use.
|
||||||
instead.
|
|
||||||
|
|
||||||
* <<<fsck>>>
|
** <<<jar>>>
|
||||||
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#fsck}<<<hdfs fsck>>>}}
|
|
||||||
instead.
|
|
||||||
|
|
||||||
* <<<fetchdt>>>
|
|
||||||
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#fetchdt}
|
|
||||||
<<<hdfs fetchdt>>>}} instead.
|
|
||||||
|
|
||||||
* <<<jar>>>
|
|
||||||
|
|
||||||
Runs a jar file. Users can bundle their Map Reduce code in a jar file and
|
|
||||||
execute it using this command.
|
|
||||||
|
|
||||||
Usage: <<<hadoop jar <jar> [mainClass] args...>>>
|
Usage: <<<hadoop jar <jar> [mainClass] args...>>>
|
||||||
|
|
||||||
The streaming jobs are run via this command. Examples can be referred from
|
Runs a jar file.
|
||||||
Streaming examples
|
|
||||||
|
|
||||||
Word count example is also run using jar command. It can be referred from
|
|
||||||
Wordcount example
|
|
||||||
|
|
||||||
Use {{{../../hadoop-yarn/hadoop-yarn-site/YarnCommands.html#jar}<<<yarn jar>>>}}
|
Use {{{../../hadoop-yarn/hadoop-yarn-site/YarnCommands.html#jar}<<<yarn jar>>>}}
|
||||||
to launch YARN applications instead.
|
to launch YARN applications instead.
|
||||||
|
|
||||||
* <<<job>>>
|
** <<<jnipath>>>
|
||||||
|
|
||||||
Deprecated. Use
|
Usage: <<<hadoop jnipath>>>
|
||||||
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredCommands.html#job}
|
|
||||||
<<<mapred job>>>}} instead.
|
|
||||||
|
|
||||||
* <<<pipes>>>
|
Print the computed java.library.path.
|
||||||
|
|
||||||
Deprecated. Use
|
** <<<key>>>
|
||||||
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredCommands.html#pipes}
|
|
||||||
<<<mapred pipes>>>}} instead.
|
|
||||||
|
|
||||||
* <<<queue>>>
|
Manage keys via the KeyProvider.
|
||||||
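For example, assuming a KeyProvider has been configured, the known key names
can be listed with:

----
hadoop key list
----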
|
|
||||||
Deprecated. Use
|
** <<<trace>>>
|
||||||
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredCommands.html#queue}
|
|
||||||
<<<mapred queue>>>}} instead.
|
|
||||||
|
|
||||||
* <<<version>>>
|
View and modify Hadoop tracing settings. See the {{{./Tracing.html}Tracing Guide}}.
|
||||||
|
|
||||||
Prints the version.
|
** <<<version>>>
|
||||||
|
|
||||||
Usage: <<<hadoop version>>>
|
Usage: <<<hadoop version>>>
|
||||||
|
|
||||||
* <<<CLASSNAME>>>
|
Prints the version.
|
||||||
|
|
||||||
hadoop script can be used to invoke any class.
|
** <<<CLASSNAME>>>
|
||||||
|
|
||||||
Usage: <<<hadoop CLASSNAME>>>
|
Usage: <<<hadoop CLASSNAME>>>
|
||||||
|
|
||||||
Runs the class named <<<CLASSNAME>>>.
|
Runs the class named <<<CLASSNAME>>>. The class must be part of a package.
|
||||||
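For example, a class that is already on the Hadoop classpath can be invoked
directly (the class shown is just an illustration):

----
hadoop org.apache.hadoop.util.VersionInfo
----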
|
|
||||||
* <<<classpath>>>
|
* {Administration Commands}
|
||||||
|
|
||||||
Prints the class path needed to get the Hadoop jar and the required
|
|
||||||
libraries. If called without arguments, then prints the classpath set up by
|
|
||||||
the command scripts, which is likely to contain wildcards in the classpath
|
|
||||||
entries. Additional options print the classpath after wildcard expansion or
|
|
||||||
write the classpath into the manifest of a jar file. The latter is useful in
|
|
||||||
environments where wildcards cannot be used and the expanded classpath exceeds
|
|
||||||
the maximum supported command line length.
|
|
||||||
|
|
||||||
Usage: <<<hadoop classpath [--glob|--jar <path>|-h|--help]>>>
|
|
||||||
|
|
||||||
*-----------------+-----------------------------------------------------------+
|
|
||||||
|| COMMAND_OPTION || Description
|
|
||||||
*-----------------+-----------------------------------------------------------+
|
|
||||||
| --glob | expand wildcards
|
|
||||||
*-----------------+-----------------------------------------------------------+
|
|
||||||
| --jar <path> | write classpath as manifest in jar named <path>
|
|
||||||
*-----------------+-----------------------------------------------------------+
|
|
||||||
| -h, --help | print help
|
|
||||||
*-----------------+-----------------------------------------------------------+
|
|
||||||
|
|
||||||
Administration Commands
|
|
||||||
|
|
||||||
Commands useful for administrators of a hadoop cluster.
|
Commands useful for administrators of a hadoop cluster.
|
||||||
|
|
||||||
* <<<balancer>>>
|
** <<<daemonlog>>>
|
||||||
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#balancer}
|
|
||||||
<<<hdfs balancer>>>}} instead.
|
|
||||||
|
|
||||||
* <<<daemonlog>>>
|
|
||||||
|
|
||||||
Get/Set the log level for each daemon.
|
|
||||||
|
|
||||||
Usage: <<<hadoop daemonlog -getlevel <host:port> <name> >>>
|
Usage: <<<hadoop daemonlog -getlevel <host:port> <name> >>>
|
||||||
Usage: <<<hadoop daemonlog -setlevel <host:port> <name> <level> >>>
|
Usage: <<<hadoop daemonlog -setlevel <host:port> <name> <level> >>>
|
||||||
|
@ -262,22 +308,20 @@ Administration Commands
|
||||||
| connects to http://<host:port>/logLevel?log=<name>
|
| connects to http://<host:port>/logLevel?log=<name>
|
||||||
*------------------------------+-----------------------------------------------------------+
|
*------------------------------+-----------------------------------------------------------+
|
||||||
|
|
||||||
* <<<datanode>>>
|
Get/Set the log level for each daemon.
|
||||||
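For example (the host, port, and logger name below are illustrative):

----
hadoop daemonlog -getlevel nn.example.com:50070 org.apache.hadoop.hdfs.server.namenode.NameNode
hadoop daemonlog -setlevel nn.example.com:50070 org.apache.hadoop.hdfs.server.namenode.NameNode DEBUG
----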
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#datanode}
|
* Files
|
||||||
<<<hdfs datanode>>>}} instead.
|
|
||||||
|
|
||||||
* <<<dfsadmin>>>
|
** <<etc/hadoop/hadoop-env.sh>>
|
||||||
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#dfsadmin}
|
This file stores the global settings used by all Hadoop shell commands.
|
||||||
<<<hdfs dfsadmin>>>}} instead.
|
|
||||||
|
|
||||||
* <<<namenode>>>
|
** <<etc/hadoop/hadoop-user-functions.sh>>
|
||||||
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#namenode}
|
This file allows advanced users to override some shell functionality.
|
||||||
<<<hdfs namenode>>>}} instead.
|
|
||||||
|
|
||||||
* <<<secondarynamenode>>>
|
** <<~/.hadooprc>>
|
||||||
|
|
||||||
Deprecated, use {{{../hadoop-hdfs/HDFSCommands.html#secondarynamenode}
|
This stores the personal environment for an individual user. It is
|
||||||
<<<hdfs secondarynamenode>>>}} instead.
|
processed after the hadoop-env.sh and hadoop-user-functions.sh files
|
||||||
|
and can contain the same settings.
|
||||||
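A minimal, purely illustrative <<<~/.hadooprc>>> might contain nothing more
than a couple of personal defaults:

----
# hypothetical personal overrides; same syntax as hadoop-env.sh
export HADOOP_CLIENT_OPTS="-Xmx2g"
export HADOOP_LOG_DIR="${HOME}/hadoop-logs"
----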
|
|
|
@ -45,46 +45,62 @@ bin/hadoop fs <args>
|
||||||
Differences are described with each of the commands. Error information is
|
Differences are described with each of the commands. Error information is
|
||||||
sent to stderr and the output is sent to stdout.
|
sent to stderr and the output is sent to stdout.
|
||||||
|
|
||||||
appendToFile
|
If HDFS is being used, <<<hdfs dfs>>> is a synonym.
|
||||||
|
|
||||||
Usage: <<<hdfs dfs -appendToFile <localsrc> ... <dst> >>>
|
See the {{{./CommandsManual.html}Commands Manual}} for generic shell options.
|
||||||
|
|
||||||
|
* appendToFile
|
||||||
|
|
||||||
|
Usage: <<<hadoop fs -appendToFile <localsrc> ... <dst> >>>
|
||||||
|
|
||||||
Append single src, or multiple srcs from local file system to the
|
Append single src, or multiple srcs from local file system to the
|
||||||
destination file system. Also reads input from stdin and appends to
|
destination file system. Also reads input from stdin and appends to
|
||||||
destination file system.
|
destination file system.
|
||||||
|
|
||||||
* <<<hdfs dfs -appendToFile localfile /user/hadoop/hadoopfile>>>
|
* <<<hadoop fs -appendToFile localfile /user/hadoop/hadoopfile>>>
|
||||||
|
|
||||||
* <<<hdfs dfs -appendToFile localfile1 localfile2 /user/hadoop/hadoopfile>>>
|
* <<<hadoop fs -appendToFile localfile1 localfile2 /user/hadoop/hadoopfile>>>
|
||||||
|
|
||||||
* <<<hdfs dfs -appendToFile localfile hdfs://nn.example.com/hadoop/hadoopfile>>>
|
* <<<hadoop fs -appendToFile localfile hdfs://nn.example.com/hadoop/hadoopfile>>>
|
||||||
|
|
||||||
* <<<hdfs dfs -appendToFile - hdfs://nn.example.com/hadoop/hadoopfile>>>
|
* <<<hadoop fs -appendToFile - hdfs://nn.example.com/hadoop/hadoopfile>>>
|
||||||
Reads the input from stdin.
|
Reads the input from stdin.
|
||||||
|
|
||||||
Exit Code:
|
Exit Code:
|
||||||
|
|
||||||
Returns 0 on success and 1 on error.
|
Returns 0 on success and 1 on error.
|
||||||
|
|
||||||
cat
|
* cat
|
||||||
|
|
||||||
Usage: <<<hdfs dfs -cat URI [URI ...]>>>
|
Usage: <<<hadoop fs -cat URI [URI ...]>>>
|
||||||
|
|
||||||
Copies source paths to stdout.
|
Copies source paths to stdout.
|
||||||
|
|
||||||
Example:
|
Example:
|
||||||
|
|
||||||
* <<<hdfs dfs -cat hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2>>>
|
* <<<hadoop fs -cat hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2>>>
|
||||||
|
|
||||||
* <<<hdfs dfs -cat file:///file3 /user/hadoop/file4>>>
|
* <<<hadoop fs -cat file:///file3 /user/hadoop/file4>>>
|
||||||
|
|
||||||
Exit Code:
|
Exit Code:
|
||||||
|
|
||||||
Returns 0 on success and -1 on error.
|
Returns 0 on success and -1 on error.
|
||||||
|
|
||||||
chgrp
|
* checksum
|
||||||
|
|
||||||
Usage: <<<hdfs dfs -chgrp [-R] GROUP URI [URI ...]>>>
|
Usage: <<<hadoop fs -checksum URI>>>
|
||||||
|
|
||||||
|
Returns the checksum information of a file.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
* <<<hadoop fs -checksum hdfs://nn1.example.com/file1>>>
|
||||||
|
|
||||||
|
* <<<hadoop fs -checksum file:///etc/hosts>>>
|
||||||
|
|
||||||
|
* chgrp
|
||||||
|
|
||||||
|
Usage: <<<hadoop fs -chgrp [-R] GROUP URI [URI ...]>>>
|
||||||
|
|
||||||
Change group association of files. The user must be the owner of files, or
|
Change group association of files. The user must be the owner of files, or
|
||||||
else a super-user. Additional information is in the
|
else a super-user. Additional information is in the
|
||||||
|
@ -94,9 +110,9 @@ chgrp
|
||||||
|
|
||||||
* The -R option will make the change recursively through the directory structure.
|
* The -R option will make the change recursively through the directory structure.
|
||||||
|
|
||||||
chmod

* chmod

   Usage: <<<hdfs dfs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI [URI ...]>>>
   Usage: <<<hadoop fs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI [URI ...]>>>

   Change the permissions of files. With -R, make the change recursively
   through the directory structure. The user must be the owner of the file, or

@ -107,9 +123,9 @@ chmod

     * The -R option will make the change recursively through the directory structure.
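   For illustration only, two possible chmod invocations (the modes and paths
   are hypothetical):

     * <<<hadoop fs -chmod 644 /user/hadoop/file1>>>

     * <<<hadoop fs -chmod -R u+rwx,g+rx,o-rwx /user/hadoop/dir1>>>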
chown

* chown

   Usage: <<<hdfs dfs -chown [-R] [OWNER][:[GROUP]] URI [URI ]>>>
   Usage: <<<hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI ]>>>

   Change the owner of files. The user must be a super-user. Additional information
   is in the {{{../hadoop-hdfs/HdfsPermissionsGuide.html}Permissions Guide}}.

@ -118,9 +134,9 @@ chown

     * The -R option will make the change recursively through the directory structure.
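   For illustration only, a possible chown invocation (the owner, group and
   path are hypothetical):

     * <<<hadoop fs -chown -R hduser:hadoop /user/hadoop/dir1>>>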
copyFromLocal

* copyFromLocal

   Usage: <<<hdfs dfs -copyFromLocal <localsrc> URI>>>
   Usage: <<<hadoop fs -copyFromLocal <localsrc> URI>>>

   Similar to put command, except that the source is restricted to a local
   file reference.

@ -129,16 +145,16 @@ copyFromLocal

     * The -f option will overwrite the destination if it already exists.
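   For illustration only, a possible invocation (the local and remote paths
   are hypothetical):

     * <<<hadoop fs -copyFromLocal -f localfile.txt /user/hadoop/localfile.txt>>>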
copyToLocal

* copyToLocal

   Usage: <<<hdfs dfs -copyToLocal [-ignorecrc] [-crc] URI <localdst> >>>
   Usage: <<<hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst> >>>

   Similar to get command, except that the destination is restricted to a
   local file reference.
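   For illustration only, a possible invocation (the paths are hypothetical):

     * <<<hadoop fs -copyToLocal /user/hadoop/file1 ./file1>>>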
count

* count

   Usage: <<<hdfs dfs -count [-q] [-h] <paths> >>>
   Usage: <<<hadoop fs -count [-q] [-h] <paths> >>>

   Count the number of directories, files and bytes under the paths that match
   the specified file pattern. The output columns with -count are: DIR_COUNT,

@ -151,19 +167,19 @@ count

   Example:

     * <<<hdfs dfs -count hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2>>>
     * <<<hadoop fs -count hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2>>>

     * <<<hdfs dfs -count -q hdfs://nn1.example.com/file1>>>
     * <<<hadoop fs -count -q hdfs://nn1.example.com/file1>>>

     * <<<hdfs dfs -count -q -h hdfs://nn1.example.com/file1>>>
     * <<<hadoop fs -count -q -h hdfs://nn1.example.com/file1>>>

   Exit Code:

   Returns 0 on success and -1 on error.

cp

* cp

   Usage: <<<hdfs dfs -cp [-f] [-p | -p[topax]] URI [URI ...] <dest> >>>
   Usage: <<<hadoop fs -cp [-f] [-p | -p[topax]] URI [URI ...] <dest> >>>

   Copy files from source to destination. This command allows multiple sources
   as well in which case the destination must be a directory.

@ -187,17 +203,41 @@ cp

   Example:

     * <<<hdfs dfs -cp /user/hadoop/file1 /user/hadoop/file2>>>
     * <<<hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2>>>

     * <<<hdfs dfs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir>>>
     * <<<hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir>>>

   Exit Code:

   Returns 0 on success and -1 on error.

du

   Usage: <<<hdfs dfs -du [-s] [-h] URI [URI ...]>>>

* createSnapshot

   See {{{../hadoop-hdfs/HdfsSnapshots.html}HDFS Snapshots Guide}}.

* deleteSnapshot

   See {{{../hadoop-hdfs/HdfsSnapshots.html}HDFS Snapshots Guide}}.

* df

   Usage: <<<hadoop fs -df [-h] URI [URI ...]>>>

   Displays free space.

   Options:

     * The -h option will format file sizes in a "human-readable" fashion (e.g
       64.0m instead of 67108864)

   Example:

     * <<<hadoop fs -df /user/hadoop/dir1>>>

* du

   Usage: <<<hadoop fs -du [-s] [-h] URI [URI ...]>>>

   Displays sizes of files and directories contained in the given directory or
   the length of a file in case it's just a file.

@ -212,29 +252,29 @@ du

   Example:

     * hdfs dfs -du /user/hadoop/dir1 /user/hadoop/file1 hdfs://nn.example.com/user/hadoop/dir1
     * <<<hadoop fs -du /user/hadoop/dir1 /user/hadoop/file1 hdfs://nn.example.com/user/hadoop/dir1>>>

   Exit Code:
   Returns 0 on success and -1 on error.

dus

* dus

   Usage: <<<hdfs dfs -dus <args> >>>
   Usage: <<<hadoop fs -dus <args> >>>

   Displays a summary of file lengths.

   <<Note:>> This command is deprecated. Instead use <<<hdfs dfs -du -s>>>.
   <<Note:>> This command is deprecated. Instead use <<<hadoop fs -du -s>>>.

expunge

* expunge

   Usage: <<<hdfs dfs -expunge>>>
   Usage: <<<hadoop fs -expunge>>>

   Empty the Trash. Refer to the {{{../hadoop-hdfs/HdfsDesign.html}
   HDFS Architecture Guide}} for more information on the Trash feature.

find

* find

   Usage: <<<hdfs dfs -find <path> ... <expression> ... >>>
   Usage: <<<hadoop fs -find <path> ... <expression> ... >>>

   Finds all files that match the specified expression and applies selected
   actions to them. If no <path> is specified then defaults to the current

@ -269,15 +309,15 @@ find

   Example:

   <<<hdfs dfs -find / -name test -print>>>
   <<<hadoop fs -find / -name test -print>>>

   Exit Code:

   Returns 0 on success and -1 on error.

get

* get

   Usage: <<<hdfs dfs -get [-ignorecrc] [-crc] <src> <localdst> >>>
   Usage: <<<hadoop fs -get [-ignorecrc] [-crc] <src> <localdst> >>>

   Copy files to the local file system. Files that fail the CRC check may be
   copied with the -ignorecrc option. Files and CRCs may be copied using the

@ -285,17 +325,17 @@ get

   Example:

     * <<<hdfs dfs -get /user/hadoop/file localfile>>>
     * <<<hadoop fs -get /user/hadoop/file localfile>>>

     * <<<hdfs dfs -get hdfs://nn.example.com/user/hadoop/file localfile>>>
     * <<<hadoop fs -get hdfs://nn.example.com/user/hadoop/file localfile>>>

   Exit Code:

   Returns 0 on success and -1 on error.

getfacl

* getfacl

   Usage: <<<hdfs dfs -getfacl [-R] <path> >>>
   Usage: <<<hadoop fs -getfacl [-R] <path> >>>

   Displays the Access Control Lists (ACLs) of files and directories. If a
   directory has a default ACL, then getfacl also displays the default ACL.

@ -308,17 +348,17 @@ getfacl

   Examples:

     * <<<hdfs dfs -getfacl /file>>>
     * <<<hadoop fs -getfacl /file>>>

     * <<<hdfs dfs -getfacl -R /dir>>>
     * <<<hadoop fs -getfacl -R /dir>>>

   Exit Code:

   Returns 0 on success and non-zero on error.

getfattr

* getfattr

   Usage: <<<hdfs dfs -getfattr [-R] {-n name | -d} [-e en] <path> >>>
   Usage: <<<hadoop fs -getfattr [-R] {-n name | -d} [-e en] <path> >>>

   Displays the extended attribute names and values (if any) for a file or
   directory.

@ -337,26 +377,32 @@ getfattr

   Examples:

     * <<<hdfs dfs -getfattr -d /file>>>
     * <<<hadoop fs -getfattr -d /file>>>

     * <<<hdfs dfs -getfattr -R -n user.myAttr /dir>>>
     * <<<hadoop fs -getfattr -R -n user.myAttr /dir>>>

   Exit Code:

   Returns 0 on success and non-zero on error.
getmerge

* getmerge

   Usage: <<<hdfs dfs -getmerge <src> <localdst> [addnl]>>>
   Usage: <<<hadoop fs -getmerge <src> <localdst> [addnl]>>>

   Takes a source directory and a destination file as input and concatenates
   files in src into the destination local file. Optionally addnl can be set to
   enable adding a newline character at the
   end of each file.
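   For illustration only, a possible invocation that merges the files under a
   directory into one local file (the paths are hypothetical):

     * <<<hadoop fs -getmerge /user/hadoop/output ./output.txt>>>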
ls

   Usage: <<<hdfs dfs -ls [-R] <args> >>>

* help

   Usage: <<<hadoop fs -help>>>

   Return usage output.
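   For illustration only, help can also be requested for a single command
   (the command name below is just an example):

     * <<<hadoop fs -help ls>>>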
* ls

   Usage: <<<hadoop fs -ls [-R] <args> >>>

   Options:

@ -377,23 +423,23 @@ permissions userid groupid modification_date modification_time dirname

   Example:

     * <<<hdfs dfs -ls /user/hadoop/file1>>>
     * <<<hadoop fs -ls /user/hadoop/file1>>>

   Exit Code:

   Returns 0 on success and -1 on error.

lsr

* lsr

   Usage: <<<hdfs dfs -lsr <args> >>>
   Usage: <<<hadoop fs -lsr <args> >>>

   Recursive version of ls.

   <<Note:>> This command is deprecated. Instead use <<<hdfs dfs -ls -R>>>
   <<Note:>> This command is deprecated. Instead use <<<hadoop fs -ls -R>>>

mkdir

* mkdir

   Usage: <<<hdfs dfs -mkdir [-p] <paths> >>>
   Usage: <<<hadoop fs -mkdir [-p] <paths> >>>

   Takes path uri's as argument and creates directories.

@ -403,30 +449,30 @@ mkdir

   Example:

     * <<<hdfs dfs -mkdir /user/hadoop/dir1 /user/hadoop/dir2>>>
     * <<<hadoop fs -mkdir /user/hadoop/dir1 /user/hadoop/dir2>>>

     * <<<hdfs dfs -mkdir hdfs://nn1.example.com/user/hadoop/dir hdfs://nn2.example.com/user/hadoop/dir>>>
     * <<<hadoop fs -mkdir hdfs://nn1.example.com/user/hadoop/dir hdfs://nn2.example.com/user/hadoop/dir>>>

   Exit Code:

   Returns 0 on success and -1 on error.

moveFromLocal

* moveFromLocal

   Usage: <<<hdfs dfs -moveFromLocal <localsrc> <dst> >>>
   Usage: <<<hadoop fs -moveFromLocal <localsrc> <dst> >>>

   Similar to put command, except that the source localsrc is deleted after
   it's copied.
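   For illustration only, a possible invocation (the paths are hypothetical):

     * <<<hadoop fs -moveFromLocal localfile.txt /user/hadoop/localfile.txt>>>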
moveToLocal

* moveToLocal

   Usage: <<<hdfs dfs -moveToLocal [-crc] <src> <dst> >>>
   Usage: <<<hadoop fs -moveToLocal [-crc] <src> <dst> >>>

   Displays a "Not implemented yet" message.

mv

* mv

   Usage: <<<hdfs dfs -mv URI [URI ...] <dest> >>>
   Usage: <<<hadoop fs -mv URI [URI ...] <dest> >>>

   Moves files from source to destination. This command allows multiple sources
   as well in which case the destination needs to be a directory. Moving files

@ -434,38 +480,42 @@ mv

   Example:

     * <<<hdfs dfs -mv /user/hadoop/file1 /user/hadoop/file2>>>
     * <<<hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2>>>

     * <<<hdfs dfs -mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2 hdfs://nn.example.com/file3 hdfs://nn.example.com/dir1>>>
     * <<<hadoop fs -mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2 hdfs://nn.example.com/file3 hdfs://nn.example.com/dir1>>>

   Exit Code:

   Returns 0 on success and -1 on error.

put

* put

   Usage: <<<hdfs dfs -put <localsrc> ... <dst> >>>
   Usage: <<<hadoop fs -put <localsrc> ... <dst> >>>

   Copy single src, or multiple srcs from local file system to the destination
   file system. Also reads input from stdin and writes to destination file
   system.

     * <<<hdfs dfs -put localfile /user/hadoop/hadoopfile>>>
     * <<<hadoop fs -put localfile /user/hadoop/hadoopfile>>>

     * <<<hdfs dfs -put localfile1 localfile2 /user/hadoop/hadoopdir>>>
     * <<<hadoop fs -put localfile1 localfile2 /user/hadoop/hadoopdir>>>

     * <<<hdfs dfs -put localfile hdfs://nn.example.com/hadoop/hadoopfile>>>
     * <<<hadoop fs -put localfile hdfs://nn.example.com/hadoop/hadoopfile>>>

     * <<<hdfs dfs -put - hdfs://nn.example.com/hadoop/hadoopfile>>>
     * <<<hadoop fs -put - hdfs://nn.example.com/hadoop/hadoopfile>>>
       Reads the input from stdin.

   Exit Code:

   Returns 0 on success and -1 on error.

rm

   Usage: <<<hdfs dfs -rm [-f] [-r|-R] [-skipTrash] URI [URI ...]>>>

* renameSnapshot

   See {{{../hadoop-hdfs/HdfsSnapshots.html}HDFS Snapshots Guide}}.

* rm

   Usage: <<<hadoop fs -rm [-f] [-r|-R] [-skipTrash] URI [URI ...]>>>

   Delete files specified as args.

@ -484,23 +534,37 @@ rm

   Example:

     * <<<hdfs dfs -rm hdfs://nn.example.com/file /user/hadoop/emptydir>>>
     * <<<hadoop fs -rm hdfs://nn.example.com/file /user/hadoop/emptydir>>>

   Exit Code:

   Returns 0 on success and -1 on error.

rmr

   Usage: <<<hdfs dfs -rmr [-skipTrash] URI [URI ...]>>>

* rmdir

   Usage: <<<hadoop fs -rmdir [--ignore-fail-on-non-empty] URI [URI ...]>>>

   Delete a directory.

   Options:

     * --ignore-fail-on-non-empty: When using wildcards, do not fail if a directory still contains files.

   Example:

     * <<<hadoop fs -rmdir /user/hadoop/emptydir>>>

* rmr

   Usage: <<<hadoop fs -rmr [-skipTrash] URI [URI ...]>>>

   Recursive version of delete.

   <<Note:>> This command is deprecated. Instead use <<<hdfs dfs -rm -r>>>
   <<Note:>> This command is deprecated. Instead use <<<hadoop fs -rm -r>>>

setfacl

* setfacl

   Usage: <<<hdfs dfs -setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>] >>>
   Usage: <<<hadoop fs -setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>] >>>

   Sets Access Control Lists (ACLs) of files and directories.

@ -528,27 +592,27 @@ setfacl

   Examples:

     * <<<hdfs dfs -setfacl -m user:hadoop:rw- /file>>>
     * <<<hadoop fs -setfacl -m user:hadoop:rw- /file>>>

     * <<<hdfs dfs -setfacl -x user:hadoop /file>>>
     * <<<hadoop fs -setfacl -x user:hadoop /file>>>

     * <<<hdfs dfs -setfacl -b /file>>>
     * <<<hadoop fs -setfacl -b /file>>>

     * <<<hdfs dfs -setfacl -k /dir>>>
     * <<<hadoop fs -setfacl -k /dir>>>

     * <<<hdfs dfs -setfacl --set user::rw-,user:hadoop:rw-,group::r--,other::r-- /file>>>
     * <<<hadoop fs -setfacl --set user::rw-,user:hadoop:rw-,group::r--,other::r-- /file>>>

     * <<<hdfs dfs -setfacl -R -m user:hadoop:r-x /dir>>>
     * <<<hadoop fs -setfacl -R -m user:hadoop:r-x /dir>>>

     * <<<hdfs dfs -setfacl -m default:user:hadoop:r-x /dir>>>
     * <<<hadoop fs -setfacl -m default:user:hadoop:r-x /dir>>>

   Exit Code:

   Returns 0 on success and non-zero on error.

setfattr

* setfattr

   Usage: <<<hdfs dfs -setfattr {-n name [-v value] | -x name} <path> >>>
   Usage: <<<hadoop fs -setfattr {-n name [-v value] | -x name} <path> >>>

   Sets an extended attribute name and value for a file or directory.

@ -566,19 +630,19 @@ setfattr

   Examples:

     * <<<hdfs dfs -setfattr -n user.myAttr -v myValue /file>>>
     * <<<hadoop fs -setfattr -n user.myAttr -v myValue /file>>>

     * <<<hdfs dfs -setfattr -n user.noValue /file>>>
     * <<<hadoop fs -setfattr -n user.noValue /file>>>

     * <<<hdfs dfs -setfattr -x user.myAttr /file>>>
     * <<<hadoop fs -setfattr -x user.myAttr /file>>>

   Exit Code:

   Returns 0 on success and non-zero on error.

setrep

* setrep

   Usage: <<<hdfs dfs -setrep [-R] [-w] <numReplicas> <path> >>>
   Usage: <<<hadoop fs -setrep [-R] [-w] <numReplicas> <path> >>>

   Changes the replication factor of a file. If <path> is a directory then
   the command recursively changes the replication factor of all files under

@ -593,28 +657,28 @@ setrep

   Example:

     * <<<hdfs dfs -setrep -w 3 /user/hadoop/dir1>>>
     * <<<hadoop fs -setrep -w 3 /user/hadoop/dir1>>>

   Exit Code:

   Returns 0 on success and -1 on error.

stat

* stat

   Usage: <<<hdfs dfs -stat URI [URI ...]>>>
   Usage: <<<hadoop fs -stat URI [URI ...]>>>

   Returns the stat information on the path.

   Example:

     * <<<hdfs dfs -stat path>>>
     * <<<hadoop fs -stat path>>>

   Exit Code:
   Returns 0 on success and -1 on error.

tail

* tail

   Usage: <<<hdfs dfs -tail [-f] URI>>>
   Usage: <<<hadoop fs -tail [-f] URI>>>

   Displays last kilobyte of the file to stdout.

@ -624,43 +688,54 @@ tail

   Example:

     * <<<hdfs dfs -tail pathname>>>
     * <<<hadoop fs -tail pathname>>>

   Exit Code:
   Returns 0 on success and -1 on error.

test

* test

   Usage: <<<hdfs dfs -test -[ezd] URI>>>
   Usage: <<<hadoop fs -test -[defsz] URI>>>

   Options:

     * The -e option will check to see if the file exists, returning 0 if true.
     * -d: if the path is a directory, return 0.

     * The -z option will check to see if the file is zero length, returning 0 if true.
     * -e: if the path exists, return 0.

     * The -d option will check to see if the path is directory, returning 0 if true.
     * -f: if the path is a file, return 0.

     * -s: if the path is not empty, return 0.

     * -z: if the file is zero length, return 0.

   Example:

     * <<<hdfs dfs -test -e filename>>>
     * <<<hadoop fs -test -e filename>>>
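   Because test reports only through its exit code, it is usually combined
   with shell logic. A minimal sketch (the path is hypothetical):

----
  $ hadoop fs -test -d /user/hadoop/dir1 && echo "dir1 is a directory"
----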
text

* text

   Usage: <<<hdfs dfs -text <src> >>>
   Usage: <<<hadoop fs -text <src> >>>

   Takes a source file and outputs the file in text format. The allowed formats
   are zip and TextRecordInputStream.
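   For illustration only, a possible invocation (the path is hypothetical):

     * <<<hadoop fs -text /user/hadoop/data.seq>>>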
touchz

* touchz

   Usage: <<<hdfs dfs -touchz URI [URI ...]>>>
   Usage: <<<hadoop fs -touchz URI [URI ...]>>>

   Create a file of zero length.

   Example:

     * <<<hdfs dfs -touchz pathname>>>
     * <<<hadoop fs -touchz pathname>>>

   Exit Code:
   Returns 0 on success and -1 on error.

* usage

   Usage: <<<hadoop fs -usage command>>>

   Return the help for an individual command.
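   For illustration only, a possible invocation (any shell subcommand name may
   be substituted):

     * <<<hadoop fs -usage ls>>>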
@ -11,12 +11,12 @@

   ~~ limitations under the License. See accompanying LICENSE file.

  ---
  Hadoop MapReduce Next Generation ${project.version} - Setting up a Single Node Cluster.
  Hadoop ${project.version} - Setting up a Single Node Cluster.
  ---
  ---
  ${maven.build.timestamp}

Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.
Hadoop - Setting up a Single Node Cluster.

%{toc|section=1|fromDepth=0}

@ -46,7 +46,9 @@ Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

   HadoopJavaVersions}}.

   [[2]] ssh must be installed and sshd must be running to use the Hadoop
         scripts that manage remote Hadoop daemons.
         scripts that manage remote Hadoop daemons if the optional start
         and stop scripts are to be used. Additionally, it is recommended that
         pdsh also be installed for better ssh resource management.

** Installing Software

@ -57,7 +59,7 @@ Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

----
  $ sudo apt-get install ssh
  $ sudo apt-get install rsync
  $ sudo apt-get install pdsh
----

* Download

@ -75,9 +77,6 @@ Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

----
  # set to the root of your Java installation
  export JAVA_HOME=/usr/java/latest

  # Assuming your installation directory is /usr/local/hadoop
  export HADOOP_PREFIX=/usr/local/hadoop
----

   Try the following command:

@ -158,6 +157,7 @@ Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

----
  $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
  $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
  $ chmod 0700 ~/.ssh/authorized_keys
----
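   After the keys are in place, a quick check that passwordless ssh works (a
   sketch; output varies by system):

----
  $ ssh localhost
----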
** Execution

@ -228,7 +228,7 @@ Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

  $ sbin/stop-dfs.sh
----

** YARN on Single Node
** YARN on a Single Node

   You can run a MapReduce job on YARN in a pseudo-distributed mode by setting
   a few parameters and running ResourceManager daemon and NodeManager daemon

@ -239,7 +239,7 @@ Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

   [[1]] Configure parameters as follows:

   etc/hadoop/mapred-site.xml:
   <<<etc/hadoop/mapred-site.xml>>>:

+---+
<configuration>

@ -250,7 +250,7 @@ Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

</configuration>
+---+

   etc/hadoop/yarn-site.xml:
   <<<etc/hadoop/yarn-site.xml>>>:

+---+
<configuration>