HDFS-7581. HDFS documentation needs updating post-shell rewrite (aw)

This commit is contained in:
parent 533e551eb4
commit ce0117636a
@@ -265,6 +265,8 @@ Trunk (Unreleased)

    HDFS-7407. Minor typo in privileged pid/out/log names (aw)

    HDFS-7581. HDFS documentation needs updating post-shell rewrite (aw)

Release 2.7.0 - UNRELEASED

  INCOMPATIBLE CHANGES
@@ -34,40 +34,40 @@ HDFS Federation

    * Consists of directories, files and blocks.

    * It supports all the namespace related file system operations such as
      create, delete, modify and list files and directories.

  * <<Block Storage Service>> has two parts:

    * Block Management (which is done in the Namenode)

      * Provides datanode cluster membership by handling registrations and
        periodic heartbeats.

      * Processes block reports and maintains the location of blocks.

      * Supports block related operations such as create, delete, modify and
        get block location.

      * Manages replica placement, replication of a block for under
        replicated blocks, and deletion of blocks that are over replicated.

    * Storage - is provided by datanodes by storing blocks on the local file
      system and allowing read/write access.

  The prior HDFS architecture allows only a single namespace for the
  entire cluster. A single Namenode manages this namespace. HDFS
  Federation addresses this limitation of the prior architecture by adding
  support for multiple Namenodes/namespaces to the HDFS file system.

* {Multiple Namenodes/Namespaces}

  In order to scale the name service horizontally, federation uses multiple
  independent Namenodes/namespaces. The Namenodes are federated; that is, the
  Namenodes are independent and do not require coordination with each other.
  The datanodes are used as common storage for blocks by all the Namenodes.
  Each datanode registers with all the Namenodes in the cluster. Datanodes
  send periodic heartbeats and block reports and handle commands from the
  Namenodes.

  Users may use {{{./ViewFs.html}ViewFs}} to create personalized namespace views,
@@ -78,48 +78,48 @@ HDFS Federation

  <<Block Pool>>

  A Block Pool is a set of blocks that belong to a single namespace.
  Datanodes store blocks for all the block pools in the cluster.
  Each block pool is managed independently of the other block pools. This allows a namespace
  to generate Block IDs for new blocks without the need for coordination
  with the other namespaces. The failure of a Namenode does not prevent
  the datanode from serving other Namenodes in the cluster.

  A Namespace and its block pool together are called a Namespace Volume.
  It is a self-contained unit of management. When a Namenode/namespace
  is deleted, the corresponding block pool at the datanodes is deleted.
  Each namespace volume is upgraded as a unit during a cluster upgrade.

  <<ClusterID>>

  A new identifier, <<ClusterID>>, is added to identify all the nodes in
  the cluster. When a Namenode is formatted, this identifier is either provided
  or auto generated. This ID should be used when formatting the other
  Namenodes into the cluster.

** Key Benefits

  * Namespace Scalability - HDFS cluster storage scales horizontally but
    the namespace does not. Large deployments, or deployments using lots
    of small files, benefit from scaling the namespace by adding more
    Namenodes to the cluster.

  * Performance - File system operation throughput is limited by a single
    Namenode in the prior architecture. Adding more Namenodes to the cluster
    scales the file system read/write operation throughput.

  * Isolation - A single Namenode offers no isolation in a multi user
    environment. An experimental application can overload the Namenode
    and slow down production critical applications. With multiple Namenodes,
    different categories of applications and users can be isolated to
    different namespaces.

* {Federation Configuration}

  Federation configuration is <<backward compatible>> and allows existing
  single Namenode configurations to work without any change. The new
  configuration is designed such that all the nodes in the cluster have the
  same configuration, without the need to deploy different configurations
  based on the type of the node in the cluster.

  A new abstraction called <<<NameServiceID>>> is added with
@@ -132,12 +132,12 @@ HDFS Federation

** Configuration:

  <<Step 1>>: Add the following parameters to your configuration:
  <<<dfs.nameservices>>>: Configure it with a list of comma separated
  NameServiceIDs. This will be used by the Datanodes to determine all the
  Namenodes in the cluster.
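  As an illustrative sketch (the NameServiceIDs <<<ns1>>> and <<<ns2>>> are
  hypothetical names, not required values), Step 1 amounts to a single
  property in <<<hdfs-site.xml>>>:

----
<property>
  <name>dfs.nameservices</name>
  <value>ns1,ns2</value>
</property>
----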
  <<Step 2>>: For each Namenode and Secondary Namenode/BackupNode/Checkpointer,
  add the following configuration, suffixed with the corresponding
  <<<NameServiceID>>>, into the common configuration file.

*---------------------+--------------------------------------------+
@@ -159,7 +159,7 @@ HDFS Federation

| BackupNode           | <<<dfs.namenode.backup.address>>>          |
|                      | <<<dfs.secondary.namenode.keytab.file>>>   |
*---------------------+--------------------------------------------+

  Here is an example configuration with two namenodes:

----
@@ -200,31 +200,31 @@ HDFS Federation

** Formatting Namenodes

  <<Step 1>>: Format a namenode using the following command:

----
[hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -format [-clusterId <cluster_id>]
----

  Choose a unique cluster_id which will not conflict with other clusters in
  your environment. If it is not provided, then a unique ClusterID is
  auto generated.

  <<Step 2>>: Format additional namenodes using the following command:

----
[hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -format -clusterId <cluster_id>
----

  Note that the cluster_id in step 2 must be the same as that of the
  cluster_id in step 1. If they are different, the additional Namenodes
  will not be part of the federated cluster.
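  As a concrete sketch (the cluster ID value is a hypothetical example), the
  two steps might look like:

----
# on the first namenode host
[hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -format -clusterId CID-example-cluster
# on each additional namenode host, reusing the same cluster_id
[hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -format -clusterId CID-example-cluster
----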
** Upgrading from an older release and configuring federation

  Older releases supported a single Namenode.
  Upgrade the cluster to a newer release in order to enable federation.
  During upgrade you can provide a ClusterID as follows:

----
[hdfs]$ $HADOOP_PREFIX/bin/hdfs start namenode --config $HADOOP_CONF_DIR -upgrade -clusterId <cluster_ID>
----

  If ClusterID is not provided, it is auto generated.
@@ -234,8 +234,8 @@ HDFS Federation

  * Add the configuration parameter <<<dfs.nameservices>>> to the configuration.

  * Update the configuration with the NameServiceID suffix. Configuration
    key names have changed post release 0.20. You must use the new configuration
    parameter names for federation.

  * Add the new Namenode related config to the configuration files.
@@ -244,11 +244,11 @@ HDFS Federation

  * Start the new Namenode and Secondary/Backup.

  * Refresh the datanodes to pick up the newly added Namenode by running
    the following command:

----
[hdfs]$ $HADOOP_PREFIX/bin/hdfs dfsadmin -refreshNamenodes <datanode_host_name>:<datanode_rpc_port>
----

  * The above command must be run against all the datanodes in the cluster.
@@ -260,37 +260,37 @@ HDFS Federation

  To start the cluster run the following command:

----
[hdfs]$ $HADOOP_PREFIX/sbin/start-dfs.sh
----

  To stop the cluster run the following command:

----
[hdfs]$ $HADOOP_PREFIX/sbin/stop-dfs.sh
----

  These commands can be run from any node where the HDFS configuration is
  available. The command uses the configuration to determine the Namenodes
  in the cluster and starts the Namenode process on those nodes. The
  datanodes are started on the nodes specified in the <<<slaves>>> file. The
  script can be used as a reference for building your own scripts for
  starting and stopping the cluster.

** Balancer

  The Balancer has been changed to work with multiple Namenodes in the cluster to
  balance the cluster. The Balancer can be run using the following command
  (a concrete invocation is sketched after the policy list):

----
[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon start balancer [-policy <policy>]
----

  Policy could be:

  * <<<datanode>>> - this is the <default> policy. This balances the storage at
    the datanode level. This is similar to the balancing policy from prior releases.

  * <<<blockpool>>> - this balances the storage at the block pool level.
    Balancing at the block pool level balances storage at the datanode level also.
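  For example, a balancer run at block pool granularity might look like the
  following (flags as documented above):

----
[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon start balancer -policy blockpool
----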
  Note that the Balancer only balances the data and does not balance the namespace.
@@ -298,43 +298,43 @@ HDFS Federation

** Decommissioning

  Decommissioning is similar to prior releases. The nodes that need to be
  decommissioned are added to the exclude file at all of the Namenodes. Each
  Namenode decommissions its Block Pool. When all the Namenodes finish
  decommissioning a datanode, the datanode is considered decommissioned.

  <<Step 1>>: To distribute an exclude file to all the Namenodes, use the
  following command:

----
[hdfs]$ $HADOOP_PREFIX/sbin/distribute-exclude.sh <exclude_file>
----

  <<Step 2>>: Refresh all the Namenodes to pick up the new exclude file:

----
[hdfs]$ $HADOOP_PREFIX/sbin/refresh-namenodes.sh
----

  The above command uses the HDFS configuration to determine the Namenodes
  configured in the cluster and refreshes all of them to pick up
  the new exclude file.
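  As an illustrative sketch (the hostnames are hypothetical), the exclude file
  is a plain text file listing one datanode per line:

----
datanode-07.example.com
datanode-12.example.com
----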
** Cluster Web Console

  Similar to the Namenode status web page, a Cluster Web Console is added in
  federation to monitor the federated cluster at
  <<<http://<any_nn_host:port>/dfsclusterhealth.jsp>>>.
  Any Namenode in the cluster can be used to access this web page.

  The web page provides the following information:

  * A cluster summary that shows the number of files, number of blocks, and
    total configured storage capacity, as well as available and used storage,
    for the entire cluster.

  * A list of Namenodes and a summary that includes the number of files,
    blocks, missing blocks, and number of live and dead data nodes for each
    Namenode. It also provides a link to conveniently access each Namenode's web UI.

  * The decommissioning status of datanodes.
@@ -18,7 +18,7 @@

HDFS Commands Guide

%{toc|section=1|fromDepth=2|toDepth=3}

* Overview
@@ -26,39 +26,37 @@ HDFS Commands Guide

  hdfs script without any arguments prints the description for all
  commands.

  Usage: <<<hdfs [SHELL_OPTIONS] COMMAND [GENERIC_OPTIONS] [COMMAND_OPTIONS]>>>

  Hadoop has an option parsing framework that employs parsing generic options as
  well as running classes.

*-------------------------+---------------------------------------------------+
|| COMMAND_OPTIONS        || Description                                      |
*-------------------------+---------------------------------------------------+
| SHELL_OPTIONS           | The common set of shell options. These are documented on the {{{../../hadoop-project-dist/hadoop-common/CommandsManual.html#Shell Options}Commands Manual}} page.
*-------------------------+---------------------------------------------------+
| GENERIC_OPTIONS         | The common set of options supported by multiple commands. See the Hadoop {{{../../hadoop-project-dist/hadoop-common/CommandsManual.html#Generic Options}Commands Manual}} for more information.
*-------------------------+---------------------------------------------------+
| COMMAND COMMAND_OPTIONS | Various commands with their options are described
|                         | in the following sections. The commands have been
|                         | grouped into {{User Commands}} and
|                         | {{Administration Commands}}.
*-------------------------+---------------------------------------------------+
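  For illustration, a single invocation combining a shell option, a command,
  and a command option (the configuration directory shown is a hypothetical
  example):

----
[hdfs]$ hdfs --config /etc/hadoop/conf --loglevel WARN dfs -ls /
----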
* {User Commands}

  Commands useful for users of a hadoop cluster.

** <<<classpath>>>

  Usage: <<<hdfs classpath>>>

  Prints the class path needed to get the Hadoop jar and the required libraries.

** <<<dfs>>>

  Usage: <<<hdfs dfs [COMMAND [COMMAND_OPTIONS]]>>>

  Run a filesystem command on the file system supported in Hadoop.
  The various COMMAND_OPTIONS can be found at
@@ -66,97 +64,307 @@ HDFS Commands Guide

** <<<fetchdt>>>

  Usage: <<<hdfs fetchdt [--webservice <namenode_http_addr>] <path> >>>

*------------------------------+---------------------------------------------+
|| COMMAND_OPTION              || Description                                |
*------------------------------+---------------------------------------------+
| --webservice <https_address> | use http protocol instead of RPC            |
*------------------------------+---------------------------------------------+
| <fileName>                   | File name to store the token into.          |
*------------------------------+---------------------------------------------+

  Gets a Delegation Token from a NameNode.
  See {{{./HdfsUserGuide.html#fetchdt}fetchdt}} for more info.

** <<<fsck>>>

  Usage:

---
  hdfs fsck <path>
       [-list-corruptfileblocks |
       [-move | -delete | -openforwrite]
       [-files [-blocks [-locations | -racks]]]
       [-includeSnapshots] [-showprogress]
---

*------------------------+---------------------------------------------+
|| COMMAND_OPTION        || Description                                |
*------------------------+---------------------------------------------+
| <path>                 | Start checking from this path.              |
*------------------------+---------------------------------------------+
| -delete                | Delete corrupted files.                     |
*------------------------+---------------------------------------------+
| -files                 | Print out files being checked.              |
*------------------------+---------------------------------------------+
| -files -blocks         | Print out the block report.                 |
*------------------------+---------------------------------------------+
| -files -blocks -locations | Print out locations for every block.     |
*------------------------+---------------------------------------------+
| -files -blocks -racks  | Print out network topology for data-node locations.
*------------------------+---------------------------------------------+
| -includeSnapshots      | Include snapshot data if the given path
|                        | indicates a snapshottable directory or
|                        | there are snapshottable directories under it.
*------------------------+---------------------------------------------+
| -list-corruptfileblocks| Print out a list of missing blocks and the
|                        | files they belong to.
*------------------------+---------------------------------------------+
| -move                  | Move corrupted files to /lost+found.        |
*------------------------+---------------------------------------------+
| -openforwrite          | Print out files opened for write.           |
*------------------------+---------------------------------------------+
| -showprogress          | Print out dots for progress in output. Default is OFF
|                        | (no progress).
*------------------------+---------------------------------------------+

  Runs the HDFS filesystem checking utility.
  See {{{./HdfsUserGuide.html#fsck}fsck}} for more info.
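  A hedged example invocation, checking a hypothetical directory and printing
  its block locations:

----
[hdfs]$ hdfs fsck /user/alice -files -blocks -locations
----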
** <<<getconf>>>

  Usage:

---
  hdfs getconf -namenodes
  hdfs getconf -secondaryNameNodes
  hdfs getconf -backupNodes
  hdfs getconf -includeFile
  hdfs getconf -excludeFile
  hdfs getconf -nnRpcAddresses
  hdfs getconf -confKey [key]
---

*------------------------+---------------------------------------------+
|| COMMAND_OPTION        || Description                                |
*------------------------+---------------------------------------------+
| -namenodes             | gets the list of namenodes in the cluster.  |
*------------------------+---------------------------------------------+
| -secondaryNameNodes    | gets the list of secondary namenodes in the cluster.
*------------------------+---------------------------------------------+
| -backupNodes           | gets the list of backup nodes in the cluster.
*------------------------+---------------------------------------------+
| -includeFile           | gets the include file path that defines the datanodes that can join the cluster.
*------------------------+---------------------------------------------+
| -excludeFile           | gets the exclude file path that defines the datanodes that need to be decommissioned.
*------------------------+---------------------------------------------+
| -nnRpcAddresses        | gets the namenode rpc addresses             |
*------------------------+---------------------------------------------+
| -confKey [key]         | gets a specific key from the configuration  |
*------------------------+---------------------------------------------+

  Gets configuration information from the configuration directory, post-processing.
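  For example, reading a single key (the key is standard; the value printed
  will vary with your site configuration):

----
[hdfs]$ hdfs getconf -confKey dfs.replication
3
----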
** <<<groups>>>

  Usage: <<<hdfs groups [username ...]>>>

  Returns the group information given one or more usernames.

** <<<lsSnapshottableDir>>>

  Usage: <<<hdfs lsSnapshottableDir [-help]>>>

*------------------------+---------------------------------------------+
|| COMMAND_OPTION        || Description                                |
*------------------------+---------------------------------------------+
| -help                  | print help                                  |
*------------------------+---------------------------------------------+

  Get the list of snapshottable directories. When this is run as a super user,
  it returns all snapshottable directories. Otherwise it returns those directories
  that are owned by the current user.

** <<<jmxget>>>

  Usage: <<<hdfs jmxget [-localVM ConnectorURL | -port port | -server mbeanserver | -service service]>>>

*------------------------+---------------------------------------------+
|| COMMAND_OPTION        || Description                                |
*------------------------+---------------------------------------------+
| -help                  | print help                                  |
*------------------------+---------------------------------------------+
| -localVM ConnectorURL  | connect to the VM on the same machine       |
*------------------------+---------------------------------------------+
| -port <mbean server port> | specify mbean server port; if missing,
|                        | it will try to connect to the MBean Server in
|                        | the same VM
*------------------------+---------------------------------------------+
| -service               | specify the jmx service, either DataNode or NameNode, the default
*------------------------+---------------------------------------------+

  Dump JMX information from a service.

** <<<oev>>>

  Usage: <<<hdfs oev [OPTIONS] -i INPUT_FILE -o OUTPUT_FILE>>>

*** Required command line arguments:

*------------------------+---------------------------------------------+
|| COMMAND_OPTION        || Description                                |
*------------------------+---------------------------------------------+
| -i,--inputFile <arg>   | edits file to process; an xml (case
|                        | insensitive) extension means XML format,
|                        | any other filename means binary format
*------------------------+---------------------------------------------+
| -o,--outputFile <arg>  | Name of the output file. If the specified
|                        | file exists, it will be overwritten;
|                        | the format of the file is determined
|                        | by the -p option
*------------------------+---------------------------------------------+

*** Optional command line arguments:

*------------------------+---------------------------------------------+
|| COMMAND_OPTION        || Description                                |
*------------------------+---------------------------------------------+
| -f,--fix-txids         | Renumber the transaction IDs in the input,
|                        | so that there are no gaps or invalid transaction IDs.
*------------------------+---------------------------------------------+
| -h,--help              | Display usage information and exit          |
*------------------------+---------------------------------------------+
| -r,--recover           | When reading binary edit logs, use recovery
|                        | mode. This will give you the chance to skip
|                        | corrupt parts of the edit log.
*------------------------+---------------------------------------------+
| -p,--processor <arg>   | Select which type of processor to apply
|                        | against the input file. Currently supported
|                        | processors are: binary (native binary format
|                        | that Hadoop uses), xml (default, XML
|                        | format), stats (prints statistics about
|                        | the edits file)
*------------------------+---------------------------------------------+
| -v,--verbose           | More verbose output; prints the input and
|                        | output filenames, and for processors that write
|                        | to a file, also outputs to screen. On large
|                        | files this will dramatically increase
|                        | processing time (default is false).
*------------------------+---------------------------------------------+

  Hadoop offline edits viewer.
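  For example, a sketch converting a binary edits segment to XML (the segment
  file name is a hypothetical example):

----
[hdfs]$ hdfs oev -i edits_0000000000000000001-0000000000000000042 -o edits.xml
----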
** <<<oiv>>>

  Usage: <<<hdfs oiv [OPTIONS] -i INPUT_FILE>>>

*** Required command line arguments:

*------------------------+---------------------------------------------+
|| COMMAND_OPTION        || Description                                |
*------------------------+---------------------------------------------+
| -i,--inputFile <arg>   | fsimage file to process; an xml (case
|                        | insensitive) extension means XML format,
|                        | any other filename means binary format
*------------------------+---------------------------------------------+

*** Optional command line arguments:

*------------------------+---------------------------------------------+
|| COMMAND_OPTION        || Description                                |
*------------------------+---------------------------------------------+
| -h,--help              | Display usage information and exit          |
*------------------------+---------------------------------------------+
| -o,--outputFile <arg>  | Name of the output file. If the specified
|                        | file exists, it will be overwritten;
|                        | the format of the file is determined
|                        | by the -p option
*------------------------+---------------------------------------------+
| -p,--processor <arg>   | Select which type of processor to apply
|                        | against the image file. Currently supported
|                        | processors are: binary (native binary format
|                        | that Hadoop uses), xml (default, XML
|                        | format), stats (prints statistics about
|                        | the image file)
*------------------------+---------------------------------------------+

  Hadoop Offline Image Viewer for newer image files.
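  A hedged example, dumping a checkpointed image with the xml processor (the
  image file name is a hypothetical example):

----
[hdfs]$ hdfs oiv -p xml -i fsimage_0000000000000000042 -o fsimage.xml
----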
** <<<oiv_legacy>>>

  Usage: <<<hdfs oiv_legacy [OPTIONS] -i INPUT_FILE -o OUTPUT_FILE>>>

*------------------------+---------------------------------------------+
|| COMMAND_OPTION        || Description                                |
*------------------------+---------------------------------------------+
| -h,--help              | Display usage information and exit          |
*------------------------+---------------------------------------------+
| -i,--inputFile <arg>   | fsimage file to process; an xml (case
|                        | insensitive) extension means XML format,
|                        | any other filename means binary format
*------------------------+---------------------------------------------+
| -o,--outputFile <arg>  | Name of the output file. If the specified
|                        | file exists, it will be overwritten;
|                        | the format of the file is determined
|                        | by the -p option
*------------------------+---------------------------------------------+

  Hadoop offline image viewer for older versions of Hadoop.

** <<<snapshotDiff>>>

  Usage: <<<hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot> >>>

  Determine the difference between HDFS snapshots. See the
  {{{./HdfsSnapshots.html#Get_Snapshots_Difference_Report}HDFS Snapshot Documentation}} for more information.

** <<<version>>>

  Usage: <<<hdfs version>>>

  Prints the version.

* {Administration Commands}

  Commands useful for administrators of a hadoop cluster.
** <<<balancer>>>

  Usage: <<<hdfs balancer [-threshold <threshold>] [-policy <policy>]>>>

*------------------------+----------------------------------------------------+
|| COMMAND_OPTION        | Description                                        |
*------------------------+----------------------------------------------------+
| -policy <policy>       | <<<datanode>>> (default): Cluster is balanced if
|                        | each datanode is balanced. \
|                        | <<<blockpool>>>: Cluster is balanced if each block
|                        | pool in each datanode is balanced.
*------------------------+----------------------------------------------------+
| -threshold <threshold> | Percentage of disk capacity. This overwrites the
|                        | default threshold.
*------------------------+----------------------------------------------------+

  Runs a cluster balancing utility. An administrator can simply press Ctrl-C
  to stop the rebalancing process. See
  {{{./HdfsUserGuide.html#Balancer}Balancer}} for more details.

  Note that the <<<blockpool>>> policy is stricter than the <<<datanode>>>
  policy.

** <<<cacheadmin>>>

  Usage: <<<hdfs cacheadmin -addDirective -path <path> -pool <pool-name> [-force] [-replication <replication>] [-ttl <time-to-live>]>>>

  See the {{{./CentralizedCacheManagement.html#cacheadmin_command-line_interface}HDFS Cache Administration Documentation}} for more information.

** <<<crypto>>>

  Usage:

---
  hdfs crypto -createZone -keyName <keyName> -path <path>
  hdfs crypto -help <command-name>
  hdfs crypto -listZones
---

  See the {{{./TransparentEncryption.html#crypto_command-line_interface}HDFS Transparent Encryption Documentation}} for more information.
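  A hedged sketch of creating an encryption zone, assuming a key named
  <<<mykey>>> already exists in the configured KMS and that <<</secure>>> is
  an empty directory:

----
[hdfs]$ hdfs crypto -createZone -keyName mykey -path /secure
----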
** <<<datanode>>>

  Usage: <<<hdfs datanode [-regular | -rollback | -rollingupgrade rollback]>>>
@@ -172,12 +380,14 @@ HDFS Commands Guide

| -rollingupgrade rollback | Rollback a rolling upgrade operation.
*-----------------+-----------------------------------------------------------+

  Runs a HDFS datanode.

** <<<dfsadmin>>>

  Usage:

------------------------------------------
  hdfs dfsadmin [GENERIC_OPTIONS]
          [-report [-live] [-dead] [-decommissioning]]
          [-safemode enter | leave | get | wait]
          [-saveNamespace]
@@ -210,7 +420,7 @@ HDFS Commands Guide

          [-getDatanodeInfo <datanode_host:ipc_port>]
          [-triggerBlockReport [-incremental] <datanode_host:ipc_port>]
          [-help [cmd]]
------------------------------------------

*-----------------+-----------------------------------------------------------+
|| COMMAND_OPTION || Description
@@ -323,11 +533,11 @@ HDFS Commands Guide

*-----------------+-----------------------------------------------------------+
| -allowSnapshot \<snapshotDir\> | Allowing snapshots of a directory to be
|                 | created. If the operation completes successfully, the
|                 | directory becomes snapshottable. See the {{{./HdfsSnapshots.html}HDFS Snapshot Documentation}} for more information.
*-----------------+-----------------------------------------------------------+
| -disallowSnapshot \<snapshotDir\> | Disallowing snapshots of a directory to
|                 | be created. All snapshots of the directory must be deleted
|                 | before disallowing snapshots. See the {{{./HdfsSnapshots.html}HDFS Snapshot Documentation}} for more information.
*-----------------+-----------------------------------------------------------+
| -fetchImage \<local directory\> | Downloads the most recent fsimage from the
|                 | NameNode and saves it in the specified local directory.
@@ -351,30 +561,68 @@ HDFS Commands Guide

|                 | is specified.
*-----------------+-----------------------------------------------------------+

  Runs a HDFS dfsadmin client.

** <<<haadmin>>>

  Usage:

---
  hdfs haadmin -checkHealth <serviceId>
  hdfs haadmin -failover [--forcefence] [--forceactive] <serviceId> <serviceId>
  hdfs haadmin -getServiceState <serviceId>
  hdfs haadmin -help <command>
  hdfs haadmin -transitionToActive <serviceId> [--forceactive]
  hdfs haadmin -transitionToStandby <serviceId>
---

*--------------------+--------------------------------------------------------+
|| COMMAND_OPTION    || Description
*--------------------+--------------------------------------------------------+
| -checkHealth       | check the health of the given NameNode
*--------------------+--------------------------------------------------------+
| -failover          | initiate a failover between two NameNodes
*--------------------+--------------------------------------------------------+
| -getServiceState   | determine whether the given NameNode is Active or Standby
*--------------------+--------------------------------------------------------+
| -transitionToActive | transition the state of the given NameNode to Active (Warning: No fencing is done)
*--------------------+--------------------------------------------------------+
| -transitionToStandby | transition the state of the given NameNode to Standby (Warning: No fencing is done)
*--------------------+--------------------------------------------------------+

  See {{{./HDFSHighAvailabilityWithNFS.html#Administrative_commands}HDFS HA with NFS}} or
  {{{./HDFSHighAvailabilityWithQJM.html#Administrative_commands}HDFS HA with QJM}} for more
  information on this command.
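  For example, a manual failover between two NameNodes whose (hypothetical)
  NameNode IDs are <<<nn1>>> and <<<nn2>>>:

----
[hdfs]$ hdfs haadmin -failover nn1 nn2
----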
** <<<journalnode>>>

  Usage: <<<hdfs journalnode>>>

  This command starts a journalnode for use with {{{./HDFSHighAvailabilityWithQJM.html#Administrative_commands}HDFS HA with QJM}}.

** <<<mover>>>

  Usage: <<<hdfs mover [-p <files/dirs> | -f <local file name>]>>>

*--------------------+--------------------------------------------------------+
|| COMMAND_OPTION    || Description
*--------------------+--------------------------------------------------------+
| -f \<local file\>  | Specify a local file containing a list of HDFS files/dirs to migrate.
*--------------------+--------------------------------------------------------+
| -p \<files/dirs\>  | Specify a space separated list of HDFS files/dirs to migrate.
*--------------------+--------------------------------------------------------+

  Runs the data migration utility.
  See {{{./ArchivalStorage.html#Mover_-_A_New_Data_Migration_Tool}Mover}} for more details.

  Note that when both the -p and -f options are omitted, the default path is the root directory.
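  A hedged example, migrating two hypothetical directories so that their block
  placement matches their storage policies:

----
[hdfs]$ hdfs mover -p /archive /colder-data
----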
** <<<namenode>>>

  Usage:

------------------------------------------
  hdfs namenode [-backup] |
          [-checkpoint] |
          [-format [-clusterid cid ] [-force] [-nonInteractive] ] |
          [-upgrade [-clusterid cid] [-renameReserved<k-v pairs>] ] |
@@ -387,7 +635,7 @@ HDFS Commands Guide

          [-bootstrapStandby] |
          [-recover [-force] ] |
          [-metadataVersion ]
------------------------------------------

*--------------------+--------------------------------------------------------+
|| COMMAND_OPTION    || Description
@@ -443,11 +691,23 @@ HDFS Commands Guide

|                    | metadata versions of the software and the image.
*--------------------+--------------------------------------------------------+

  Runs the namenode. More info about the upgrade, rollback and finalize is at
  {{{./HdfsUserGuide.html#Upgrade_and_Rollback}Upgrade Rollback}}.

** <<<nfs3>>>

  Usage: <<<hdfs nfs3>>>

  This command starts the NFS3 gateway for use with the {{{./HdfsNfsGateway.html#Start_and_stop_NFS_gateway_service}HDFS NFS3 Service}}.

** <<<portmap>>>

  Usage: <<<hdfs portmap>>>

  This command starts the RPC portmap for use with the {{{./HdfsNfsGateway.html#Start_and_stop_NFS_gateway_service}HDFS NFS3 Service}}.

** <<<secondarynamenode>>>

  Usage: <<<hdfs secondarynamenode [-checkpoint [force]] | [-format] |
         [-geteditsize]>>>
@@ -465,6 +725,33 @@ HDFS Commands Guide

|                      | the NameNode.
*----------------------+------------------------------------------------------+

  Runs the HDFS secondary namenode.
  See {{{./HdfsUserGuide.html#Secondary_NameNode}Secondary Namenode}}
  for more info.

** <<<storagepolicies>>>

  Usage: <<<hdfs storagepolicies>>>

  Lists out all storage policies. See the {{{./ArchivalStorage.html}HDFS Storage Policy Documentation}} for more information.

** <<<zkfc>>>

  Usage: <<<hdfs zkfc [-formatZK [-force] [-nonInteractive]]>>>

*----------------------+------------------------------------------------------+
|| COMMAND_OPTION      || Description
*----------------------+------------------------------------------------------+
| -formatZK            | Format the Zookeeper instance
*----------------------+------------------------------------------------------+
| -h                   | Display help
*----------------------+------------------------------------------------------+

  This command starts a Zookeeper Failover Controller process for use with {{{./HDFSHighAvailabilityWithQJM.html#Administrative_commands}HDFS HA with QJM}}.

* Debug Commands

  Useful commands to help administrators debug HDFS issues, like validating
@@ -472,30 +759,25 @@ HDFS Commands Guide

** <<<verify>>>

  Usage: <<<hdfs debug verify [-meta <metadata-file>] [-block <block-file>]>>>

*------------------------+----------------------------------------------------+
|| COMMAND_OPTION        | Description
*------------------------+----------------------------------------------------+
| -block <block-file>    | Optional parameter to specify the absolute path for
|                        | the block file on the local file system of the data
|                        | node.
*------------------------+----------------------------------------------------+
| -meta <metadata-file>  | Absolute path for the metadata file on the local file
|                        | system of the data node.
*------------------------+----------------------------------------------------+

  Verify HDFS metadata and block files. If a block file is specified, we
  will verify that the checksums in the metadata file match the block
  file.

** <<<recoverLease>>>

  Usage: <<<hdfs debug recoverLease [-path <path>] [-retries <num-retries>]>>>

*-------------------------------+--------------------------------------------+
@@ -507,3 +789,6 @@ HDFS Commands Guide

|                               | recoverLease. The default number of retries
|                               | is 1.
*-------------------------------+--------------------------------------------+

  Recover the lease on the specified path. The path must reside on an
  HDFS filesystem. The default number of retries is 1.
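  A hedged example, retrying lease recovery on a hypothetical file up to three
  times:

----
[hdfs]$ hdfs debug recoverLease -path /user/alice/events.log -retries 3
----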
@@ -25,7 +25,7 @@ HDFS High Availability

  This guide provides an overview of the HDFS High Availability (HA) feature and
  how to configure and manage an HA HDFS cluster, using NFS for the shared
  storage required by the NameNodes.

  This document assumes that the reader has a general understanding of the
  components and node types in an HDFS cluster. Please refer to the
  HDFS Architecture guide for details.
@@ -44,7 +44,7 @@ HDFS High Availability

  an HDFS cluster. Each cluster had a single NameNode, and if that machine or
  process became unavailable, the cluster as a whole would be unavailable
  until the NameNode was either restarted or brought up on a separate machine.

  This impacted the total availability of the HDFS cluster in two major ways:

  * In the case of an unplanned event such as a machine crash, the cluster would
@@ -52,7 +52,7 @@ HDFS High Availability

  * Planned maintenance events such as software or hardware upgrades on the
    NameNode machine would result in windows of cluster downtime.

  The HDFS High Availability feature addresses the above problems by providing
  the option of running two redundant NameNodes in the same cluster in an
  Active/Passive configuration with a hot standby. This allows a fast failover to
@@ -67,7 +67,7 @@ HDFS High Availability

  for all client operations in the cluster, while the Standby is simply acting
  as a slave, maintaining enough state to provide a fast failover if
  necessary.

  In order for the Standby node to keep its state synchronized with the Active
  node, the current implementation requires that the two nodes both have access
  to a directory on a shared storage device (e.g. an NFS mount from a NAS). This
@@ -80,12 +80,12 @@ HDFS High Availability

  a failover, the Standby will ensure that it has read all of the edits from the
  shared storage before promoting itself to the Active state. This ensures that
  the namespace state is fully synchronized before a failover occurs.

  In order to provide a fast failover, it is also necessary that the Standby node
  have up-to-date information regarding the location of blocks in the cluster.
  In order to achieve this, the DataNodes are configured with the location of
  both NameNodes, and send block location information and heartbeats to both.

  It is vital for the correct operation of an HA cluster that only one of the
  NameNodes be Active at a time. Otherwise, the namespace state would quickly
  diverge between the two, risking data loss or other incorrect results. In
@@ -116,7 +116,7 @@ HDFS High Availability

  network, and power). Because of this, it is recommended that the shared storage
  server be a high-quality dedicated NAS appliance rather than a simple Linux
  server.

  Note that, in an HA cluster, the Standby NameNode also performs checkpoints of
  the namespace state, and thus it is not necessary to run a Secondary NameNode,
  CheckpointNode, or BackupNode in an HA cluster. In fact, to do so would be an
@@ -133,7 +133,7 @@ HDFS High Availability

  The new configuration is designed such that all the nodes in the cluster may
  have the same configuration without the need for deploying different
  configuration files to different machines based on the type of the node.

  Like HDFS Federation, HA clusters reuse the <<<nameservice ID>>> to identify a
  single HDFS instance that may in fact consist of multiple HA NameNodes. In
  addition, a new abstraction called <<<NameNode ID>>> is added with HA. Each
@@ -330,7 +330,7 @@ HDFS High Availability

  <<dfs_namenode_rpc-address>> will contain the RPC address of the target node, even
  though the configuration may specify that variable as
  <<dfs.namenode.rpc-address.ns1.nn1>>.

  Additionally, the following variables referring to the target node to be fenced
  are also available:
@@ -345,7 +345,7 @@ HDFS High Availability

*-----------------------:-----------------------------------+
| $target_namenodeid    | the namenode ID of the NN to be fenced |
*-----------------------:-----------------------------------+

  These environment variables may also be used as substitutions in the shell
  command itself. For example:
@@ -355,7 +355,7 @@ HDFS High Availability

    <value>shell(/path/to/my/script.sh --nameservice=$target_nameserviceid $target_host:$target_port)</value>
  </property>
---

  If the shell command returns an exit
  code of 0, the fencing is determined to be successful. If it returns any other
  exit code, the fencing was not successful and the next fencing method in the
@@ -386,7 +386,7 @@ HDFS High Availability

  * If you are setting up a fresh HDFS cluster, you should first run the format
    command (<hdfs namenode -format>) on one of the NameNodes.

  * If you have already formatted the NameNode, or are converting a
    non-HA-enabled cluster to be HA-enabled, you should now copy over the
    contents of your NameNode metadata directories to the other, unformatted
@@ -394,7 +394,7 @@ HDFS High Availability

    unformatted NameNode. Running this command will also ensure that the shared
    edits directory (as configured by <<dfs.namenode.shared.edits.dir>>) contains
    sufficient edits transactions to be able to start both NameNodes.

  * If you are converting a non-HA NameNode to be HA, you should run the
    command "<hdfs namenode -initializeSharedEdits>", which will initialize the shared
    edits directory with the edits data from the local NameNode edits directories.
@@ -484,7 +484,7 @@ Usage: DFSHAAdmin [-ns <nameserviceId>]

  of coordination data, notifying clients of changes in that data, and
  monitoring clients for failures. The implementation of automatic HDFS failover
  relies on ZooKeeper for the following things:

  * <<Failure detection>> - each of the NameNode machines in the cluster
    maintains a persistent session in ZooKeeper. If the machine crashes, the
    ZooKeeper session will expire, notifying the other NameNode that a failover
@@ -585,7 +585,7 @@ Usage: DFSHAAdmin [-ns <nameserviceId>]

  from one of the NameNode hosts.

----
[hdfs]$ $HADOOP_PREFIX/bin/hdfs zkfc -formatZK
----

  This will create a znode in ZooKeeper inside of which the automatic failover
@@ -605,7 +605,7 @@ $ hdfs zkfc -formatZK

  can start the daemon by running:

----
[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon start zkfc
----

** Securing access to ZooKeeper
@@ -646,7 +646,7 @@ digest:hdfs-zkfcs:mypassword

  a command like the following:

----
[hdfs]$ java -cp $ZK_HOME/lib/*:$ZK_HOME/zookeeper-3.4.2.jar org.apache.zookeeper.server.auth.DigestAuthenticationProvider hdfs-zkfcs:mypassword
output: hdfs-zkfcs:mypassword->hdfs-zkfcs:P/OQvnYyU/nF/mGYvB/xurX8dYs=
----
@ -726,24 +726,24 @@ digest:hdfs-zkfcs:vlUvLnd8MlacsE80rDuu6ONESbM=:rwcda
|
|||
using the same <<<hdfs haadmin>>> command. It will perform a coordinated
|
||||
failover.
* BookKeeper as a Shared storage (EXPERIMENTAL)

One option for shared storage for the NameNode is BookKeeper.
BookKeeper achieves high availability and strong durability guarantees by replicating
edit log entries across multiple storage nodes. The edit log can be striped across
the storage nodes for high performance. Fencing is supported in the protocol, i.e.,
BookKeeper will not allow two writers to write the single edit log.

The metadata for BookKeeper is stored in ZooKeeper.
In the current HA architecture, a ZooKeeper cluster is required for ZKFC. The same cluster can be
used for BookKeeper metadata.

For more details on building a BookKeeper cluster, please refer to the
{{{http://zookeeper.apache.org/bookkeeper/docs/trunk/bookkeeperConfig.html}BookKeeper documentation}}

The BookKeeperJournalManager is an implementation of the HDFS JournalManager interface, which allows custom write-ahead logging implementations to be plugged into the HDFS NameNode.

**<<BookKeeper Journal Manager>>

To use BookKeeperJournalManager, add the following to hdfs-site.xml.
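As a hedged sketch, the wiring for this looks roughly as follows (the property names come from the bkjournal contrib module; the ZooKeeper hosts zk1-zk3 and the /hdfsjournal path are illustrative assumptions):

----
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>bookkeeper://zk1:2181;zk2:2181;zk3:2181/hdfsjournal</value>
</property>

<property>
  <name>dfs.namenode.edits.journal-plugin.bookkeeper</name>
  <value>org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager</value>
</property>
----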
@@ -772,12 +772,12 @@ digest:hdfs-zkfcs:vlUvLnd8MlacsE80rDuu6ONESbM=:rwcda

classpath. We explain how to generate a jar file with the journal manager and
its dependencies, and how to put it into the classpath below.

*** <<More configuration options>>

* <<dfs.namenode.bookkeeperjournal.output-buffer-size>> -
Number of bytes a bookkeeper journal stream will buffer before
forcing a flush. Default is 1024.

----
<property>
  <name>dfs.namenode.bookkeeperjournal.output-buffer-size</name>
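  <!-- The value line is elided by this hunk; a sketch using the stated default of 1024. -->
  <value>1024</value>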
@@ -785,7 +785,7 @@ digest:hdfs-zkfcs:vlUvLnd8MlacsE80rDuu6ONESbM=:rwcda

</property>
----

* <<dfs.namenode.bookkeeperjournal.ensemble-size>> -
Number of bookkeeper servers in edit log ensembles. This
is the number of bookkeeper servers which need to be available
for the edit log to be writable. Default is 3.
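The corresponding configuration block is elided by this hunk; a sketch, assuming the stated default:

----
<property>
  <name>dfs.namenode.bookkeeperjournal.ensemble-size</name>
  <value>3</value>
</property>
----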
@@ -797,7 +797,7 @@ digest:hdfs-zkfcs:vlUvLnd8MlacsE80rDuu6ONESbM=:rwcda

</property>
----

* <<dfs.namenode.bookkeeperjournal.quorum-size>> -
Number of bookkeeper servers in the write quorum. This is the
number of bookkeeper servers which must have acknowledged the
write of an entry before it is considered written. Default is 2.
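Again, a sketch of the elided configuration block, assuming the stated default:

----
<property>
  <name>dfs.namenode.bookkeeperjournal.quorum-size</name>
  <value>2</value>
</property>
----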
@@ -809,7 +809,7 @@ digest:hdfs-zkfcs:vlUvLnd8MlacsE80rDuu6ONESbM=:rwcda

</property>
----

* <<dfs.namenode.bookkeeperjournal.digestPw>> -
Password to use when creating edit log segments.

----
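<!-- The body of this block is elided; a sketch, where "myjournalpassword" is an
     illustrative placeholder, not a recommended value. -->
<property>
  <name>dfs.namenode.bookkeeperjournal.digestPw</name>
  <value>myjournalpassword</value>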
@@ -819,9 +819,9 @@ digest:hdfs-zkfcs:vlUvLnd8MlacsE80rDuu6ONESbM=:rwcda

</property>
----

* <<dfs.namenode.bookkeeperjournal.zk.session.timeout>> -
Session timeout for the ZooKeeper client from BookKeeper Journal Manager.
Hadoop recommends that this value should be less than the ZKFC
session timeout value. Default value is 3000.

----
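<!-- The body of this block is elided; a sketch using the stated default of 3000. -->
<property>
  <name>dfs.namenode.bookkeeperjournal.zk.session.timeout</name>
  <value>3000</value>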
@@ -838,22 +838,22 @@ digest:hdfs-zkfcs:vlUvLnd8MlacsE80rDuu6ONESbM=:rwcda

$ mvn clean package -Pdist

This will generate a jar with the BookKeeperJournalManager,
hadoop-hdfs/src/contrib/bkjournal/target/hadoop-hdfs-bkjournal-<VERSION>.jar

Note that the -Pdist part of the build command is important, as this will
copy the dependent bookkeeper-server jar under
hadoop-hdfs/src/contrib/bkjournal/target/lib.
*** <<Putting the BookKeeperJournalManager in the NameNode classpath>>

To run an HDFS namenode using BookKeeper as a backend, copy the bkjournal and
bookkeeper-server jar, mentioned above, into the lib directory of hdfs. In the
standard distribution of HDFS, this is at $HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/

cp hadoop-hdfs/src/contrib/bkjournal/target/hadoop-hdfs-bkjournal-<VERSION>.jar $HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/

*** <<Current limitations>>

1) Security in BookKeeper. BookKeeper does not support SASL nor SSL for
connections between the NameNode and BookKeeper storage nodes.
@@ -25,7 +25,7 @@ HDFS High Availability Using the Quorum Journal Manager

This guide provides an overview of the HDFS High Availability (HA) feature
and how to configure and manage an HA HDFS cluster, using the Quorum Journal
Manager (QJM) feature.

This document assumes that the reader has a general understanding of
the components and node types in an HDFS cluster. Please refer to the
HDFS Architecture guide for details.
@@ -44,7 +44,7 @@ HDFS High Availability Using the Quorum Journal Manager

an HDFS cluster. Each cluster had a single NameNode, and if that machine or
process became unavailable, the cluster as a whole would be unavailable
until the NameNode was either restarted or brought up on a separate machine.

This impacted the total availability of the HDFS cluster in two major ways:

* In the case of an unplanned event such as a machine crash, the cluster would
@@ -52,7 +52,7 @@ HDFS High Availability Using the Quorum Journal Manager

* Planned maintenance events such as software or hardware upgrades on the
NameNode machine would result in windows of cluster downtime.

The HDFS High Availability feature addresses the above problems by providing
the option of running two redundant NameNodes in the same cluster in an
Active/Passive configuration with a hot standby. This allows a fast failover to
@@ -67,7 +67,7 @@ HDFS High Availability Using the Quorum Journal Manager

for all client operations in the cluster, while the Standby is simply acting
as a slave, maintaining enough state to provide a fast failover if
necessary.

In order for the Standby node to keep its state synchronized with the Active
node, both nodes communicate with a group of separate daemons called
"JournalNodes" (JNs). When any namespace modification is performed by the
@@ -78,12 +78,12 @@ HDFS High Availability Using the Quorum Journal Manager

failover, the Standby will ensure that it has read all of the edits from the
JournalNodes before promoting itself to the Active state. This ensures that the
namespace state is fully synchronized before a failover occurs.

In order to provide a fast failover, it is also necessary that the Standby node
have up-to-date information regarding the location of blocks in the cluster.
In order to achieve this, the DataNodes are configured with the location of
both NameNodes, and send block location information and heartbeats to both.

It is vital for the correct operation of an HA cluster that only one of the
NameNodes be Active at a time. Otherwise, the namespace state would quickly
diverge between the two, risking data loss or other incorrect results. In
@@ -113,7 +113,7 @@ HDFS High Availability Using the Quorum Journal Manager

you should run an odd number of JNs (i.e. 3, 5, 7, etc.). Note that when
running with N JournalNodes, the system can tolerate at most (N - 1) / 2
failures and continue to function normally.

Note that, in an HA cluster, the Standby NameNode also performs checkpoints of
the namespace state, and thus it is not necessary to run a Secondary NameNode,
CheckpointNode, or BackupNode in an HA cluster. In fact, to do so would be an
@@ -130,7 +130,7 @@ HDFS High Availability Using the Quorum Journal Manager

The new configuration is designed such that all the nodes in the cluster may
have the same configuration without the need for deploying different
configuration files to different machines based on the type of the node.

Like HDFS Federation, HA clusters reuse the <<<nameservice ID>>> to identify a
single HDFS instance that may in fact consist of multiple HA NameNodes. In
addition, a new abstraction called <<<NameNode ID>>> is added with HA. Each
@@ -347,7 +347,7 @@ HDFS High Availability Using the Quorum Journal Manager

<<dfs_namenode_rpc-address>> will contain the RPC address of the target node, even
though the configuration may specify that variable as
<<dfs.namenode.rpc-address.ns1.nn1>>.

Additionally, the following variables referring to the target node to be fenced
are also available:
@@ -362,7 +362,7 @@ HDFS High Availability Using the Quorum Journal Manager

*-----------------------:-----------------------------------+
| $target_namenodeid | the namenode ID of the NN to be fenced |
*-----------------------:-----------------------------------+

These environment variables may also be used as substitutions in the shell
command itself. For example:
@@ -372,7 +372,7 @@ HDFS High Availability Using the Quorum Journal Manager

<value>shell(/path/to/my/script.sh --nameservice=$target_nameserviceid $target_host:$target_port)</value>
</property>
---

If the shell command returns an exit
code of 0, the fencing is determined to be successful. If it returns any other
exit code, the fencing was not successful and the next fencing method in the
@@ -424,7 +424,7 @@ HDFS High Availability Using the Quorum Journal Manager

* If you are setting up a fresh HDFS cluster, you should first run the format
command (<hdfs namenode -format>) on one of the NameNodes.
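A sketch of this step, assuming the post-rewrite shell conventions used elsewhere in this commit:

----
[hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -format
----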
* If you have already formatted the NameNode, or are converting a
non-HA-enabled cluster to be HA-enabled, you should now copy over the
contents of your NameNode metadata directories to the other, unformatted

@@ -432,7 +432,7 @@ HDFS High Availability Using the Quorum Journal Manager

unformatted NameNode. Running this command will also ensure that the
JournalNodes (as configured by <<dfs.namenode.shared.edits.dir>>) contain
sufficient edit transactions to be able to start both NameNodes.
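A hedged sketch of this copy-over step, assuming "this command" refers to the <hdfs namenode -bootstrapStandby> option (the surrounding text is elided by the hunk):

----
[hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -bootstrapStandby
----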
* If you are converting a non-HA NameNode to be HA, you should run the
command "<hdfs namenode -initializeSharedEdits>", which will initialize the
JournalNodes with the edits data from the local NameNode edits directories.
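Again, as a hedged sketch under the same shell conventions:

----
[hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -initializeSharedEdits
----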
@@ -522,7 +522,7 @@ Usage: DFSHAAdmin [-ns <nameserviceId>]

of coordination data, notifying clients of changes in that data, and
monitoring clients for failures. The implementation of automatic HDFS failover
relies on ZooKeeper for the following things:

* <<Failure detection>> - each of the NameNode machines in the cluster
maintains a persistent session in ZooKeeper. If the machine crashes, the
ZooKeeper session will expire, notifying the other NameNode that a failover
@@ -623,7 +623,7 @@ Usage: DFSHAAdmin [-ns <nameserviceId>]

from one of the NameNode hosts.

----
-$ hdfs zkfc -formatZK
+[hdfs]$ $HADOOP_PREFIX/bin/hdfs zkfc -formatZK
----

This will create a znode in ZooKeeper inside of which the automatic failover
@@ -643,7 +643,7 @@ $ hdfs zkfc -formatZK

can start the daemon by running:

----
-$ hadoop-daemon.sh start zkfc
+[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon start zkfc
----

** Securing access to ZooKeeper
@@ -684,7 +684,7 @@ digest:hdfs-zkfcs:mypassword

a command like the following:

----
-$ java -cp $ZK_HOME/lib/*:$ZK_HOME/zookeeper-3.4.2.jar org.apache.zookeeper.server.auth.DigestAuthenticationProvider hdfs-zkfcs:mypassword
+[hdfs]$ java -cp $ZK_HOME/lib/*:$ZK_HOME/zookeeper-3.4.2.jar org.apache.zookeeper.server.auth.DigestAuthenticationProvider hdfs-zkfcs:mypassword
output: hdfs-zkfcs:mypassword->hdfs-zkfcs:P/OQvnYyU/nF/mGYvB/xurX8dYs=
----
@@ -786,18 +786,18 @@ digest:hdfs-zkfcs:vlUvLnd8MlacsE80rDuu6ONESbM=:rwcda

operations, the operation will fail.

[[3]] Start one of the NNs with the <<<'-upgrade'>>> flag.
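A hedged sketch of this step (assuming the post-rewrite <hdfs --daemon> syntax used elsewhere in this commit passes startup options such as -upgrade through to the NameNode; verify against your release):

----
[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon start namenode -upgrade
----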
[[4]] On start, this NN will not enter the standby state as usual in an HA
setup. Rather, this NN will immediately enter the active state, perform an
upgrade of its local storage dirs, and also perform an upgrade of the shared
edit log.

[[5]] At this point the other NN in the HA pair will be out of sync with
the upgraded NN. In order to bring it back in sync and once again have a highly
available setup, you should re-bootstrap this NameNode by running the NN with
the <<<'-bootstrapStandby'>>> flag. It is an error to start this second NN with
the <<<'-upgrade'>>> flag.

Note that if at any time you want to restart the NameNodes before finalizing
or rolling back the upgrade, you should start the NNs as normal, i.e. without
any special startup flag.
@@ -36,15 +36,15 @@ HDFS NFS Gateway

HDFS file system.

* Users can stream data directly to HDFS through the mount point. File
append is supported but random write is not supported.

The NFS gateway machine needs the same things required to run an HDFS client, such as the Hadoop JAR files and a HADOOP_CONF directory.
The NFS gateway can be on the same host as DataNode, NameNode, or any HDFS client.

* {Configuration}

The NFS gateway uses a proxy user to proxy all the users accessing the NFS mounts.
In non-secure mode, the user running the gateway is the proxy user, while in secure mode the
user in the Kerberos keytab is the proxy user. Suppose the proxy user is 'nfsserver'
and users belonging to the groups 'users-group1'
@@ -57,10 +57,10 @@ HDFS NFS Gateway

<name>hadoop.proxyuser.nfsserver.groups</name>
<value>root,users-group1,users-group2</value>
<description>
The 'nfsserver' user is allowed to proxy all members of the 'users-group1' and
'users-group2' groups. Note that in most cases you will need to include the
group "root" because the user "root" (which usually belongs to the "root" group) will
generally be the user that initially executes the mount on the NFS client system.
Set this to '*' to allow the nfsserver user to proxy any group.
</description>
</property>
@@ -78,7 +78,7 @@ HDFS NFS Gateway

----

The above is the only configuration required for the NFS gateway in non-secure mode. For Kerberized
hadoop clusters, the following configurations need to be added to hdfs-site.xml for the gateway (NOTE: replace
string "nfsserver" with the proxy user name and ensure the user contained in the keytab is
also the same proxy user):
@@ -95,7 +95,7 @@ HDFS NFS Gateway

<value>nfsserver/_HOST@YOUR-REALM.COM</value>
</property>
----

The rest of the NFS gateway configurations are optional for both secure and non-secure mode.

The AIX NFS client has a {{{https://issues.apache.org/jira/browse/HDFS-6549}few known issues}}
@@ -119,11 +119,11 @@ HDFS NFS Gateway

It's strongly recommended for the users to update a few configuration properties based on their use
cases. All the following configuration properties can be added or updated in hdfs-site.xml.

* If the client mounts the export with access time update allowed, make sure the following
property is not disabled in the configuration file. Only the NameNode needs to be restarted after
this property is changed. On some Unix systems, the user can disable access time update
by mounting the export with "noatime". If the export is mounted with "noatime", the user
doesn't need to change the following property and thus there is no need to restart the NameNode.

----
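<!-- The body of this block is elided by the next hunk; a sketch, assuming the
     property name and default from stock hdfs-default.xml (access time updates
     are disabled by setting the precision to 0). -->
<property>
  <name>dfs.namenode.accesstime.precision</name>
  <value>3600000</value>
</property>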
@@ -149,11 +149,11 @@ HDFS NFS Gateway

this property is updated.

----
<property>
  <name>nfs.dump.dir</name>
  <value>/tmp/.hdfs-nfs</value>
</property>
----

* By default, the export can be mounted by any client. To better control the access,
users can update the following property. The value string contains machine name and
@@ -161,7 +161,7 @@ HDFS NFS Gateway

characters. The machine name format can be a single host, a Java regular expression, or an IPv4 address. The
access privilege uses rw or ro to specify read/write or read-only access of the machines to exports. If the access
privilege is not provided, the default is read-only. Entries are separated by ";".
For example: "192.168.0.0/22 rw ; host.*\.example\.com ; host1.test.org ro;". Only the NFS gateway needs to be restarted after
this property is updated.

----
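<!-- The body of this block is elided by the next hunk; a sketch, where the host
     list value is an illustrative assumption in the format described above. -->
<property>
  <name>nfs.exports.allowed.hosts</name>
  <value>192.168.0.0/22 rw ; host1.test.org ro</value>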
@@ -171,22 +171,22 @@ HDFS NFS Gateway

</property>
----

* JVM and log settings. You can export JVM settings (e.g., heap size and GC log) in
HADOOP_NFS3_OPTS. More NFS related settings can be found in hadoop-env.sh.
To get NFS debug trace, you can edit the log4j.properties file
to add the following. Note, debug trace, especially for ONCRPC, can be very verbose.

To change the logging level:

-----------------------------------------------
log4j.logger.org.apache.hadoop.hdfs.nfs=DEBUG
-----------------------------------------------

To get more details of ONCRPC requests:

-----------------------------------------------
log4j.logger.org.apache.hadoop.oncrpc=DEBUG
-----------------------------------------------

* {Start and stop NFS gateway service}
@@ -195,53 +195,39 @@ HDFS NFS Gateway

The NFS gateway process has both nfsd and mountd. It shares the HDFS root "/" as the
only export. It is recommended to use the portmap included in the NFS gateway package. Even
though the NFS gateway works with the portmap/rpcbind provided by most Linux distributions, the
package-included portmap is needed on some Linux systems such as RHEL 6.2 due to an
{{{https://bugzilla.redhat.com/show_bug.cgi?id=731542}rpcbind bug}}. More detailed discussions can
be found in {{{https://issues.apache.org/jira/browse/HDFS-4763}HDFS-4763}}.

-[[1]] Stop nfs/rpcbind/portmap services provided by the platform (commands can be different on various Unix platforms):
+[[1]] Stop nfsv3 and rpcbind/portmap services provided by the platform (commands can be different on various Unix platforms):

-------------------------
-service nfs stop
-service rpcbind stop
+[root]> service nfs stop
+[root]> service rpcbind stop
-------------------------

-[[2]] Start package included portmap (needs root privileges):
+[[2]] Start Hadoop's portmap (needs root privileges):

-------------------------
-hdfs portmap
-
-OR
-
-hadoop-daemon.sh start portmap
+[root]> $HADOOP_PREFIX/bin/hdfs --daemon start portmap
-------------------------

[[3]] Start mountd and nfsd.

No root privileges are required for this command. In non-secure mode, the NFS gateway
should be started by the proxy user mentioned at the beginning of this user guide.
While in secure mode, any user can start the NFS gateway
as long as the user has read access to the Kerberos keytab defined in "nfs.keytab.file".

-------------------------
-hdfs nfs3
-
-OR
-
-hadoop-daemon.sh start nfs3
+[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon start nfs3
-------------------------

-Note, if the hadoop-daemon.sh script starts the NFS gateway, its log can be found in the hadoop log folder.

[[4]] Stop NFS gateway services.

-------------------------
-hadoop-daemon.sh stop nfs3
-hadoop-daemon.sh stop portmap
+[hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon stop nfs3
+[root]> $HADOOP_PREFIX/bin/hdfs --daemon stop portmap
-------------------------

Optionally, you can forgo running the Hadoop-provided portmap daemon and
@@ -263,7 +249,7 @@ HDFS NFS Gateway

[[1]] Execute the following command to verify if all the services are up and running:

-------------------------
-rpcinfo -p $nfs_server_ip
+[root]> rpcinfo -p $nfs_server_ip
-------------------------

You should see output similar to the following:
@@ -293,11 +279,11 @@ HDFS NFS Gateway

[[2]] Verify if the HDFS namespace is exported and can be mounted.

-------------------------
-showmount -e $nfs_server_ip
+[root]> showmount -e $nfs_server_ip
-------------------------

You should see output similar to the following:

-------------------------
Exports list on $nfs_server_ip :
@@ -307,22 +293,22 @@ HDFS NFS Gateway

* {Mount the export “/”}

Currently NFS v3 only uses TCP as the transport protocol.
NLM is not supported so mount option "nolock" is needed. It's recommended to use
hard mount. This is because, even after the client sends all data to
the NFS gateway, it may take the NFS gateway some extra time to transfer data to HDFS
when writes are reordered by the NFS client kernel.

If soft mount has to be used, the user should give it a relatively
long timeout (at least no less than the default timeout on the host).

The users can mount the HDFS namespace as shown below:

-------------------------------------------------------------------
-mount -t nfs -o vers=3,proto=tcp,nolock,noacl $server:/ $mount_point
+[root]> mount -t nfs -o vers=3,proto=tcp,nolock,noacl $server:/ $mount_point
-------------------------------------------------------------------

Then the users can access HDFS as part of the local file system, except that
hard link and random write are not supported yet. To optimize the performance
of large file I/O, one can increase the NFS transfer size (rsize and wsize) during mount.
By default, NFS gateway supports 1MB as the maximum transfer size. For larger data
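A hedged sketch of such a mount (the 1MB rsize/wsize values follow from the stated default maximum transfer size; adjust them to your gateway's configuration):

-------------------------------------------------------------------
[root]> mount -t nfs -o vers=3,proto=tcp,nolock,noacl,rsize=1048576,wsize=1048576 $server:/ $mount_point
-------------------------------------------------------------------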
@@ -347,7 +333,7 @@ HDFS NFS Gateway

* {User authentication and mapping}

The NFS gateway in this release uses AUTH_UNIX style authentication. When the user on the NFS client
accesses the mount point, the NFS client passes the UID to the NFS gateway.
The NFS gateway does a lookup to find the user name from the UID, and then passes the
username to HDFS along with the HDFS requests.
For example, if the NFS client's current user is "admin", when the user accesses
@@ -358,7 +344,7 @@ HDFS NFS Gateway

The system administrator must ensure that the user on the NFS client host has the same
name and UID as that on the NFS gateway host. This is usually not a problem if
the same user management system (e.g., LDAP/NIS) is used to create and deploy users on
HDFS nodes and the NFS client node. In case the user account is created manually on different hosts, one might need to
modify the UID (e.g., do "usermod -u 123 myusername") on either the NFS client or the NFS gateway host
in order to make it the same on both sides. More technical details of RPC AUTH_UNIX can be found
in the {{{http://tools.ietf.org/html/rfc1057}RPC specification}}.