HDFS-6712. Document HDFS Multihoming Settings. (Contributed by Arpit Agarwal)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1612695 13f79535-47bb-0310-9956-ffa450edef68
This commit is contained in:
parent
60b1e835e0
commit
fee737eced
|
@ -602,6 +602,8 @@ Release 2.5.0 - UNRELEASED
|
|||
HDFS-6680. BlockPlacementPolicyDefault does not choose favored nodes
|
||||
correctly. (szetszwo)
|
||||
|
||||
HDFS-6712. Document HDFS Multihoming Settings. (Arpit Agarwal)
|
||||
|
||||
OPTIMIZATIONS
|
||||
|
||||
HDFS-6214. Webhdfs has poor throughput for files >2GB (daryn)
|
||||
|
|
|
@ -0,0 +1,145 @@
|
|||
~~ Licensed under the Apache License, Version 2.0 (the "License");
|
||||
~~ you may not use this file except in compliance with the License.
|
||||
~~ You may obtain a copy of the License at
|
||||
~~
|
||||
~~ http://www.apache.org/licenses/LICENSE-2.0
|
||||
~~
|
||||
~~ Unless required by applicable law or agreed to in writing, software
|
||||
~~ distributed under the License is distributed on an "AS IS" BASIS,
|
||||
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
~~ See the License for the specific language governing permissions and
|
||||
~~ limitations under the License. See accompanying LICENSE file.
|
||||
|
||||
---
|
||||
Hadoop Distributed File System-${project.version} - Support for Multi-Homed Networks
|
||||
---
|
||||
---
|
||||
${maven.build.timestamp}
|
||||
|
||||
HDFS Support for Multihomed Networks
|
||||
|
||||
This document is targeted to cluster administrators deploying <<<HDFS>>> in
|
||||
multihomed networks. Similar support for <<<YARN>>>/<<<MapReduce>>> is
|
||||
a work in progress and will be documented when available.
|
||||
|
||||
%{toc|section=1|fromDepth=0}
|
||||
|
||||
* Multihoming Background
|
||||
|
||||
In multihomed networks each cluster node has more than one
|
||||
network interface. There could be multiple reasons for doing so.
|
||||
|
||||
[[1]] <<Security>>: Security requirements may dictate that intra-cluster
|
||||
traffic be confined to a different network than the network used to
|
||||
transfer data in and out of the cluster.
|
||||
|
||||
[[2]] <<Performance>>: Intra-cluster traffic may use one or more high bandwidth
|
||||
interconnects like Fibre Channel, InfiniBand or 10GbE.
|
||||
|
||||
[[3]] <<Failover/Redundancy>>: The nodes may have multiple network adapters
|
||||
connected to a single network to handle network adapter failure.
|
||||
|
||||
|
||||
Note that NIC Bonding (also known as NIC Teaming or Link
|
||||
Aggregation) is a related but separate topic. The following settings
|
||||
are usually not applicable to a NIC bonding configuration, which handles
|
||||
multiplexing and failover transparently while presenting a single 'logical
|
||||
network' to applications.
|
||||
|
||||
* Fixing Hadoop Issues In Multihomed Environments
|
||||
|
||||
** Ensuring HDFS Daemons Bind All Interfaces
|
||||
|
||||
By default <<<HDFS>>> endpoints are specified as either hostnames or IP addresses.
|
||||
In either case <<<HDFS>>> daemons will bind to a single IP address, making
|
||||
the daemons unreachable from other networks.
|
||||
|
||||
The solution is to use separate settings for server endpoints that force binding
|
||||
to the wildcard IP address <<<INADDR_ANY>>>, i.e. <<<0.0.0.0>>>. Do NOT supply a port
|
||||
number with any of these settings.
|
||||
|
||||
----
|
||||
<property>
|
||||
<name>dfs.namenode.rpc-bind-host</name>
|
||||
<value>0.0.0.0</value>
|
||||
<description>
|
||||
The actual address the RPC server will bind to. If this optional address is
|
||||
set, it overrides only the hostname portion of dfs.namenode.rpc-address.
|
||||
It can also be specified per name node or name service for HA/Federation.
|
||||
This is useful for making the name node listen on all interfaces by
|
||||
setting it to 0.0.0.0.
|
||||
</description>
|
||||
</property>
|
||||
|
||||
<property>
|
||||
<name>dfs.namenode.servicerpc-bind-host</name>
|
||||
<value>0.0.0.0</value>
|
||||
<description>
|
||||
The actual address the service RPC server will bind to. If this optional address is
|
||||
set, it overrides only the hostname portion of dfs.namenode.servicerpc-address.
|
||||
It can also be specified per name node or name service for HA/Federation.
|
||||
This is useful for making the name node listen on all interfaces by
|
||||
setting it to 0.0.0.0.
|
||||
</description>
|
||||
</property>
|
||||
|
||||
<property>
|
||||
<name>dfs.namenode.http-bind-host</name>
|
||||
<value>0.0.0.0</value>
|
||||
<description>
|
||||
The actual address the HTTP server will bind to. If this optional address
|
||||
is set, it overrides only the hostname portion of dfs.namenode.http-address.
|
||||
It can also be specified per name node or name service for HA/Federation.
|
||||
This is useful for making the name node HTTP server listen on all
|
||||
interfaces by setting it to 0.0.0.0.
|
||||
</description>
|
||||
</property>
|
||||
|
||||
<property>
|
||||
<name>dfs.namenode.https-bind-host</name>
|
||||
<value>0.0.0.0</value>
|
||||
<description>
|
||||
The actual address the HTTPS server will bind to. If this optional address
|
||||
is set, it overrides only the hostname portion of dfs.namenode.https-address.
|
||||
It can also be specified per name node or name service for HA/Federation.
|
||||
This is useful for making the name node HTTPS server listen on all
|
||||
interfaces by setting it to 0.0.0.0.
|
||||
</description>
|
||||
</property>
|
||||
----
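
The property descriptions above state that each bind-host setting overrides only the hostname portion of the matching address setting, so the port still comes from the address setting. The following Java sketch illustrates that rule under stated assumptions: the class <<<BindHostOverride>>>, the method <<<effectiveBindAddress>>> and the address <<<nn1.example.com:8020>>> are hypothetical names chosen for illustration and are not taken from the HDFS code base.

```java
// Illustrative sketch only; not the HDFS implementation.
final class BindHostOverride {

    /**
     * Returns the host:port a server would listen on.
     *
     * @param configuredAddress value of an address setting, as host:port
     * @param bindHost          value of the matching bind-host setting, or null if unset
     */
    static String effectiveBindAddress(String configuredAddress, String bindHost) {
        int colon = configuredAddress.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected host:port but got " + configuredAddress);
        }
        String host = configuredAddress.substring(0, colon);
        String port = configuredAddress.substring(colon + 1);

        // The bind host replaces only the hostname; the port is always kept.
        if (bindHost == null || bindHost.isEmpty()) {
            return host + ":" + port;
        }
        return bindHost + ":" + port;
    }

    public static void main(String[] args) {
        // Bind host set: listen on the wildcard address, same port.
        System.out.println(effectiveBindAddress("nn1.example.com:8020", "0.0.0.0"));
        // Bind host unset: listen on the configured host only.
        System.out.println(effectiveBindAddress("nn1.example.com:8020", null));
    }
}
```

This also shows why the bind-host values must not carry a port number: the port is read from the address setting alone.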
|
||||
|
||||
** Clients use Hostnames when connecting to DataNodes
|
||||
|
||||
By default <<<HDFS>>> clients connect to DataNodes using the IP address
|
||||
provided by the NameNode. Depending on the network configuration this
|
||||
IP address may be unreachable by the clients. The fix is to let clients perform
|
||||
their own DNS resolution of the DataNode hostname. The following setting
|
||||
enables this behavior.
|
||||
|
||||
----
|
||||
<property>
|
||||
<name>dfs.client.use.datanode.hostname</name>
|
||||
<value>true</value>
|
||||
<description>Whether clients should use datanode hostnames when
|
||||
connecting to datanodes.
|
||||
</description>
|
||||
</property>
|
||||
----
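
The choice this setting controls can be pictured as a switch between two connection targets: the IP address reported by the NameNode, or the DataNode hostname that the client resolves itself. The Java sketch below models only that choice. The names <<<DataNodeTarget>>> and <<<transferTarget>>>, the address <<<10.0.0.5>>>, the host <<<dn1.example.com>>> and the port <<<50010>>> are assumptions made for illustration, not the HDFS client code.

```java
import java.net.InetSocketAddress;

// Illustrative sketch only; not the HDFS client implementation.
final class DataNodeTarget {

    /**
     * Picks the address a client would use to reach a DataNode.
     *
     * @param ipAddr      IP address for the DataNode as reported by the NameNode
     * @param hostName    hostname of the DataNode
     * @param port        data transfer port of the DataNode
     * @param useHostname stands in for dfs.client.use.datanode.hostname
     */
    static InetSocketAddress transferTarget(String ipAddr, String hostName,
                                            int port, boolean useHostname) {
        String host = useHostname ? hostName : ipAddr;
        // Created unresolved so that this sketch performs no DNS lookup.
        // A real client would resolve the name with its own resolver
        // before opening the connection.
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        InetSocketAddress byIp = transferTarget("10.0.0.5", "dn1.example.com", 50010, false);
        InetSocketAddress byName = transferTarget("10.0.0.5", "dn1.example.com", 50010, true);
        System.out.println(byIp.getHostString() + ":" + byIp.getPort());
        System.out.println(byName.getHostString() + ":" + byName.getPort());
    }
}
```

With the hostname form, each client network can map the same DataNode name to an address that is reachable from that network.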
|
||||
|
||||
** DataNodes use Hostnames when connecting to other DataNodes
|
||||
|
||||
Rarely, the NameNode-resolved IP address for a DataNode may be unreachable
|
||||
from other DataNodes. The fix is to force DataNodes to perform their own
|
||||
DNS resolution for inter-DataNode connections. The following setting enables
|
||||
this behavior.
|
||||
|
||||
----
|
||||
<property>
|
||||
<name>dfs.datanode.use.datanode.hostname</name>
|
||||
<value>true</value>
|
||||
<description>Whether datanodes should use datanode hostnames when
|
||||
connecting to other datanodes for data transfer.
|
||||
</description>
|
||||
</property>
|
||||
----
|
||||
|
|
@ -89,6 +89,7 @@
|
|||
<item name="HDFS NFS Gateway" href="hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html"/>
|
||||
<item name="HDFS Rolling Upgrade" href="hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html"/>
|
||||
<item name="Extended Attributes" href="hadoop-project-dist/hadoop-hdfs/ExtendedAttributes.html"/>
|
||||
<item name="HDFS Support for Multihoming" href="hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html"/>
|
||||
</menu>
|
||||
|
||||
<menu name="MapReduce" inherit="top">
|
||||
|
|