HDFS-5347. Merging change r1534377 from trunk
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-2@1534379 13f79535-47bb-0310-9956-ffa450edef68
This commit is contained in:
parent
7a45ef2300
commit
fcc912838d
|
@ -179,6 +179,8 @@ Release 2.2.1 - UNRELEASED
|
||||||
|
|
||||||
HDFS-5365. Fix libhdfs compile error on FreeBSD9. (Radim Kolar via cnauroth)
|
HDFS-5365. Fix libhdfs compile error on FreeBSD9. (Radim Kolar via cnauroth)
|
||||||
|
|
||||||
|
HDFS-5347. Add HDFS NFS user guide. (brandonli)
|
||||||
|
|
||||||
Release 2.2.0 - 2013-10-13
|
Release 2.2.0 - 2013-10-13
|
||||||
|
|
||||||
INCOMPATIBLE CHANGES
|
INCOMPATIBLE CHANGES
|
||||||
|
|
|
@ -0,0 +1,258 @@
|
||||||
|
|
||||||
|
~~ Licensed under the Apache License, Version 2.0 (the "License");
|
||||||
|
~~ you may not use this file except in compliance with the License.
|
||||||
|
~~ You may obtain a copy of the License at
|
||||||
|
~~
|
||||||
|
~~ http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
~~
|
||||||
|
~~ Unless required by applicable law or agreed to in writing, software
|
||||||
|
~~ distributed under the License is distributed on an "AS IS" BASIS,
|
||||||
|
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
|
~~ See the License for the specific language governing permissions and
|
||||||
|
~~ limitations under the License. See accompanying LICENSE file.
|
||||||
|
|
||||||
|
---
|
||||||
|
Hadoop Distributed File System-${project.version} - HDFS NFS Gateway
|
||||||
|
---
|
||||||
|
---
|
||||||
|
${maven.build.timestamp}
|
||||||
|
|
||||||
|
HDFS NFS Gateway
|
||||||
|
|
||||||
|
\[ {{{./index.html}Go Back}} \]
|
||||||
|
|
||||||
|
%{toc|section=1|fromDepth=0}
|
||||||
|
|
||||||
|
* {Overview}
|
||||||
|
|
||||||
|
The NFS Gateway supports NFSv3 and allows HDFS to be mounted as part of the client's local file system.
|
||||||
|
Currently NFS Gateway supports and enables the following usage patterns:
|
||||||
|
|
||||||
|
* Users can browse the HDFS file system through their local file system
|
||||||
|
on NFSv3 client compatible operating systems.
|
||||||
|
|
||||||
|
* Users can download files from the the HDFS file system on to their
|
||||||
|
local file system.
|
||||||
|
|
||||||
|
* Users can upload files from their local file system directly to the
|
||||||
|
HDFS file system.
|
||||||
|
|
||||||
|
* Users can stream data directly to HDFS through the mount point. File
|
||||||
|
append is supported but random write is not supported.
|
||||||
|
|
||||||
|
The NFS gateway machine needs the same thing to run an HDFS client like Hadoop JAR files, HADOOP_CONF directory.
|
||||||
|
The NFS gateway can be on the same host as DataNode, NameNode, or any HDFS client.
|
||||||
|
|
||||||
|
|
||||||
|
* {Configuration}
|
||||||
|
|
||||||
|
NFS gateway can work with its default settings in most cases. However, it's
|
||||||
|
strongly recommended for the users to update a few configuration properties based on their use
|
||||||
|
cases. All the related configuration properties can be added or updated in hdfs-site.xml.
|
||||||
|
|
||||||
|
* If the client mounts the export with access time update allowed, make sure the following
|
||||||
|
property is not disabled in the configuration file. Only NameNode needs to restart after
|
||||||
|
this property is changed. On some Unix systems, the user can disable access time update
|
||||||
|
by mounting the export with "noatime".
|
||||||
|
|
||||||
|
----
|
||||||
|
<property>
|
||||||
|
<name>dfs.access.time.precision</name>
|
||||||
|
<value>3600000</value>
|
||||||
|
<description>The access time for HDFS file is precise upto this value.
|
||||||
|
The default value is 1 hour. Setting a value of 0 disables
|
||||||
|
access times for HDFS.
|
||||||
|
</description>
|
||||||
|
</property>
|
||||||
|
----
|
||||||
|
|
||||||
|
* Users are expected to update the file dump directory. NFS client often
|
||||||
|
reorders writes. Sequential writes can arrive at the NFS gateway at random
|
||||||
|
order. This directory is used to temporarily save out-of-order writes
|
||||||
|
before writing to HDFS. For each file, the out-of-order writes are dumped after
|
||||||
|
they are accumulated to exceed certain threshold (e.g., 1MB) in memory.
|
||||||
|
One needs to make sure the directory has enough
|
||||||
|
space. For example, if the application uploads 10 files with each having
|
||||||
|
100MB, it is recommended for this directory to have roughly 1GB space in case if a
|
||||||
|
worst-case write reorder happens to every file. Only NFS gateway needs to restart after
|
||||||
|
this property is updated.
|
||||||
|
|
||||||
|
----
|
||||||
|
<property>
|
||||||
|
<name>dfs.nfs3.dump.dir</name>
|
||||||
|
<value>/tmp/.hdfs-nfs</value>
|
||||||
|
</property>
|
||||||
|
----
|
||||||
|
|
||||||
|
* By default, the export can be mounted by any client. To better control the access,
|
||||||
|
users can update the following property. The value string contains machine name and
|
||||||
|
access privilege, separated by whitespace
|
||||||
|
characters. Machine name format can be single host, wildcards, and IPv4 networks.The
|
||||||
|
access privilege uses rw or ro to specify readwrite or readonly access of the machines to exports. If the access
|
||||||
|
privilege is not provided, the default is read-only. Entries are separated by ";".
|
||||||
|
For example: "192.168.0.0/22 rw ; host*.example.com ; host1.test.org ro;". Only NFS gateway needs to restart after
|
||||||
|
this property is updated.
|
||||||
|
|
||||||
|
----
|
||||||
|
<property>
|
||||||
|
<name>dfs.nfs.exports.allowed.hosts</name>
|
||||||
|
<value>* rw</value>
|
||||||
|
</property>
|
||||||
|
----
|
||||||
|
|
||||||
|
* Customize log settings. To get NFS debug trace, users can edit the log4j.property file
|
||||||
|
to add the following. Note, debug trace, especially for ONCRPC, can be very verbose.
|
||||||
|
|
||||||
|
To change logging level:
|
||||||
|
|
||||||
|
-----------------------------------------------
|
||||||
|
log4j.logger.org.apache.hadoop.hdfs.nfs=DEBUG
|
||||||
|
-----------------------------------------------
|
||||||
|
|
||||||
|
To get more details of ONCRPC requests:
|
||||||
|
|
||||||
|
-----------------------------------------------
|
||||||
|
log4j.logger.org.apache.hadoop.oncrpc=DEBUG
|
||||||
|
-----------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
* {Start and stop NFS gateway service}
|
||||||
|
|
||||||
|
Three daemons are required to provide NFS service: rpcbind (or portmap), mountd and nfsd.
|
||||||
|
The NFS gateway process has both nfsd and mountd. It shares the HDFS root "/" as the
|
||||||
|
only export. It is recommended to use the portmap included in NFS gateway package. Even
|
||||||
|
though NFS gateway works with portmap/rpcbind provide by most Linux distributions, the
|
||||||
|
package included portmap is needed on some Linux systems such as REHL6.2 due to an
|
||||||
|
{{{https://bugzilla.redhat.com/show_bug.cgi?id=731542}rpcbind bug}}. More detailed discussions can
|
||||||
|
be found in {{{https://issues.apache.org/jira/browse/HDFS-4763}HDFS-4763}}.
|
||||||
|
|
||||||
|
[[1]] Stop nfs/rpcbind/portmap services provided by the platform (commands can be different on various Unix platforms):
|
||||||
|
|
||||||
|
-------------------------
|
||||||
|
service nfs stop
|
||||||
|
|
||||||
|
service rpcbind stop
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
|
||||||
|
[[2]] Start package included portmap (needs root privileges):
|
||||||
|
|
||||||
|
-------------------------
|
||||||
|
hadoop portmap
|
||||||
|
|
||||||
|
OR
|
||||||
|
|
||||||
|
hadoop-daemon.sh start portmap
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
[[3]] Start mountd and nfsd.
|
||||||
|
|
||||||
|
No root privileges are required for this command. However, ensure that the user starting
|
||||||
|
the Hadoop cluster and the user starting the NFS gateway are same.
|
||||||
|
|
||||||
|
-------------------------
|
||||||
|
hadoop nfs3
|
||||||
|
|
||||||
|
OR
|
||||||
|
|
||||||
|
hadoop-daemon.sh start nfs3
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
Note, if the hadoop-daemon.sh script starts the NFS gateway, its log can be found in the hadoop log folder.
|
||||||
|
|
||||||
|
|
||||||
|
[[4]] Stop NFS gateway services.
|
||||||
|
|
||||||
|
-------------------------
|
||||||
|
hadoop-daemon.sh stop nfs3
|
||||||
|
|
||||||
|
hadoop-daemon.sh stop portmap
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
|
||||||
|
* {Verify validity of NFS related services}
|
||||||
|
|
||||||
|
[[1]] Execute the following command to verify if all the services are up and running:
|
||||||
|
|
||||||
|
-------------------------
|
||||||
|
rpcinfo -p $nfs_server_ip
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
You should see output similar to the following:
|
||||||
|
|
||||||
|
-------------------------
|
||||||
|
program vers proto port
|
||||||
|
|
||||||
|
100005 1 tcp 4242 mountd
|
||||||
|
|
||||||
|
100005 2 udp 4242 mountd
|
||||||
|
|
||||||
|
100005 2 tcp 4242 mountd
|
||||||
|
|
||||||
|
100000 2 tcp 111 portmapper
|
||||||
|
|
||||||
|
100000 2 udp 111 portmapper
|
||||||
|
|
||||||
|
100005 3 udp 4242 mountd
|
||||||
|
|
||||||
|
100005 1 udp 4242 mountd
|
||||||
|
|
||||||
|
100003 3 tcp 2049 nfs
|
||||||
|
|
||||||
|
100005 3 tcp 4242 mountd
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
[[2]] Verify if the HDFS namespace is exported and can be mounted.
|
||||||
|
|
||||||
|
-------------------------
|
||||||
|
showmount -e $nfs_server_ip
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
You should see output similar to the following:
|
||||||
|
|
||||||
|
-------------------------
|
||||||
|
Exports list on $nfs_server_ip :
|
||||||
|
|
||||||
|
/ (everyone)
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
|
||||||
|
* {Mount the export “/”}
|
||||||
|
|
||||||
|
Currently NFS v3 only uses TCP as the transportation protocol.
|
||||||
|
NLM is not supported so mount option "nolock" is needed. It's recommended to use
|
||||||
|
hard mount. This is because, even after the client sends all data to
|
||||||
|
NFS gateway, it may take NFS gateway some extra time to transfer data to HDFS
|
||||||
|
when writes were reorderd by NFS client Kernel.
|
||||||
|
|
||||||
|
If soft mount has to be used, the user should give it a relatively
|
||||||
|
long timeout (at least no less than the default timeout on the host) .
|
||||||
|
|
||||||
|
The users can mount the HDFS namespace as shown below:
|
||||||
|
|
||||||
|
-------------------------------------------------------------------
|
||||||
|
mount -t nfs -o vers=3,proto=tcp,nolock $server:/ $mount_point
|
||||||
|
-------------------------------------------------------------------
|
||||||
|
|
||||||
|
Then the users can access HDFS as part of the local file system except that,
|
||||||
|
hard link and random write are not supported yet.
|
||||||
|
|
||||||
|
* {User authentication and mapping}
|
||||||
|
|
||||||
|
NFS gateway in this release uses AUTH_UNIX style authentication. When the user on NFS client
|
||||||
|
accesses the mount point, NFS client passes the UID to NFS gateway.
|
||||||
|
NFS gateway does a lookup to find user name from the UID, and then passes the
|
||||||
|
username to the HDFS along with the HDFS requests.
|
||||||
|
For example, if the NFS client has current user as "admin", when the user accesses
|
||||||
|
the mounted directory, NFS gateway will access HDFS as user "admin". To access HDFS
|
||||||
|
as the user "hdfs", one needs to switch the current user to "hdfs" on the client system
|
||||||
|
when accessing the mounted directory.
|
||||||
|
|
||||||
|
The system administrator must ensure that the user on NFS client host has the same
|
||||||
|
name and UID as that on the NFS gateway host. This is usually not a problem if
|
||||||
|
the same user management system (e.g., LDAP/NIS) is used to create and deploy users on
|
||||||
|
HDFS nodes and NFS client node. In case the user account is created manually in different hosts, one might need to
|
||||||
|
modify UID (e.g., do "usermod -u 123 myusername") on either NFS client or NFS gateway host
|
||||||
|
in order to make it the same on both sides. More technical details of RPC AUTH_UNIX can be found
|
||||||
|
in {{{http://tools.ietf.org/html/rfc1057}RPC specification}}.
|
||||||
|
|
|
@ -80,6 +80,7 @@
|
||||||
<item name="HttpFS Gateway" href="hadoop-hdfs-httpfs/index.html"/>
|
<item name="HttpFS Gateway" href="hadoop-hdfs-httpfs/index.html"/>
|
||||||
<item name="Short Circuit Local Reads"
|
<item name="Short Circuit Local Reads"
|
||||||
href="hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html"/>
|
href="hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html"/>
|
||||||
|
<item name="HDFS NFS Gateway" href="hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html"/>
|
||||||
</menu>
|
</menu>
|
||||||
|
|
||||||
<menu name="MapReduce" inherit="top">
|
<menu name="MapReduce" inherit="top">
|
||||||
|
|
Loading…
Reference in New Issue