HDFS-4453. Make a simple doc to describe the usage and design of the shortcircuit read feature. Contributed by Colin Patrick McCabe.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/HDFS-347@1444963 13f79535-47bb-0310-9956-ffa450edef68
parent 1132f51a9c
commit aa92072b54
@@ -43,3 +43,5 @@ HDFS-4440. Avoid annoying log message when dfs.domain.socket.path is not set. (C

HDFS-4473. Don't create domain socket unless we need it. (Colin Patrick McCabe via atm)

HDFS-4485. DN should chmod socket path a+w. (Colin Patrick McCabe via atm)

HDFS-4453. Make a simple doc to describe the usage and design of the shortcircuit read feature. (Colin Patrick McCabe via atm)

@@ -0,0 +1,68 @@
~~ Licensed under the Apache License, Version 2.0 (the "License");
~~ you may not use this file except in compliance with the License.
~~ You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License. See accompanying LICENSE file.

---
Hadoop Distributed File System-${project.version} - Short-Circuit Local Reads
---
---
${maven.build.timestamp}

HDFS Short-Circuit Local Reads

\[ {{{./index.html}Go Back}} \]

%{toc|section=1|fromDepth=0}

* {Background}

In <<<HDFS>>>, reads normally go through the <<<DataNode>>>. When the
client asks the <<<DataNode>>> to read a file, the <<<DataNode>>> reads that
file from disk and sends the data to the client over a TCP socket.
So-called "short-circuit" reads bypass the <<<DataNode>>>, allowing the client
to read the file directly. This is only possible when the client is
co-located with the data, that is, running on the same host as the
<<<DataNode>>> that stores it. Short-circuit reads provide a substantial
performance boost to many applications.

* {Configuration}

To configure short-circuit local reads, you will need to enable
<<<libhadoop.so>>>. See
{{{../hadoop-common/NativeLibraries.html}Native
Libraries}} for details on enabling this library.

Short-circuit reads make use of a UNIX domain socket. This is a special path
in the filesystem that allows the client and the <<<DataNode>>> to communicate.
You will need to set a path to this socket with the
<<<dfs.domain.socket.path>>> property. The <<<DataNode>>> needs to be able to
create this path. On the other hand, it should not be possible for any user
except the hdfs user or root to create this path. For this reason, paths
under <<</var/run>>> or <<</var/lib>>> are often used.

Short-circuit local reads need to be configured on both the <<<DataNode>>>
and the client.

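The restriction on who may create the socket path can be sanity-checked before the <<<DataNode>>> is started. The following Python sketch is illustrative only and is not part of HDFS: the function name <<<socket_dir_problems>>> is hypothetical, and the rule it applies (the parent directory of the socket must not be group-writable or world-writable) is an assumed approximation of the requirement described above, not the check that HDFS itself performs.

```python
import os
import stat


def socket_dir_problems(socket_path):
    """Return a list of reasons the socket's parent directory looks unsafe.

    Hypothetical helper: it only inspects the mode bits of the immediate
    parent directory and does not check ownership or higher path components.
    """
    parent = os.path.dirname(os.path.abspath(socket_path))
    if not os.path.isdir(parent):
        return [f"{parent} does not exist or is not a directory"]

    mode = stat.S_IMODE(os.stat(parent).st_mode)
    problems = []
    if mode & stat.S_IWOTH:
        problems.append(f"{parent} is world-writable (mode {mode:04o})")
    if mode & stat.S_IWGRP:
        problems.append(f"{parent} is group-writable (mode {mode:04o})")
    return problems


if __name__ == "__main__":
    # The path below is the one used in the example configuration.
    for problem in socket_dir_problems("/var/lib/hadoop-hdfs/dn_socket"):
        print("WARNING:", problem)
```

An empty result only means that this simplified rule found nothing to report; it does not guarantee that the <<<DataNode>>> will accept the path.
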
* {Example Configuration}

The following example configuration enables short-circuit local reads and
sets the path of the domain socket.

----
<configuration>
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.domain.socket.path</name>
    <value>/var/lib/hadoop-hdfs/dn_socket</value>
  </property>
</configuration>
----

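As a quick check that a configuration file contains both settings from the example, a sketch along the following lines may help. It uses only the Python standard library; the helper names <<<read_properties>>> and <<<short_circuit_problems>>> are hypothetical, and the script only inspects the XML text. It does not contact a <<<DataNode>>> or verify that short-circuit reads actually take place.

```python
import xml.etree.ElementTree as ET

# The example configuration from this document, used as sample input.
EXAMPLE = """
<configuration>
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.domain.socket.path</name>
    <value>/var/lib/hadoop-hdfs/dn_socket</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Parse Hadoop-style <property> elements into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def short_circuit_problems(props):
    """Return a list of missing or disabled short-circuit settings."""
    problems = []
    if props.get("dfs.client.read.shortcircuit", "").lower() != "true":
        problems.append("dfs.client.read.shortcircuit is not set to true")
    if not props.get("dfs.domain.socket.path"):
        problems.append("dfs.domain.socket.path is not set")
    return problems


if __name__ == "__main__":
    for problem in short_circuit_problems(read_properties(EXAMPLE)):
        print("WARNING:", problem)
```

Because the feature must be configured on both the <<<DataNode>>> and the client, the same check would need to be run against the configuration seen by each of them.
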
@@ -61,6 +61,8 @@
<item name="Federation" href="hadoop-project-dist/hadoop-hdfs/Federation.html"/>
<item name="WebHDFS REST API" href="hadoop-project-dist/hadoop-hdfs/WebHDFS.html"/>
<item name="HttpFS Gateway" href="hadoop-hdfs-httpfs/index.html"/>
<item name="Short Circuit Local Reads"
      href="hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html"/>
</menu>

<menu name="MapReduce" inherit="top">