HADOOP-10618. Remove SingleNodeSetup.apt.vm (Contributed by Akira Ajisaka)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1596964 13f79535-47bb-0310-9956-ffa450edef68
parent 3671a5e16f
commit 3c4c16a4f7
@@ -413,6 +413,9 @@ Release 2.5.0 - UNRELEASED

    HADOOP-10614. CBZip2InputStream is not threadsafe (Xiangrui Meng via
    Sandy Ryza)

    HADOOP-10618. Remove SingleNodeSetup.apt.vm. (Akira Ajisaka via
    Arpit Agarwal)

  OPTIMIZATIONS

  BUG FIXES

@@ -18,210 +18,7 @@

Single Node Setup

%{toc|section=1|fromDepth=0}

This page will be removed in the next major release.

* Purpose

This document describes how to set up and configure a single-node
Hadoop installation so that you can quickly perform simple operations
using Hadoop MapReduce and the Hadoop Distributed File System (HDFS).

* Prerequisites

** Supported Platforms

  * GNU/Linux is supported as a development and production platform.
    Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.

  * Windows is also a supported platform.

** Required Software

Required software for Linux and Windows includes:

  [[1]] Java^TM 1.6.x, preferably from Sun, must be installed.

  [[2]] ssh must be installed and sshd must be running to use the Hadoop
        scripts that manage remote Hadoop daemons.

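A quick sanity check for both prerequisites (the exact output varies by system, and <<<ps -e | grep sshd>>> is just one of several ways to confirm that sshd is running):

----
$ java -version
$ ssh -V
$ ps -e | grep sshd
----
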
** Installing Software

If your cluster doesn't have the requisite software, you will need to
install it.

For example on Ubuntu Linux:

----
$ sudo apt-get install ssh
$ sudo apt-get install rsync
----

* Download

To get a Hadoop distribution, download a recent stable release from one
of the Apache Download Mirrors.

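For example, assuming the 2.4.0 release and the Apache archive as the mirror (both the version and the URL are placeholders; pick a current release and a nearby mirror):

----
$ wget https://archive.apache.org/dist/hadoop/common/hadoop-2.4.0/hadoop-2.4.0.tar.gz
$ tar -xzf hadoop-2.4.0.tar.gz
----
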
* Prepare to Start the Hadoop Cluster

Unpack the downloaded Hadoop distribution. In the distribution, edit
the file <<<conf/hadoop-env.sh>>> to define at least <<<JAVA_HOME>>> to be the root
of your Java installation.

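For example (the path below is only illustrative; point <<<JAVA_HOME>>> at the root of your own JDK):

----
# In conf/hadoop-env.sh; the JDK location shown is an assumption, adjust to your system.
export JAVA_HOME=/usr/java/latest
----
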
Try the following command:

----
$ bin/hadoop
----

This will display the usage documentation for the hadoop script.

Now you are ready to start your Hadoop cluster in one of the three
supported modes:

  * Local (Standalone) Mode

  * Pseudo-Distributed Mode

  * Fully-Distributed Mode

* Standalone Operation

By default, Hadoop is configured to run in a non-distributed mode, as a
single Java process. This is useful for debugging.

The following example copies the unpacked conf directory to use as
input and then finds and displays every match of the given regular
expression. Output is written to the given output directory.

----
$ mkdir input
$ cp conf/*.xml input
$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
$ cat output/*
----

* Pseudo-Distributed Operation

Hadoop can also be run on a single node in a pseudo-distributed mode
where each Hadoop daemon runs in a separate Java process.

** Configuration

Use the following:

conf/core-site.xml:

----
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
----

conf/hdfs-site.xml:

----
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
----

conf/mapred-site.xml:

----
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
----

** Set up passphraseless ssh

Now check that you can ssh to the localhost without a passphrase:

----
$ ssh localhost
----

If you cannot ssh to localhost without a passphrase, execute the
following commands:

----
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
----

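If sshd still rejects the key, overly permissive modes on <<<~/.ssh>>> are a common cause; tightening them is an extra step beyond the original instructions:

----
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys
----
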
** Execution

Format a new distributed-filesystem:

----
$ bin/hadoop namenode -format
----

Start the hadoop daemons:

----
$ bin/start-all.sh
----

The hadoop daemon log output is written to the <<<${HADOOP_LOG_DIR}>>>
directory (defaults to <<<${HADOOP_PREFIX}/logs>>>).

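For example, to follow the NameNode log during startup (the file name embeds your user and host names, hence the wildcards):

----
$ tail -f ${HADOOP_PREFIX}/logs/hadoop-*-namenode-*.log
----
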
Browse the web interface for the NameNode and the JobTracker; by
default they are available at:

  * NameNode - <<<http://localhost:50070/>>>

  * JobTracker - <<<http://localhost:50030/>>>

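A quick command-line check that the web interfaces are answering (assuming the daemons are up):

----
$ curl -s http://localhost:50070/ | head
----
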
Copy the input files into the distributed filesystem:

----
$ bin/hadoop fs -put conf input
----

Run some of the examples provided:

----
$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
----

Examine the output files:

Copy the output files from the distributed filesystem to the local
filesystem and examine them:

----
$ bin/hadoop fs -get output output
$ cat output/*
----

or

View the output files on the distributed filesystem:

----
$ bin/hadoop fs -cat output/*
----

When you're done, stop the daemons with:

----
$ bin/stop-all.sh
----

* Fully-Distributed Operation

For information on setting up fully-distributed, non-trivial clusters
see {{{./ClusterSetup.html}Cluster Setup}}.

Java and JNI are trademarks or registered trademarks of Sun
Microsystems, Inc. in the United States and other countries.

See {{{./SingleCluster.html}Single Cluster Setup}} to set up and configure a
single-node Hadoop installation.