HBASE-1632 Write documentation for configuring/managing ZooKeeper with HBase

git-svn-id: https://svn.apache.org/repos/asf/hadoop/hbase/trunk@794331 13f79535-47bb-0310-9956-ffa450edef68
Nitay Joffe 2009-07-15 17:40:39 +00:00
parent 4dee3ff2e4
commit 0175729a01
1 changed files with 85 additions and 37 deletions


@ -174,51 +174,99 @@ The <code>regionserver</code> file lists all hosts running HRegionServers, one h
</p>
<p>
A distributed HBase depends on a running ZooKeeper cluster.
The ZooKeeper configuration file for HBase is stored at <code>${HBASE_HOME}/conf/zoo.cfg</code>.
See the ZooKeeper <a href="http://hadoop.apache.org/zookeeper/docs/current/zookeeperStarted.html"> Getting Started Guide</a>
for information about the format and options of that file. Specifically, look at the
<a href="http://hadoop.apache.org/zookeeper/docs/current/zookeeperStarted.html#sc_RunningReplicatedZooKeeper">Running Replicated ZooKeeper</a> section.
HBase can manage a ZooKeeper cluster for you, or you can manage it on your own
and point HBase to it.
To toggle this option, use the <code>HBASE_MANAGES_ZK</code> variable in <code>
${HBASE_HOME}/conf/hbase-env.sh</code>.
This variable, which defaults to <code>true</code>, tells HBase whether to
start/stop the ZooKeeper quorum servers alongside the rest of the servers.
</p>
<p>
Though not recommended, it can be convenient to have HBase continue to manage
ZooKeeper even in distributed mode (for example, when testing or taking
HBase for a test drive). Edit <code>${HBASE_HOME}/conf/zoo.cfg</code> and
set the <code>server.0</code> property to the hostname or IP address of the node that will run ZooKeeper
(leaving the default value of "localhost" will make it impossible to start HBase).
<pre>
...
server.0=example.org:2888:3888
<blockquote>
</pre>
Then on the example.org server do the following <i>before</i> running HBase.
<pre>
${HBASE_HOME}/bin/hbase-daemon.sh start zookeeper
</pre>
</blockquote>
<p>To stop ZooKeeper, after you've shut down hbase, do:
<blockquote>
<pre>
${HBASE_HOME}/bin/hbase-daemon.sh stop zookeeper
</pre>
</blockquote>
Be aware that this option is only recommended for testing purposes, as a failure
on that node would render HBase <b>unusable</b>.
</p>
<p>
To tell HBase to stop managing a ZooKeeper instance, first configure
<code>zoo.cfg</code> to point at the ZooKeeper quorum you'd like HBase to
use, then set the following in <code>${HBASE_HOME}/conf/hbase-env.sh</code>:
<blockquote>
To point HBase at an existing ZooKeeper cluster, add your <code>zoo.cfg</code>
to the <code>CLASSPATH</code>.
HBase will see this file and use it to figure out where ZooKeeper is.
Additionally, set <code>HBASE_MANAGES_ZK</code> in <code>${HBASE_HOME}/conf/hbase-env.sh</code>
to <code>false</code> so that HBase doesn't start or stop your ZooKeeper servers:
<pre>
...
# Tell HBase whether it should manage its own instance of ZooKeeper or not.
export HBASE_MANAGES_ZK=false
</pre>
</blockquote>
For more information about setting up a ZooKeeper cluster on your own, see
the ZooKeeper <a href="http://hadoop.apache.org/zookeeper/docs/current/zookeeperStarted.html">Getting Started Guide</a>.
HBase currently uses ZooKeeper version 3.2.0, so any cluster set up with a 3.x.x
version of ZooKeeper should work.
</p>
<p>
To have HBase manage the ZooKeeper cluster, you can use a <code>zoo.cfg</code>
file as above, or set the options directly in <code>${HBASE_HOME}/conf/hbase-site.xml</code>.
Every option from the <code>zoo.cfg</code> has a corresponding property in the
XML configuration file named <code>hbase.zookeeper.property.OPTION</code>.
For example, the <code>clientPort</code> setting in ZooKeeper can be changed by
setting the <code>hbase.zookeeper.property.clientPort</code> property.
For the full list of available properties, see ZooKeeper's <code>zoo.cfg</code>.
For the default values used by HBase, see <code>${HBASE_HOME}/conf/hbase-default.xml</code>.
</p>
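<p>
As a rough illustration of the naming rule above, the following shell sketch
takes <code>key=value</code> options in <code>zoo.cfg</code> style and prints
each one as an XML property whose name carries the
<code>hbase.zookeeper.property.</code> prefix.
The function name <code>zoo_cfg_to_hbase_xml</code> and the sample values are
made up for this example, and it is not a tool that ships with HBase.
It skips <code>server.N</code> lines on the assumption that the server list is
supplied separately, and it does not escape XML special characters.
</p>

```shell
# zoo_cfg_to_hbase_xml: hypothetical helper, not part of HBase.
# Reads zoo.cfg-style lines on stdin and prints one <property> element
# per key=value option, using the hbase.zookeeper.property.OPTION naming.
zoo_cfg_to_hbase_xml() {
  local line key value
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      ''|'#'*) continue ;;   # skip blank lines and comments
      server.*) continue ;;  # assumption: the server list is configured separately
      *=*) ;;                # a key=value option, handled below
      *) continue ;;         # ignore anything else
    esac
    key=${line%%=*}          # text before the first '='
    value=${line#*=}         # text after the first '='
    printf '<property>\n  <name>hbase.zookeeper.property.%s</name>\n  <value>%s</value>\n</property>\n' "$key" "$value"
  done
}

# Example input; the values are illustrative only.
zoo_cfg_to_hbase_xml <<'EOF'
# sample options
clientPort=2222
tickTime=2000
EOF
```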
<p>
At minimum, you should set the list of servers that you want ZooKeeper to run
on using the <code>hbase.zookeeper.quorum</code> property.
This property defaults to <code>localhost</code> which is not suitable for a
fully distributed HBase.
It is recommended to run a ZooKeeper quorum of 5 or 7 machines, and give each
server around 1GB of memory to ensure that it doesn't swap.
It is also recommended to run the ZooKeeper servers on machines separate from
the Region Servers, with their own disks.
If this is not easily doable for you, choose 5 of your region servers to run the
ZooKeeper servers on.
</p>
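<p>
The sketch below shows one way to assemble the comma-separated value for
<code>hbase.zookeeper.quorum</code> from a list of hosts, and to see what a
given quorum size buys you.
It relies on general ZooKeeper background that is not stated in this document:
a quorum stays available only while a strict majority of its servers is up.
The helper names <code>join_quorum</code> and <code>tolerated_failures</code>
are hypothetical.
</p>

```shell
# join_quorum: hypothetical helper that joins host names with commas,
# the format expected by the hbase.zookeeper.quorum property.
join_quorum() {
  local IFS=,
  printf '%s\n' "$*"
}

# tolerated_failures: number of servers that can be down while a strict
# majority of an N-server quorum is still running (integer division).
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

quorum=$(join_quorum rs1.example.com rs2.example.com rs3.example.com \
                     rs4.example.com rs5.example.com)
echo "hbase.zookeeper.quorum=$quorum"
echo "servers that may fail: $(tolerated_failures 5)"
```

<p>
By this arithmetic a sixth server tolerates no more failures than five do,
which is why odd sizes are the usual choice.
</p>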
<p>
As an example, to have HBase manage a ZooKeeper quorum on nodes
rs{1,2,3,4,5}.example.com, bound to port 2222 (the default is 2181), use:
<pre>
${HBASE_HOME}/conf/hbase-env.sh:
...
# Tell HBase whether it should manage its own instance of ZooKeeper or not.
export HBASE_MANAGES_ZK=true
${HBASE_HOME}/conf/hbase-site.xml:
&lt;configuration&gt;
...
&lt;property&gt;
&lt;name&gt;hbase.zookeeper.property.clientPort&lt;/name&gt;
&lt;value&gt;2222&lt;/value&gt;
&lt;description&gt;Property from ZooKeeper's config zoo.cfg.
The port at which the clients will connect.
&lt;/description&gt;
&lt;/property&gt;
...
&lt;property&gt;
&lt;name&gt;hbase.zookeeper.quorum&lt;/name&gt;
&lt;value&gt;rs1.example.com,rs2.example.com,rs3.example.com,rs4.example.com,rs5.example.com&lt;/value&gt;
&lt;description&gt;Comma separated list of servers in the ZooKeeper Quorum.
For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
By default this is set to localhost for local and pseudo-distributed modes
of operation. For a fully-distributed setup, this should be set to a full
list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
this is the list of servers which we will start/stop ZooKeeper on.
&lt;/description&gt;
&lt;/property&gt;
...
&lt;/configuration&gt;
</pre>
</p>
<p>
When HBase manages ZooKeeper, it will start/stop the ZooKeeper servers as a part
of the regular start/stop scripts. If you would like to start or stop them
yourself, run the following with either <code>start</code> or <code>stop</code>:
<pre>
${HBASE_HOME}/bin/hbase-daemons.sh {start,stop} zookeeper
</pre>
Note that you can use HBase in this manner to spin up a ZooKeeper cluster
unrelated to HBase. Just make sure to set <code>HBASE_MANAGES_ZK</code> to
<code>false</code> if you want ZooKeeper to stay up, so that HBase doesn't
take ZooKeeper down with it when it shuts down.
</p>
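<p>
The braces in the command above mean "pick one action"; typed literally into a
shell that performs brace expansion, both words would be passed at once.
A small wrapper can make the choice explicit.
The following is only a sketch: <code>zk_ctl</code>, the <code>DRY_RUN</code>
switch and the <code>/opt/hbase</code> fallback path are assumptions made for
this example, not part of HBase.
</p>

```shell
# zk_ctl: hypothetical wrapper around hbase-daemons.sh that accepts exactly
# one action. With DRY_RUN set, it prints the command instead of running it.
zk_ctl() {
  local action=$1
  case "$action" in
    start|stop) ;;
    *) echo "usage: zk_ctl start|stop" >&2; return 1 ;;
  esac
  local cmd="${HBASE_HOME:?HBASE_HOME must be set}/bin/hbase-daemons.sh"
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$cmd $action zookeeper"
  else
    "$cmd" "$action" zookeeper
  fi
}

# Example: show what would run without touching any servers.
# /opt/hbase is a placeholder used only when HBASE_HOME is not already set.
HBASE_HOME=${HBASE_HOME:-/opt/hbase} DRY_RUN=1 zk_ctl start
```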
<p>Of note, if you have made <i>HDFS client configuration</i> on your hadoop cluster, HBase will not