From 0175729a01a720d6976416cea3164755cb17cd6d Mon Sep 17 00:00:00 2001
From: Nitay Joffe
Date: Wed, 15 Jul 2009 17:40:39 +0000
Subject: [PATCH] HBASE-1632 Write documentation for configuring/managing ZooKeeper with HBase

git-svn-id: https://svn.apache.org/repos/asf/hadoop/hbase/trunk@794331 13f79535-47bb-0310-9956-ffa450edef68
---
 src/java/overview.html | 122 ++++++++++++++++++++++++++++-------------
 1 file changed, 85 insertions(+), 37 deletions(-)

diff --git a/src/java/overview.html b/src/java/overview.html
index 7b0451ac342..4820e3279f8 100644
--- a/src/java/overview.html
+++ b/src/java/overview.html
@@ -174,51 +174,99 @@ The regionserver file lists all hosts running HRegionServers, one h

A distributed HBase depends on a running ZooKeeper cluster.
-The ZooKeeper configuration file for HBase is stored at ${HBASE_HOME}/conf/zoo.cfg.
-See the ZooKeeper Getting Started Guide
-for information about the format and options of that file. Specifically, look at the
-Running Replicated ZooKeeper section.
+HBase can manage a ZooKeeper cluster for you, or you can manage it on your own
+and point HBase to it.
+To toggle this option, use the HBASE_MANAGES_ZK variable in
+${HBASE_HOME}/conf/hbase-env.sh.
+This variable, which defaults to true, tells HBase whether to
+start/stop the ZooKeeper quorum servers alongside the rest of the servers.

- -

-Though not recommended, it can be convenient having HBase continue to manage
-ZooKeeper even when in distributed mode (It can be good when testing or taking
-hbase for a testdrive). Change ${HBASE_HOME}/conf/zoo.cfg and
-set the server.0 property to the IP of the node that will be running ZooKeeper
-(Leaving the default value of "localhost" will make it impossible to start HBase).
+To point HBase at an existing ZooKeeper cluster, add your zoo.cfg
+to the CLASSPATH.
+HBase will see this file and use it to figure out where ZooKeeper is.
+Additionally set HBASE_MANAGES_ZK in ${HBASE_HOME}/conf/hbase-env.sh
+ to false so that HBase doesn't mess with your ZooKeeper setup:

-  ...
-server.0=example.org:2888:3888
-
+  ...
+  # Tell HBase whether it should manage its own instance of ZooKeeper or not.
+  export HBASE_MANAGES_ZK=false
-Then on the example.org server do the following before running HBase.
-
-${HBASE_HOME}/bin/hbase-daemon.sh start zookeeper
-
- -

To stop ZooKeeper, after you've shut down hbase, do: -

-
-${HBASE_HOME}/bin/hbase-daemon.sh stop zookeeper
-
-
-Be aware that this option is only recommanded for testing purposes as a failure
-on that node would render HBase unusable.
+For more information about setting up a ZooKeeper cluster on your own, see
+the ZooKeeper Getting Started Guide.
+HBase currently uses ZooKeeper version 3.2.0, so any cluster set up with a 3.x.x
+version of ZooKeeper should work.

-

-To tell HBase to stop managing a ZooKeeper instance, after configuring
-zoo.cfg to point at the ZooKeeper Quorum you'd like HBase to
-use, in ${HBASE_HOME}/conf/hbase-env.sh,
-set the following to tell HBase to STOP managing its instance of ZooKeeper.
-

+To have HBase manage the ZooKeeper cluster, you can use a zoo.cfg
+ file as above, or edit the options directly in the ${HBASE_HOME}/conf/hbase-site.xml.
+Every option from the zoo.cfg has a corresponding property in the
+XML configuration file named hbase.zookeeper.property.OPTION.
+For example, the clientPort setting in ZooKeeper can be changed by
+setting the hbase.zookeeper.property.clientPort property.
+For the full list of available properties, see ZooKeeper's zoo.cfg.
+For the default values used by HBase, see ${HBASE_HOME}/conf/hbase-default.xml.
+

+

+At minimum, you should set the list of servers that you want ZooKeeper to run
+on using the hbase.zookeeper.quorum property.
+This property defaults to localhost which is not suitable for a
+fully distributed HBase.
+It is recommended to run a ZooKeeper quorum of 5 or 7 machines, and give each
+server around 1GB of memory to ensure that they don't swap.
+It is also recommended to run the ZooKeeper servers on separate machines from
+the Region Servers with their own disks.
+If this is not easily doable for you, choose 5 of your region servers to run the
+ZooKeeper servers on.
+

+

+As an example, to have HBase manage a ZooKeeper quorum on nodes
+rs{1,2,3,4,5}.example.com, bound to port 2222 (the default is 2181), use:

-  ...
-# Tell HBase whether it should manage it's own instance of Zookeeper or not.
-export HBASE_MANAGES_ZK=false
+  ${HBASE_HOME}/conf/hbase-env.sh:
+
+       ...
+      # Tell HBase whether it should manage its own instance of ZooKeeper or not.
+      export HBASE_MANAGES_ZK=true
+
+  ${HBASE_HOME}/conf/hbase-site.xml:
+
+  <configuration>
+    ...
+    <property>
+      <name>hbase.zookeeper.property.clientPort</name>
+      <value>2222</value>
+      <description>Property from ZooKeeper's config zoo.cfg.
+      The port at which the clients will connect.
+      </description>
+    </property>
+    ...
+    <property>
+      <name>hbase.zookeeper.quorum</name>
+      <value>rs1.example.com,rs2.example.com,rs3.example.com,rs4.example.com,rs5.example.com</value>
+      <description>Comma separated list of servers in the ZooKeeper Quorum.
+      For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
+      By default this is set to localhost for local and pseudo-distributed modes
+      of operation. For a fully-distributed setup, this should be set to a full
+      list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
+      this is the list of servers which we will start/stop ZooKeeper on.
+      </description>
+    </property>
+    ...
+  </configuration>
 
-
+

+

+When HBase manages ZooKeeper, it will start/stop the ZooKeeper servers as a part
+of the regular start/stop scripts. If you would like to run it yourself, you can
+do:
+

+  ${HBASE_HOME}/bin/hbase-daemons.sh {start,stop} zookeeper
+
+Note that you can use HBase in this manner to spin up a ZooKeeper cluster,
+unrelated to HBase. Just make sure to set HBASE_MANAGES_ZK to
+false if you want it to stay up so that when HBase shuts down it
+doesn't take ZooKeeper with it.

Of note, if you have made HDFS client configuration on your hadoop cluster, HBase will not