HADOOP-1920 Wrapper scripts broken when hadoop in one location and hbase in another

git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@577355 13f79535-47bb-0310-9956-ffa450edef68

commit 3d7bec584c (parent 1657302ec1)
CHANGES.txt
@@ -48,8 +48,10 @@ Trunk (unreleased changes)
    HADOOP-1870 Once file system failure has been detected, don't check it again
        and get on with shutting down the hbase cluster.
    HADOOP-1888 NullPointerException in HMemcacheScanner (reprise)
-   HADOOP-1903 Possible data loss if Exception happens between snapshot and flush
-       to disk.
+   HADOOP-1903 Possible data loss if Exception happens between snapshot and
+       flush to disk.
+   HADOOP-1920 Wrapper scripts broken when hadoop in one location and hbase in
+       another

 IMPROVEMENTS
    HADOOP-1737 Make HColumnDescriptor data members publicly settable

README.txt
@@ -1,285 +1 @@

HBASE
Michael Cafarella

This document gives a quick overview of HBase, the Hadoop simple
database. It is extremely similar to Google's BigTable, with just a
few differences. If you understand BigTable, great. If not, you should
still be able to understand this document.

---------------------------------------------------------------
I.

HBase uses a data model very similar to that of BigTable. Users store
data rows in labelled tables. A data row has a sortable key and an
arbitrary number of columns. The table is stored sparsely, so that
rows in the same table can have crazily-varying columns, if the user
likes.

A column name has the form "<group>:<label>", where <group> and <label>
can be any string you like. A single table enforces its set of
<group>s (called "column groups"). You can only adjust this set of
groups by performing administrative operations on the table. However,
you can use new <label> strings at any write without preannouncing
them. HBase stores column groups physically close on disk, so the items
in a given column group should have roughly the same write/read
behavior.
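
For illustration only (the class and its fields are invented, not HBase
code), the column-name convention can be split at the first colon, which
leaves labels free to contain ':' themselves:

  public class ColumnKey {
    public final String group;   // must be one of the table's column groups
    public final String label;   // may be introduced on any write

    public ColumnKey(String qualified) {
      int colon = qualified.indexOf(':');
      if (colon < 0) {
        throw new IllegalArgumentException(
            "expected <group>:<label>, got: " + qualified);
      }
      this.group = qualified.substring(0, colon);
      this.label = qualified.substring(colon + 1);
    }
  }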

Writes are row-locked only. You cannot lock multiple rows at once. All
row-writes are atomic by default.

All updates to the database have an associated timestamp. HBase
will store a configurable number of versions of a given cell. Clients
can get data by asking for the "most recent value as of a certain
time". Or, clients can fetch all available versions at once.

---------------------------------------------------------------
II.

To the user, a table seems like a list of data tuples, sorted by row
key. Physically, tables are broken into HRegions. An HRegion is
identified by its tablename plus a start/end-key pair. A given HRegion
with keys <start> and <end> will store all the rows from (<start>,
<end>]. A set of HRegions, sorted appropriately, forms an entire
table.
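
Because each range is open at <start> and closed at <end>, the region
that owns a row is simply the one with the smallest end-key greater than
or equal to that row. A toy index, not HBase code (a real table's last
region has a null end-key meaning "infinity"; this sketch would need a
sentinel larger than any row key to stand in for it):

  import java.util.Map;
  import java.util.TreeMap;

  public class RegionIndex {
    private final TreeMap<String, String> regionsByEndKey =
        new TreeMap<String, String>();  // end-key -> region name

    public void addRegion(String endKey, String regionName) {
      regionsByEndKey.put(endKey, regionName);
    }

    // Owner of the (start, end] range: the smallest end-key >= row.
    public String regionFor(String row) {
      Map.Entry<String, String> e = regionsByEndKey.ceilingEntry(row);
      return e == null ? null : e.getValue();
    }
  }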

All data is physically stored using Hadoop's DFS. Data is served to
clients by a set of HRegionServers, usually one per machine. A given
HRegion is served by only one HRegionServer at a time.

When a client wants to make updates, it contacts the relevant
HRegionServer and commits the update to an HRegion. Upon commit, the
data is added to the HRegion's HMemcache and to the HRegionServer's
HLog. The HMemcache is a memory buffer that stores and serves the
most-recent updates. The HLog is an on-disk log file that tracks all
updates. The commit() call will not return to the client until the
update has been written to the HLog.
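
The ordering is the essential part: the edit reaches the log before it
becomes readable. A minimal sketch of that commit path, with invented
names (a real HLog would also sync the file descriptor, batch edits, and
track sequence numbers):

  import java.io.DataOutputStream;
  import java.io.FileOutputStream;
  import java.io.IOException;
  import java.util.concurrent.ConcurrentSkipListMap;

  public class WriteAheadSketch {
    static final String FLUSH_TOKEN = "__FLUSHED__";  // toy sentinel, used below

    final DataOutputStream hlog;  // stands in for the HLog
    final ConcurrentSkipListMap<String, String> memcache =
        new ConcurrentSkipListMap<String, String>();  // stands in for the HMemcache

    public WriteAheadSketch(String logPath) throws IOException {
      this.hlog = new DataOutputStream(new FileOutputStream(logPath, true));
    }

    public synchronized void commit(String row, String value) throws IOException {
      hlog.writeUTF(row);         // 1. append the update to the log...
      hlog.writeUTF(value);
      hlog.flush();               // 2. ...and push it out before returning
      memcache.put(row, value);   // 3. only then expose it to readers
    }
  }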

When serving data, the HRegion will first check its HMemcache. If the
data is not available there, it will then check its on-disk HStores.
There is an HStore for each column family in an HRegion. An HStore might
consist of multiple on-disk HStoreFiles. Each HStoreFile is a B-Tree-like
structure that allows for relatively fast access.

Periodically, we invoke HRegion.flushcache() to write the contents of
the HMemcache to an on-disk HStore's files. This adds a new HStoreFile
to each HStore. The HMemcache is then emptied, and we write a special
token to the HLog, indicating that the HMemcache has been flushed.
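
Continuing the WriteAheadSketch toy from above, a flushcache() could
follow those same steps; the store-file naming and the flush token are
assumptions of the sketch, not HBase's on-disk format:

  // Added to the WriteAheadSketch class (needs import java.util.Map).
  public synchronized void flushcache() throws IOException {
    String storeFile = "hstorefile." + System.currentTimeMillis();
    DataOutputStream out = new DataOutputStream(new FileOutputStream(storeFile));
    try {
      for (Map.Entry<String, String> e : memcache.entrySet()) {
        out.writeUTF(e.getKey());    // memcache iterates in sorted key order
        out.writeUTF(e.getValue());
      }
    } finally {
      out.close();
    }
    memcache.clear();                // recent edits now live in the new file
    hlog.writeUTF(FLUSH_TOKEN);      // everything before this token is on disk
    hlog.writeUTF(storeFile);        // written as a pair to keep log framing
    hlog.flush();
  }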

On startup, each HRegion checks to see if there have been any writes
to the HLog since the most-recent invocation of flushcache(). If not,
then all relevant HRegion data is reflected in the on-disk HStores. If
yes, the HRegion reconstructs the updates from the HLog, writes them
to the HMemcache, and then calls flushcache(). Finally, it deletes the
HLog and is now available for serving data.
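
The matching startup path for the same toy class: replay whatever follows
the last flush token into the memcache, flush it, and discard the log.

  // Also added to the WriteAheadSketch class.
  public synchronized void reconstruct(java.io.File logFile) throws IOException {
    java.io.DataInputStream in =
        new java.io.DataInputStream(new java.io.FileInputStream(logFile));
    try {
      while (true) {
        String key, value;
        try {
          key = in.readUTF();
          value = in.readUTF();
        } catch (java.io.EOFException endOfLog) {
          break;
        }
        if (FLUSH_TOKEN.equals(key)) {
          memcache.clear();          // edits up to here are already on disk
        } else {
          memcache.put(key, value);
        }
      }
    } finally {
      in.close();
    }
    if (!memcache.isEmpty()) {
      flushcache();                  // persist the replayed edits
    }
    logFile.delete();                // the log is now redundant
  }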

Thus, calling flushcache() infrequently means less work, but the
HMemcache will consume more memory and the HLog will take longer
to reconstruct upon restart. If flushcache() is called
frequently, the HMemcache will take less memory, and the HLog will be
faster to reconstruct, but each flushcache() call imposes some
overhead.

The HLog is periodically rolled, so it consists of multiple
time-sorted files. Whenever we roll the HLog, it deletes all
old log files that contain only flushed data. Rolling the HLog takes
very little time and is generally a good idea to do from time to time.

Each call to flushcache() will add an additional HStoreFile to each
HStore. Fetching data from an HStore can potentially touch all of
its HStoreFiles. This is time-consuming, so we want to periodically
compact these HStoreFiles into a single larger one. This is done by
calling HStore.compact().
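
Conceptually, a compaction is just a merge of several sorted files in
which newer entries shadow older ones. An in-memory stand-in (TreeMaps in
place of on-disk HStoreFiles; not the real algorithm, which streams from
disk):

  import java.util.List;
  import java.util.TreeMap;

  public class CompactSketch {
    // files must be ordered oldest first; the result is one sorted map.
    public static TreeMap<String, String> compact(
        List<TreeMap<String, String>> files) {
      TreeMap<String, String> merged = new TreeMap<String, String>();
      for (TreeMap<String, String> f : files) {  // oldest first...
        merged.putAll(f);                        // ...so newer values overwrite
      }
      return merged;
    }
  }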

Compaction is a very expensive operation. It's done automatically at
startup, and should probably be done periodically during operation.

The Google BigTable paper has a slightly-confusing hierarchy of major
and minor compactions. We have just two things to keep in mind:

1) A "flushcache()" drives all updates out of the memory buffer into
   on-disk structures. Upon flushcache, the log-reconstruction time
   goes to zero. Each flushcache() will add a new HStoreFile to each
   HStore.

2) A "compact()" consolidates all the HStoreFiles into a single
   one. It's expensive, and is always done at startup.

Unlike BigTable, Hadoop's HBase allows no period where updates have
been "committed" but have not been written to the log. This is not
hard to add, if it's really wanted.

We can merge two HRegions into a single new HRegion by calling
HRegion.closeAndMerge(). We can split an HRegion into two smaller
HRegions by calling HRegion.closeAndSplit().

OK, to sum up so far:

1) Clients access data in tables.
2) Tables are broken into HRegions.
3) HRegions are served by HRegionServers. Clients contact an
   HRegionServer to access the data within its row-range.
4) HRegions store data in:
   a) HMemcache, a memory buffer for recent writes
   b) HLog, a write-log for recent writes
   c) HStores, an efficient on-disk set of files. One per col-group.
      (HStores use HStoreFiles.)

---------------------------------------------------------------
III.

Each HRegionServer stays in contact with the single HBaseMaster. The
HBaseMaster is responsible for telling each HRegionServer what
HRegions it should load and make available.

The HBaseMaster keeps a constant tally of which HRegionServers are
alive at any time. If the connection between an HRegionServer and the
HBaseMaster times out, then:

a) The HRegionServer kills itself and restarts in an empty state.

b) The HBaseMaster assumes the HRegionServer has died and reallocates
   its HRegions to other HRegionServers.

Note that this is unlike Google's BigTable, where a TabletServer can
still serve Tablets after its connection to the Master has died. We
tie them together, because we do not use an external lock-management
system like BigTable. With BigTable, there's a Master that allocates
tablets and a lock manager (Chubby) that guarantees atomic access by
TabletServers to tablets. HBase uses just a single central point for
all HRegionServers to access: the HBaseMaster.

(This is no more dangerous than what BigTable does. Each system is
reliant on a network structure (whether HBaseMaster or Chubby) that
must survive for the data system to survive. There may be some
Chubby-specific advantages, but that's outside HBase's goals right
now.)

As HRegionServers check in with a new HBaseMaster, the HBaseMaster
asks each HRegionServer to load in zero or more HRegions. When an
HRegionServer dies, the HBaseMaster marks those HRegions as
unallocated, and attempts to give them to different HRegionServers.

Recall that each HRegion is identified by its table name and its
key-range. Since key ranges are contiguous, and they always start and
end with NULL, it's enough to simply indicate the end-key.

Unfortunately, this is not quite enough. Because of merge() and
split(), we may (for just a moment) have two quite different HRegions
with the same name. If the system dies at an inopportune moment, both
HRegions may exist on disk simultaneously. The arbiter of which
HRegion is "correct" is the HBase meta-information (to be discussed
shortly). In order to distinguish between different versions of the
same HRegion, we also add a unique 'regionId' to the HRegion name.

Thus, we finally get to this identifier for an HRegion:

    tablename + endkey + regionId.

You can see this identifier being constructed in
HRegion.buildRegionName().
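
A stand-in for that construction (the ',' delimiter is an assumption of
this sketch; the authoritative layout is whatever HRegion.buildRegionName()
actually emits):

  public class RegionNameSketch {
    public static String buildRegionName(
        String tableName, String endKey, long regionId) {
      // e.g. "webtable,com.example.www,1190247600000"
      return tableName + "," + endKey + "," + regionId;
    }
  }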

We can also use this identifier as a row-label in a different
HRegion. Thus, the HRegion meta-info is itself stored in an
HRegion. We call this table, which maps from HRegion identifiers to
physical HRegionServer locations, the META table.

The META table itself can grow large, and may be broken into separate
HRegions. To locate all components of the META table, we list all META
HRegions in a ROOT table. The ROOT table is always contained in a
single HRegion.
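
Locating the server for a row is then two ceiling lookups: the ROOT
region finds the covering META region, and that META region finds the
user region. A toy model with invented names (real META rows are keyed by
full region names rather than bare row keys):

  import java.util.Map;
  import java.util.TreeMap;

  public class MetaLookup {
    // ROOT: end-key of each META region -> that META region's contents,
    // where a META region maps end-key of a user region -> server address.
    private final TreeMap<String, TreeMap<String, String>> root =
        new TreeMap<String, TreeMap<String, String>>();

    public String serverForRow(String row) {
      // Hop 1: which META region covers this row?
      Map.Entry<String, TreeMap<String, String>> meta = root.ceilingEntry(row);
      if (meta == null) return null;
      // Hop 2: which HRegionServer serves the user region covering the row?
      Map.Entry<String, String> region = meta.getValue().ceilingEntry(row);
      return region == null ? null : region.getValue();
    }
  }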

Upon startup, the HBaseMaster immediately attempts to scan the ROOT
table (because there is only one HRegion for the ROOT table, that
HRegion's name is hard-coded). It may have to wait for the ROOT table
to be allocated to an HRegionServer.

Once the ROOT table is available, the HBaseMaster can scan it and
learn of all the META HRegions. It then scans the META table. Again,
the HBaseMaster may have to wait for all the META HRegions to be
allocated to different HRegionServers.

Finally, when the HBaseMaster has scanned the META table, it knows the
entire set of HRegions. It can then allocate these HRegions to the set
of HRegionServers.

The HBaseMaster keeps the set of currently-available HRegionServers in
memory. Since the death of the HBaseMaster means the death of the
entire system, there's no reason to store this information on
disk. All information about the HRegion->HRegionServer mapping is
stored physically in different tables. Thus, a client does not need to
contact the HBaseMaster after it learns the location of the ROOT
HRegion. The load on the HBaseMaster should be relatively small: it
deals with timing out HRegionServers, scanning the ROOT and META
HRegions upon startup, and serving the location of the ROOT HRegion.

The HClient is fairly complicated, and often needs to navigate the
ROOT and META HRegions when serving a user's request to scan a
specific user table. If an HRegionServer is unavailable or does not
have an HRegion it should have, the HClient will wait and retry. At
startup or in case of a recent HRegionServer failure, the correct
mapping info from HRegion to HRegionServer may not always be
available.
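
That wait-and-retry behavior is, at heart, a bounded retry loop around a
lookup. A generic sketch (the attempt count and pause are arbitrary
choices, not HClient's actual policy):

  public class RetrySketch {
    interface Call<T> { T run() throws Exception; }

    // Assumes attempts >= 1.
    static <T> T withRetries(Call<T> call, int attempts, long pauseMillis)
        throws Exception {
      Exception last = null;
      for (int i = 0; i < attempts; i++) {
        try {
          return call.run();  // e.g. "find the HRegionServer for this row"
        } catch (Exception e) {
          last = e;           // the region may be moving; wait and try again
          Thread.sleep(pauseMillis);
        }
      }
      throw last;             // the mapping never became available
    }
  }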

In summary:

1) HRegionServers offer access to HRegions (an HRegion lives at one
   HRegionServer).
2) HRegionServers check in with the HBaseMaster.
3) If the HBaseMaster dies, the whole system dies.
4) The set of current HRegionServers is known only to the HBaseMaster.
5) The mapping between HRegions and HRegionServers is stored in two
   special HRegions, which are allocated to HRegionServers like any
   other.
6) The ROOT HRegion is a special one, the location of which the
   HBaseMaster always knows.
7) It's the HClient's responsibility to navigate all this.

---------------------------------------------------------------
IV.

What's the current status of all this code?

As of this writing, there is just shy of 7000 lines of code in the
"hbase" directory.

All of the single-machine operations (safe-committing, merging,
splitting, versioning, flushing, compacting, log-recovery) are
complete, have been tested, and seem to work great.

The multi-machine stuff (the HBaseMaster, the HRegionServer, and the
HClient) has not been fully tested. The reason is that the HClient is
still incomplete, so the rest of the distributed code cannot be
fully tested. I think it's good, but can't be sure until the HClient
is done. However, the code is now very clean and in a state where
other people can understand it and contribute.

Other related features and TODOs:

1) Single-machine log reconstruction works great, but distributed log
   recovery is not yet implemented. This is relatively easy, involving
   just a sort of the log entries, then placing the shards into the
   right DFS directories.

2) Data compression is not yet implemented, but there is an obvious
   place to do so in the HStore.

3) We need easy interfaces to MapReduce jobs, so they can scan tables.

4) The HMemcache lookup structure is relatively inefficient.

5) File compaction is relatively slow; we should have a more
   conservative algorithm for deciding when to apply compaction.

6) For the getFull() operation, use of Bloom filters would speed
   things up.

7) We need stress-test and performance-number tools for the whole
   system.

8) There's some HRegion-specific testing code that worked fine during
   development, but it has to be rewritten so it works against an
   HRegion while it's hosted by an HRegionServer and connected to an
   HBaseMaster. This code is at the bottom of the HRegion.java file.

See http://wiki.apache.org/lucene-hadoop/Hbase

bin/hbase
@@ -127,8 +127,16 @@ fi
 IFS=

 # for releases, add core hbase, hadoop jar & webapps to CLASSPATH
+# Look in two places for our hbase jar.
+for f in $HBASE_HOME/hadoop-*-hbase*.jar; do
+  if [ -f $f ]; then
+    CLASSPATH=${CLASSPATH}:$f;
+  fi
+done
 for f in $HADOOP_HOME/contrib/hadoop-*-hbase*.jar; do
-  CLASSPATH=${CLASSPATH}:$f;
+  if [ -f $f ]; then
+    CLASSPATH=${CLASSPATH}:$f;
+  fi
 done
 if [ -d "$HADOOP_HOME/webapps" ]; then
   CLASSPATH=${CLASSPATH}:$HADOOP_HOME

bin/hbase-config.sh
@@ -81,4 +81,4 @@ HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-$HADOOP_HOME/conf}"
 # Allow alternate hbase conf dir location.
 HBASE_CONF_DIR="${HBASE_CONF_DIR:-$HBASE_HOME/conf}"
 # List of hbase region servers.
-HBASE_REGIONSERVERS="${HBASE_REGIONSERVERS:-$HBASE_HOME/conf/regionservers}"
+HBASE_REGIONSERVERS="${HBASE_REGIONSERVERS:-$HBASE_CONF_DIR/regionservers}"

bin/hbase-daemon.sh
@@ -33,7 +33,9 @@
 #
 # Modelled after $HADOOP_HOME/bin/hadoop-daemon.sh

-usage="Usage: hbase-daemon.sh [--config=<hadoop-conf-dir>] [--hbaseconfig=<hbase-conf-dir>] <hbase-command> (start|stop) <args...>"
+usage="Usage: hbase-daemon.sh [--config=<hadoop-conf-dir>]\
+ [--hbaseconfig=<hbase-conf-dir>] <hbase-command> (start|stop)\
+ <args...>"

 # if no args specified, show usage
 if [ $# -le 1 ]; then
@@ -114,7 +116,10 @@ case $startStop in

     hbase_rotate_log $log
     echo starting $command, logging to $log
-    nohup nice -n $HADOOP_NICENESS "$HBASE_HOME"/bin/hbase --config="${HADOOP_CONF_DIR}" --hbaseconfig="${HBASE_CONF_DIR}" $command $startStop "$@" > "$log" 2>&1 < /dev/null &
+    nohup nice -n $HADOOP_NICENESS "$HBASE_HOME"/bin/hbase \
+      --hadoop="${HADOOP_HOME}" \
+      --config="${HADOOP_CONF_DIR}" --hbaseconfig="${HBASE_CONF_DIR}" \
+      $command $startStop "$@" > "$log" 2>&1 < /dev/null &
     echo $! > $pid
     sleep 1; head "$log"
     ;;
@@ -123,7 +128,10 @@ case $startStop in
   if [ -f $pid ]; then
     if kill -0 `cat $pid` > /dev/null 2>&1; then
       echo -n stopping $command
-      nohup nice -n $HADOOP_NICENESS "$HBASE_HOME"/bin/hbase --config="${HADOOP_CONF_DIR}" --hbaseconfig="${HBASE_CONF_DIR}" $command $startStop "$@" > "$log" 2>&1 < /dev/null &
+      nohup nice -n $HADOOP_NICENESS "$HBASE_HOME"/bin/hbase \
+        --hadoop="${HADOOP_HOME}" \
+        --config="${HADOOP_CONF_DIR}" --hbaseconfig="${HBASE_CONF_DIR}" \
+        $command $startStop "$@" > "$log" 2>&1 < /dev/null &
       while kill -0 `cat $pid` > /dev/null 2>&1; do
         echo -n "."
         sleep 1;

bin/hbase-daemons.sh
@@ -23,7 +23,10 @@
 # Run a Hadoop hbase command on all slave hosts.
 # Modelled after $HADOOP_HOME/bin/hadoop-daemons.sh

-usage="Usage: hbase-daemons.sh [--config=<confdir>] [--hbaseconfig=<hbase-confdir>] [--hosts=regionserversfile] command [start|stop] args..."
+usage="Usage: hbase-daemons.sh [--hadoop=<hadoop-home>]\
+ [--config=<hadoop-confdir>] [--hbase=<hbase-home>]\
+ [--hbaseconfig=<hbase-confdir>] [--hosts=regionserversfile]\
+ command [start|stop] args..."

 # if no args specified, show usage
 if [ $# -le 1 ]; then
@@ -36,4 +39,8 @@ bin=`cd "$bin"; pwd`

 . $bin/hbase-config.sh

-exec "$bin/regionservers.sh" --config="${HADOOP_CONF_DIR}" --hbaseconfig="${HBASE_CONF_DIR}" cd "$HBASE_HOME" \; "$bin/hbase-daemon.sh" --config="${HADOOP_CONF_DIR}" --hbaseconfig="${HBASE_CONF_DIR}" "$@"
+exec "$bin/regionservers.sh" --config="${HADOOP_CONF_DIR}" \
+  --hbaseconfig="${HBASE_CONF_DIR}" --hadoop="${HADOOP_HOME}" \
+  cd "${HBASE_HOME}" \; \
+  "$bin/hbase-daemon.sh" --config="${HADOOP_CONF_DIR}" \
+  --hbaseconfig="${HBASE_CONF_DIR}" --hadoop="${HADOOP_HOME}" "$@"

bin/regionservers.sh
@@ -33,7 +33,8 @@
 #
 # Modelled after $HADOOP_HOME/bin/slaves.sh.

-usage="Usage: regionservers [--config=<confdir>] [--hbaseconfig=<hbase-confdir>] command..."
+usage="Usage: regionservers [--config=<hadoop-confdir>]\
+ [--hbaseconfig=<hbase-confdir>] command..."

 # if no args specified, show usage
 if [ $# -le 0 ]; then

bin/start-hbase.sh
@@ -38,5 +38,8 @@ if [ $errCode -ne 0 ]
 then
   exit $errCode
 fi
-"$bin"/hbase-daemon.sh --config="${HADOOP_CONF_DIR}" --hbaseconfig="${HBASE_CONF_DIR}" master start
-"$bin"/hbase-daemons.sh --config="${HADOOP_CONF_DIR}" --hbaseconfig="${HBASE_CONF_DIR}" regionserver start
+"$bin"/hbase-daemon.sh --config="${HADOOP_CONF_DIR}" \
+  --hbaseconfig="${HBASE_CONF_DIR}" master start
+"$bin"/hbase-daemons.sh --config="${HADOOP_CONF_DIR}" \
+  --hbaseconfig="${HBASE_CONF_DIR}" --hadoop="${HADOOP_HOME}" \
+  --hosts="${HBASE_REGIONSERVERS}" regionserver start

bin/stop-hbase.sh
@@ -29,4 +29,5 @@ bin=`cd "$bin"; pwd`

 . "$bin"/hbase-config.sh

-"$bin"/hbase-daemon.sh --config="${HADOOP_CONF_DIR}" --hbaseconfig="${HBASE_CONF_DIR}" master stop
+"$bin"/hbase-daemon.sh --config="${HADOOP_CONF_DIR}" \
+  --hbaseconfig="${HBASE_CONF_DIR}" master stop