18208 Commits

Author SHA1 Message Date
Michael Stack ea418aebbd HADOOP-1835 Updated Documentation for HBase setup/installation
M    hbase/conf/hbase-env.sh
    Removed JAVA_HOME references.
M    hbase/src/java/org/apache/hadoop/hbase/package.html
    Improved setup instruction


git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@572983 13f79535-47bb-0310-9956-ffa450edef68
2007-09-05 16:12:39 +00:00
Michael Stack 1d19158ac5 HADOOP-1834 Scanners ignore timestamp passed on creation
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/HBaseTestCase.java
    (addContents): Added overrides that allow specifying a timestamp.
M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestTimestamp.java
    Made it so test inherits from HBaseTestCase instead of from HBaseClusterTestCase
    so could add in tests that do not use cluster.
    (testTimestampScanning): Added test for hadoop-1834 bug.
    (testTimestamp): Refactored to remove duplicated code.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java
    (getNext): Make it respect the timestamp set on construction.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMemcache.java
    Removed eclipse yellow flag warnings around empty parens and
    auto-boxing longs.
    (getNext): Make it respect the timestamp set on construction.


git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@572980 13f79535-47bb-0310-9956-ffa450edef68
2007-09-05 16:00:01 +00:00
Jim Kellerman 49d4f333f4 HADOOP-1832 listTables() returns duplicate tables
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@572826 13f79535-47bb-0310-9956-ffa450edef68
2007-09-04 22:26:02 +00:00
Jim Kellerman 844e56e704 HADOOP-1821 Replace all String.getBytes() with String.getBytes("UTF-8")
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@571711 13f79535-47bb-0310-9956-ffa450edef68
2007-09-01 06:22:01 +00:00
Jim Kellerman a1689adf0e HADOOP-1760 Use new MapWritable and SortedMapWritable classes from org.apache.hadoop.io
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@571350 13f79535-47bb-0310-9956-ffa450edef68
2007-08-31 00:37:46 +00:00
Jim Kellerman aba565a228 HADOOP-1797 Fix NPEs in MetaScanner constructor
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@571333 13f79535-47bb-0310-9956-ffa450edef68
2007-08-30 22:12:45 +00:00
Jim Kellerman 12a62a6333 HADOOP-1814 TestCleanRegionServerExit fails too often on Hudson
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@571332 13f79535-47bb-0310-9956-ffa450edef68
2007-08-30 22:09:13 +00:00
Michael Stack 7d48b5f7d5 HADOOP-1800 [hbaseshell] output should default utf8 encoding
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/shell/ConsoleTable.java
    Make a PrintStream that outputs utf8. Have all printing use it.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/shell/SelectCommand.java
    Fix few places where we make Strings w/o stipulating UTF-8 encoding.



git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@571317 13f79535-47bb-0310-9956-ffa450edef68
2007-08-30 21:28:25 +00:00
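A minimal sketch of the idea behind HADOOP-1800 above, not the actual ConsoleTable code: build a PrintStream with an explicit encoding so shell output does not depend on the platform default. The class name Utf8Console and the helper encode are hypothetical, introduced only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Console {
    // Wrap any stream in a PrintStream that always encodes text as UTF-8.
    static PrintStream utf8Stream(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: the bytes that printing s through the stream emits.
    static byte[] encode(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream ps = utf8Stream(buf);
        ps.print(s);
        ps.flush();
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        // All printing goes through the one UTF-8 stream.
        utf8Stream(System.out).println("caf\u00e9");
    }
}
```

Routing every print call through one such stream keeps the encoding decision in a single place.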
Michael Stack 463f847bd4 HADOOP-1802 Startup scripts should wait until hdfs has cleared 'safe mode'
M    bin/start-hbase.sh
    Wait to exit 'safe mode' before proceeding w/ startup. If the wait
    fails, usually because there is no fs, exit with error.



git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@571211 13f79535-47bb-0310-9956-ffa450edef68
2007-08-30 15:39:11 +00:00
Michael Stack 59825ac875 HADOOP-1785 TableInputFormat.TableRecordReader.next has a bug
M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestTableMapReduce.java
    (localTestSingleRegionTable, localTestMultiRegionTable, verify): Added.
M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/HBaseTestCase.java
    Javadoc for addContents and Loader interface and implementations.
    Methods have been made static so accessible w/o subclassing.
M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/MultiRegionTable.java
    Guts of TestSplit has been moved here so other tests can have
    access to a multiregion table.
M  src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestSplit.java
    Bulk moved to MultiRegionTable utility class.  Use this new class
    instead.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HTable.java
    Added '@deprecated' javadoc.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java
    Was throwing RuntimeException when a msgQueue.put was interrupted
    but this is a likely event on shutdown.  Log a message instead.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/mapred/TableInputFormat.java
    Actual fix for HADOOP-1785... reverse test of row comparison.


git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@570983 13f79535-47bb-0310-9956-ffa450edef68
2007-08-29 23:39:52 +00:00
Jim Kellerman 5dc1ee32c0 HADOOP-1805 Region server hang on exit
Catch runtime exceptions in HMemcacheScanner constructor to ensure that read lock is released.

git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@570918 13f79535-47bb-0310-9956-ffa450edef68
2007-08-29 20:06:41 +00:00
Michael Stack 47587a272f HADOOP-1799 Incorrect classpath in binary version of Hadoop
M    bin/hbase
    Had a hard-coded name for the hbase jar.  Fix so allows
    for version in jar name.


git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@570633 13f79535-47bb-0310-9956-ffa450edef68
2007-08-29 04:30:58 +00:00
Jim Kellerman 660bce1d27 HADOOP-1797 Fix NPEs in MetaScanner constructor
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@570583 13f79535-47bb-0310-9956-ffa450edef68
2007-08-28 22:08:56 +00:00
Jim Kellerman f56ee6b375 HADOOP-1757 Bloomfilters: single argument constructor, use enum for bloom filter types
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@570270 13f79535-47bb-0310-9956-ffa450edef68
2007-08-27 23:35:15 +00:00
Michael Stack e9aafde1f1 HADOOP-1780 Regions are still being doubly assigned
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java
    Fix outputting a fail message on each compaction though there was no failure.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStoreFile.java
    (rename): Refactor so return only happens at end.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java
    (assignRegions): Make synchronized.  In presence of concurrent visits
    by regionservers, both visiting threads could grab same set of regions
    for assignment.



git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@569589 13f79535-47bb-0310-9956-ffa450edef68
2007-08-25 00:42:45 +00:00
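The HADOOP-1780 fix above (making assignRegions synchronized) can be sketched as follows. This is an illustrative model only, not the HMaster code: the class RegionAssigner, the batch size, and the driver method noDoubleAssignment are all hypothetical. It shows why picking regions and removing them from the unassigned set must happen under one lock when several region servers check in at once.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RegionAssigner {
    private final Set<String> unassigned = new HashSet<String>();

    RegionAssigner(List<String> regions) {
        unassigned.addAll(regions);
    }

    // synchronized: picking regions and removing them is one atomic step,
    // so two concurrent callers can never grab the same region.
    synchronized List<String> assignRegions(int max) {
        List<String> picked = new ArrayList<String>();
        for (String region : unassigned) {
            if (picked.size() >= max) {
                break;
            }
            picked.add(region);
        }
        unassigned.removeAll(picked);
        return picked;
    }

    // Keep asking for regions until none are left.
    static void drain(RegionAssigner assigner, List<String> out) {
        List<String> got;
        while (!(got = assigner.assignRegions(3)).isEmpty()) {
            out.addAll(got);
        }
    }

    // Hypothetical driver: two "region server" threads request regions
    // concurrently; returns true if every region was handed out exactly once.
    static boolean noDoubleAssignment(int regionCount) {
        List<String> regions = new ArrayList<String>();
        for (int i = 0; i < regionCount; i++) {
            regions.add("region-" + i);
        }
        final RegionAssigner assigner = new RegionAssigner(regions);
        final List<String> a = new ArrayList<String>();
        final List<String> b = new ArrayList<String>();
        Thread t1 = new Thread(() -> drain(assigner, a));
        Thread t2 = new Thread(() -> drain(assigner, b));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        Set<String> all = new HashSet<String>(a);
        all.addAll(b);
        return a.size() + b.size() == regionCount && all.size() == regionCount;
    }

    public static void main(String[] args) {
        System.out.println(noDoubleAssignment(100));
    }
}
```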
Michael Stack 07224fe303 HADOOP-1768 Hbase shell FS command using Hadoop FsShell operations
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/shell/HelpContents.java
    Add mention of new 'fs' operator.
A src/contrib/hbase/src/java/org/apache/hadoop/hbase/shell/FsCommand.java
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/shell/HBaseShell.jj
    Added support of new 'fs' operator.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/shell/generated/ParserTokenManager.java
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/shell/generated/ParserConstants.java
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/shell/generated/Parser.java
    Generated files.  Changes come of mods to HBaseShell.jj
    


git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@569484 13f79535-47bb-0310-9956-ffa450edef68
2007-08-24 19:32:29 +00:00
Michael Stack 5e3c5037b4 HADOOP-1776 Fix for sporadic compaction failures closing and moving compaction
result

M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HConnectionManager.java
    Minor fix of a log message.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java
    (COMPACTION_DIR, WORKING_COMPACTION): Removed.
    (compactdir): Renamed compactionDir.
    Removed from constructor our checking if a compaction was left undone.
    Instead, just ignore it.  When compaction reruns whatever was left on 
    filesystem will just be cleaned up and we'll rerun the compaction 
    (Likelihood of a crash mid-compaction in exactly the area where
    the compaction was recoverable are low -- more robust just redoing
    the compaction from scratch).
    (compactHelper): We were deleting HBaseRoot/compaction.tmp dir
    after a compaction completed. Usually fine but on a cluster of
    more than one machine, if two compactions were near-concurrent, one
    machine could remove the compaction working directory while another
    was mid-way through its compaction.  Result was odd failures
    during compaction of result file, during the move of the resulting
    compacting file or subsequently trying to open reader on the
    resulting compaction file (See HADOOP-1765).
    a region fsck tool).
    (getFilesToCompact): Added.
    (processReadyCompaction): Added.  Reorganized compaction so that the
    window during which loss-of-data is possible is narrowed and even
    then, we log a message with how a restore might be performed manually
    (TODO: Add a repair tool).
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStoreFile.java
    (rename): More checking around rename that it was successful.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HLog.java
    An empty-log gives HLog trouble.  Added handling.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java
    Cleanup of debug level logging.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
    Minor javadoc and changed a log from info to debug.


git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@569446 13f79535-47bb-0310-9956-ffa450edef68
2007-08-24 16:24:40 +00:00
Doug Cutting d5cc43d394 HADOOP-1689. Make shell scripts more portable.
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@568809 13f79535-47bb-0310-9956-ffa450edef68
2007-08-23 02:19:18 +00:00
Jim Kellerman 1756a22f03 HADOOP-1746 Clean up findbugs warnings
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@568776 13f79535-47bb-0310-9956-ffa450edef68
2007-08-22 23:59:30 +00:00
Doug Cutting cf43495417 HADOOP-1731. Add Hadoop's version number to contrib jar file names.
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@568706 13f79535-47bb-0310-9956-ffa450edef68
2007-08-22 17:13:08 +00:00
Jim Kellerman ccd9248e63 HADOOP-1527 Region server won't start because logdir exists
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@568700 13f79535-47bb-0310-9956-ffa450edef68
2007-08-22 16:59:43 +00:00
Michael Stack ec2d29c902 HADOOP-1747 On a cluster, on restart, regions multiply assigned
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java
    Removed some empty lines so I can squeeze more code into a screenful.
    (assignedRegions): Factored out some code into own methods so
    this method is made a bit shorter.  Added early returns near
    top -- if nothing to assign, etc. -- so less nesting.
    Added fix: Instead of iterating over unassignedRegions after
    all the loadings have been calculated, instead iterate over
    the locally calculated  map, regionsToAssign (Otherwise, we
    were running over the same territory each time through the
    loop and were thus giving out same region multiple times).
    (regionsPerServer, assignRegionsToOneServer,
      getRegionsToAssign): Added.


git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@568404 13f79535-47bb-0310-9956-ffa450edef68
2007-08-22 04:05:08 +00:00
Michael Stack e41859593b HADOOP-1737 Make HColumnDescriptor data members publicly settable
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java
  Use new HColumnDescriptor accessors rather than make direct accesses
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HColumnDescriptor.java
  (COMPRESSION_NONE, COMPRESSION_RECORD, COMPRESSION_BLOCK): Removed.
  Use enum ordinals instead. Removed mapping between these defines and
  enum equivalents. Made data members private and added accessors.
  (DEFAULT_IN_MEMORY, DEFAULT_COMPRESSION_TYPE,
  DEFAULT_BLOOM_FILTER_DESCRIPTOR, DEFAULT_MAX_VALUE_LENGTH): Added.
M hbase/src/test/org/apache/hadoop/hbase/TestToString.java
  Fix because enum strings are upper-case (was expecting lowercase).


git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@568275 13f79535-47bb-0310-9956-ffa450edef68
2007-08-21 20:53:24 +00:00
Jim Kellerman 4ebd558b41 HADOOP-1723 If master asks region server to shut down, by-pass return of shutdown message
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@567864 13f79535-47bb-0310-9956-ffa450edef68
2007-08-20 22:35:27 +00:00
Michael Stack 00c1b877e8 HADOOP-1730 unexpected null value causes META scanner to exit (silently)
Added handling for legal null value scanning META table and added
logging of unexpected exceptions that arise scanning.

M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestSplit.java
    Refactored to do a staged removal of daughter references.
    (compact, recalibrate): Added.
    (getSplitParent): Refactored as getSplitParentInfo.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HConnectionManager.java
    Added formatting of the find table result string so shorter
    (when 30-odd regions fills page with its output).
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HTable.java
    Formatting to clean eclipse warnings.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java
    The split column in a parent meta table entry can be null (Happens
    if a daughter split no longer has references -- it removes its
    entry from parent).  Add handling and clean up around split
    management code.  Added logging of unexpected exceptions
    scanning a region.
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
    Added fix for NPE when client asks for scanner but passes
    non-existent columns.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/util/Writables.java
    (getHRegionInfo, getHRegionInfoOrNull): Added.



git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@567308 13f79535-47bb-0310-9956-ffa450edef68
2007-08-18 18:04:53 +00:00
Michael Stack 17cc1759fc HADOOP-1729 Recent renaming of META tables breaks hbase shell
M src/java/org/apache/hadoop/hbase/shell/HBaseShell.jj
    Add '.' to list of ID characters.



git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@567292 13f79535-47bb-0310-9956-ffa450edef68
2007-08-18 16:29:09 +00:00
Jim Kellerman b7f01dce98 HADOOP-1709 Make HRegionInterface more like that of HTable
HADOOP-1725 Client find of table regions should not include offlined, split parents

Changes:

New class MapWritable replaces KeyedData and KeyedDataArrayWritable

HBaseAdmin, HConnectionManager, HMaster, HRegionInterface,
HRegionServer, HTable, TestScanner2:
- getRow returns MapWritable instead of array of KeyedData
- next returns MapWritable instead of array of KeyedData

GroupingTableMap, IdentityTableMap, IdentityTableReduce,
TableInputFormat, TableMap, TableOutputCollector, TableOutputFormat,
TestTableMapReduce:
- use MapWritable instead of KeyedData and KeyedDataArrayWritable


git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@566878 13f79535-47bb-0310-9956-ffa450edef68
2007-08-16 22:51:03 +00:00
Doug Cutting e220809017 HADOOP-1231. Add generics to Mapper and Reducer interfaces. Contributed by Tom White.
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@566798 13f79535-47bb-0310-9956-ffa450edef68
2007-08-16 18:45:49 +00:00
Michael Stack 9dedf26b0f HADOOP-1672 HBase Shell should use new client classes
Use HTable and HTableAdmin to do what HClient used (Removed all
references to HClient).



git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@566467 13f79535-47bb-0310-9956-ffa450edef68
2007-08-16 01:56:29 +00:00
Michael Stack be33a241ce HADOOP-1644 [hbase] Compactions should not block updates
Disentangles flushes and compactions; flushes can proceed while a
compaction is happening.  Also, don't compact unless we hit
compaction threshold: i.e. don't automatically compact on HRegion
startup so regions can come online faster.

M src/contrib/hbase/conf/hbase-default.xml
    (hbase.hregion.compactionThreashold): Moved to be a hstore property
    as part of encapsulating compaction decision inside hstore.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/HBaseTestCase.java
    Refactored.  Moved here generalized content loading code that can
    be shared by tests.  Add to setup and teardown the setup and removal
    of local test dir (if it exists).
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestCompare.java
    Added test of HStoreKey compare (It works differently than one would at
    first expect).
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestSplit.java
    Bulk of content loading code has been moved up into the parent class.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HConnectionManager.java
    (tableExists): Restore to a check of if the asked-for table is in list of
    tables.  As it was, a check for tableExists would just wait on all timeouts
    and retries to expire and then report table does not exist.  Fixed up
    debug message listing regions of a table.  Added protection against meta
    table not having a COL_REGINFO (Seen in cluster testing -- probably a bug
    in row removal).
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStoreFile.java
    Loading store files, even if it was noticed that there was no corresponding
    map file, was still counting file as valid.  Also fix merger -- was
    constructing MapFile.Reader directly rather than asking HStoreFile for
    the reader (HStoreFile knows how to do MapFile references)
    (rename): Added check that move succeeded and logging.  In cluster-testing,
    the hdfs move of compacted file into place has failed on occasion (Need
    more info).
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java
    Encapsulate ruling on whether a compaction should take place inside HStore.
    Added reading of the compactionThreshold here.  Compaction threshold is
    currently just number of store files.  Later may include other factors such
    as count of reference files.  Cleaned up debug messages around
    reconstruction log.  Removed compaction if size > 1 from constructor.  Let
    compaction happen after we've been deployed (Compactions that happen while
    we are online can continue to take updates.  Compaction in the constructor
    puts off our being able to take in updates).
    (close): Changed so it now returns set of store files.  This used to be done
    by calls to flush. Since flush and compaction have been disentangled, a
    compaction can come in after flush and the list of files could be off.
    Having it done by close, can be sure list of files is complete.
    (flushCache): No longer returns set of store files.  Added 'merging compaction'
    where we pick an arbitrary store file from disk and merge into it the content
    of memcache (Needs work).
    (getAllMapFiles): Renamed getAllStoreFiles.
    (needsCompaction): Added.
    (compactHelper): Added passing of maximum sequence number if already
    calculated. If compacting one file only, we used to skip without rewriting
    the info file.  Fixed.
    Refactored.  Moved guts to new  compact(outFile, listOfStores)  method.
    (compact, CompactionReader): Added overrides and interface  to support
    'merging compaction' that takes files and memcache.  In compaction,
    if we failed the move of the compacted file, all data had already been
    deleted.  Changing, so deletion happens after confirmed move of
    compacted file.
    (getFull): Fixed bug where NPE when read of maps came back null.
    Revealed by our NOT compacting stores on startup.  Meant could be two
    backing stores one of which had no data regards queried key.
    (getNMaps): Renamed countOfStoreFiles.
    (toString): Added.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStoreKey.java
    Added comment on 'odd'-looking comparison.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java
    Javadoc edit. 
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HLogEdit.java
    Only return first 128 bytes of value when toStringing (On cluster,
    was returning complete web pages in log).
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java
    Removed confusing debug message (made sense once -- but not now).
    Test rootRegionLocation for null before using it (can be null).
M  src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMemcache.java
    Added comment that delete behavior needs study.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
    Fixed merge so it doesn't do the incremental based off files
    returned by flush.  Instead all is done in the one go after
    region closes (using files returned by close).
    Moved duplicated code to new filesByFamily method.
    (WriteState): Removed writesOngoing in favor of compacting and
    flushing flags.
    (flushCache): No longer returns list of files.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/util/Writables.java
    Fix javadoc.


git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@566459 13f79535-47bb-0310-9956-ffa450edef68
2007-08-16 01:07:51 +00:00
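The ordering described in the HADOOP-1644 entry above (delete the old store files only after the compacted file is confirmed to be in place) can be sketched like this. It is an assumption-laden illustration: it uses the local java.nio.file API as a stand-in for the Hadoop filesystem, and the class SafeCompactionMove and its methods are hypothetical, not HStore code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class SafeCompactionMove {
    // Move the compacted file into place first; remove the inputs only
    // once the destination is confirmed to exist.
    static boolean moveThenDelete(Path compacted, Path dest, List<Path> inputs)
            throws IOException {
        Files.move(compacted, dest);
        if (!Files.exists(dest)) {
            return false; // move not confirmed: leave the inputs untouched
        }
        for (Path input : inputs) {
            Files.deleteIfExists(input);
        }
        return true;
    }

    // Hypothetical demo in a temporary directory; returns true when the
    // compacted file is in place and the inputs are gone.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("compaction-sketch");
            Path in1 = Files.write(dir.resolve("store-1"), new byte[] {1});
            Path in2 = Files.write(dir.resolve("store-2"), new byte[] {2});
            Path compacted = Files.write(dir.resolve("compacted.tmp"), new byte[] {1, 2});
            Path dest = dir.resolve("store-3");
            boolean moved = moveThenDelete(compacted, dest, Arrays.asList(in1, in2));
            return moved && Files.exists(dest)
                    && !Files.exists(in1) && !Files.exists(in2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the move fails, the inputs are still on disk, so the compaction can simply be rerun.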
Jim Kellerman 6c8e713671 HADOOP-1711 HTable API should use interfaces instead of concrete classes as method parameters and return values
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@566291 13f79535-47bb-0310-9956-ffa450edef68
2007-08-15 18:20:53 +00:00
Jim Kellerman 40e6f64209 HADOOP-1710 - changes.txt
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@565995 13f79535-47bb-0310-9956-ffa450edef68
2007-08-15 01:18:47 +00:00
Jim Kellerman 391379a6fb HADOOP-1710 All updates should be batch updates
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@565993 13f79535-47bb-0310-9956-ffa450edef68
2007-08-15 01:12:29 +00:00
Jim Kellerman 0c7ac6795f HADOOP-1678 On region split, master should designate which host should serve daughter splits. Phase 2: Master assigns children of split region instead of HRegionServer serving both children.
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@565616 13f79535-47bb-0310-9956-ffa450edef68
2007-08-14 03:37:01 +00:00
Jim Kellerman 931d452cb2 HADOOP-1678 On region split, master should designate which host should serve daughter splits.
Phase 1: Master balances load for new regions and when a region server fails.

git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@564780 13f79535-47bb-0310-9956-ffa450edef68
2007-08-10 22:11:05 +00:00
Michael Stack 790e3d767e HADOOP-1662 Make region splits faster
Splits are now near-instantaneous.  On split, daughter splits create
'references' to store files up in the parent region using new 'HalfMapFile'
class to proxy accesses against the top-half or bottom-half of 
backing MapFile.  Parent region is deleted after all references in daughter
regions have been let go.

Below includes other cleanups and at least one bug fix for failures adding
>32k records and improvements to make it more likely TestRegionServerAbort
will complete.

A src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestHStoreFile.java
    Added. Tests new Reference HStoreFiles. Test new HalfMapFileReader inner
    class of HStoreFile. Test that we do the right thing when HStoreFiles
    are smaller than a MapFile index range (i.e. there is no 'MidKey').
    Test we do right thing when key is outside of a HalfMapFile.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/HBaseTestCase.java
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestGet.java
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestScanner.java
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestTimestamp.java
    getHRegionDir moved from HStoreFile to HRegion.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestBatchUpdate.java
    Let out exception rather than catch and call 'fail'.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/MiniHBaseCluster.java
    Refactored so can start and stop a minihbasecluster w/o having to
    subclass this TestCase. Refactored methods in this class to use the
    newly added methods listed below.
    (MasterThread, RegionServerThread, startMaster, startRegionServers,
      shutdown): Added.
    Added logging of abort, close and wait.  Also on abort/close
    was doing a remove that made it so subsequent wait had nothing to
    wait on.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestSplit.java
    Added tests that assert all works properly at region level on
    multiple levels of splits and then do same on a cluster.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestHRegion.java
    Removed catch and 'fail()'.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStoreFile.java
    Javadoc to explain how split now works. Have constructors flow
    into each other rather than replicate setup per instance. Moved
    in here operations such as delete, rename, and length of store files
    (No need of clients to remember to delete map and info files).
    (REF_NAME_PARSER, Reference, HalfMapFile, isReference,
      writeReferenceFiles, writeSplitInfo, readSplitInfo,
      createOrFail, getReader, getWriter, toString): Added.
    (getMapDir, getMapFilePath, getInfoDir, getInfoFilePath): Added
    a bunch of overrides for reference handling.
    (loadHStoreFiles): Amended to load references off disk.
    (splitStoreFiles): Redone to instead write references into
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java
    Rename maps as readers and mapFiles as storefiles.
    Moved BloomFilterReader and Writer into HStoreFile. Removed
    getMapFileReader and getMapFileWriter (They are in HStoreFile now).
    (getReaders): Added.
    (HStoreSize): Added.  Data Structure to hold aggregated size
    of all HStoreFiles in HStore, the largest, its midkey, and
    if the HStore is splitable (May not be if references).
    Previously we only did largest file; less accurate.
    (getLargestFileSize): Renamed size and redone to aggregate
    sizes, etc.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HColumnDescriptor.java
    Have constructors waterfall down through each other rather than
    repeat initializations.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMerge.java
    Use new HStoreSize structure.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java
    Added delayed remove of HRegion (Now done in HMaster as part of
    meta scan). Change LOG.error and LOG.warn so they throw stack trace
    instead of just the Exception.toString as message.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HConstants.java
    (COLUMN_FAMILY_STR): Added.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HLog.java
    Added why to log of splitting.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HLogEdit.java
    Short is not big enough to hold edits that could contain a sizable
    web page.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HTable.java
    (getTableName): Added.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java
    Added constructor to BaseScanner that takes name of table we're
    scanning (ROOT or META usually). Added to scanOneRegion handling
    of split regions.  Collect splits to check while scanning and
    then outside of the scanning, so we can modify the META table
    if needed, do the checks of daughter regions and update on
    change of state.  Made LOG.warn and LOG.error print stack trace.
    (isSplitParent, cleanupSplits, hasReferences): Added. 
    Added toString to each of the PendingOperation implementations.
    In the ShutdownPendingOperation scan of meta data, removed
    check of startcode (if the server name is that of the dead
    server, it needs reassigning even if start code is good).
    Also, if server name is null -- possible if we are missing
    edits off end of log -- then the region should be reassigned
    just in case it's from the dead server.  Also, if reassigning,
    clear from pendingRegions.  Server may have died after sending
    region is up but before the server confirms receipt in the
    meta scan. Added more detail to each log.  In OpenPendingOperation
    we were trying to clear pendingRegion in wrong place -- it was
    never executed (regions were always pending). 
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInfo.java
    Add split boolean.  Output offline and split status in toString.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMemcache.java
    Comments.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
    Moved getRegionDir here from HStoreFile.
    (COL_SPLITA, COL_SPLITB): Added.
    (closeAndSplit): Refactored to use new fast split method.
       StringUtils.formatTimeDiff(System.currentTimeMillis(), startTime));
    (splitStoreFile): Moved into HStoreFile.
    (getSplitRegionDir, getSplitsDir, toString): Added.
    (needsSplit): Refactored to exploit new HStoreSize structure.
    Also manages notion of 'unsplitable' region.
    (largestHStore): Refactored.
    (removeSplitFromMETA, writeSplitToMETA, getSplit, hasReference): Added.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/util/Keying.java
    (intToBytes, getBytes): Added.
A src/contrib/hbase/src/java/org/apache/hadoop/hbase/util/Writables.java    
    Utility reading and writing Writables.


git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@564012 13f79535-47bb-0310-9956-ffa450edef68
2007-08-08 20:30:13 +00:00
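The reference-based split in the HADOOP-1662 entry above can be illustrated with a small sketch. This is not the HalfMapFile implementation: it models the parent's sorted store file as a TreeMap and hands each daughter a view of one half, so nothing is copied at split time. The class HalfMapSketch is hypothetical, and treating "top" as the keys at or above the mid key is an assumption made for the example.

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class HalfMapSketch {
    // Returns a live view (not a copy) of one half of the parent's data.
    // Assumption for this sketch: top = keys >= midKey, bottom = keys < midKey.
    static SortedMap<String, String> half(TreeMap<String, String> parent,
                                          String midKey, boolean top) {
        return top ? parent.tailMap(midKey) : parent.headMap(midKey);
    }

    public static void main(String[] args) {
        TreeMap<String, String> parent = new TreeMap<String, String>();
        parent.put("a", "1");
        parent.put("b", "2");
        parent.put("c", "3");
        parent.put("d", "4");
        // Each daughter proxies reads against its half of the parent.
        System.out.println(half(parent, "c", false).keySet());
        System.out.println(half(parent, "c", true).keySet());
    }
}
```

Because the halves are views, the split itself does no data movement, which matches the "near-instantaneous" behaviour the commit describes; the parent can only be removed once no daughter still refers to it.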
Jim Kellerman b205cac36d HADOOP-1466 Clean up warnings, visibility and javadoc issues in HBase.
Works in my environment. Since no changes were made to the code aside from white space adjustment, not testing with Hudson.

git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@562608 13f79535-47bb-0310-9956-ffa450edef68
2007-08-03 22:39:43 +00:00
Jim Kellerman 4fa87a0cbb HADOOP-1528 HClient for multiple tables - expose close table function
HTable

    * added public method close
    * added protected method checkClosed
    * make getConnection public

HConnectionManager

    * a call to getTableServers or reloadTableServers will cause information for closed
      tables to be reloaded

TestHTable

    * new test case


git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@562294 13f79535-47bb-0310-9956-ffa450edef68
2007-08-03 00:02:19 +00:00
Jim Kellerman 84ef0ba801 HADOOP-1528 HClient for multiple tables (phase 2) all HBase client side code
(except TestHClient and HBaseShell) have been converted to use the new client
side objects (HTable/HBaseAdmin/HConnection) instead of HClient.

HBaseAdmin
- Expose connection methods getMaster, isMasterRunning and listTables



git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@562041 13f79535-47bb-0310-9956-ffa450edef68
2007-08-02 08:15:26 +00:00
Jim Kellerman 2bbcc5a122 HADOOP-1528 HClient for multiple tables (phase 1)
Modified:

HConstants
static final Text[] COL_REGIONINFO_ARRAY = new Text [] {COL_REGIONINFO};
static final Text EMPTY_START_ROW = new Text();

HMaster
- don't process a region server exit message if the lease has timed
  out. Otherwise we end up with two pending server shutdown messages
  to process and chaos ensues.
- don't reassign the root region when the server's lease expires. The
  lease expiration handler will queue a PendingServerShutdown
  operation that must run before the root region is reassigned because
  the HLog of the dead server must be split before any regions served
  by the dead server are reassigned.
- added some additional debug level logging

HBaseClusterTestCase
- call HConnectionManager.deleteConnection(conf) in tearDown() so that
  multiple tests can be run from the same test class.

TestScanner2
- changes to make test compatible with the change from inner class
  HClient.RegionLocation to public class HRegionLocation

Leases
- cancelLease just returns if the lease is not found instead of
  throwing an IOException
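
A minimal sketch of the tolerant cancel behaviour described for Leases above. The class LeaseSketch, its map of expiration times, and the method signatures are assumptions made for illustration, not the actual Leases code:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lease table; not the real org.apache.hadoop.hbase.Leases class. */
public class LeaseSketch {
    // Lease name -> expiration time in milliseconds (illustrative only).
    private final Map<String, Long> expirations = new HashMap<>();

    public void createLease(String name, long expiresAtMillis) {
        expirations.put(name, expiresAtMillis);
    }

    /** Cancelling a lease that is not present is a no-op instead of an error. */
    public void cancelLease(String name) {
        expirations.remove(name);
    }

    public int size() {
        return expirations.size();
    }

    public static void main(String[] args) {
        LeaseSketch leases = new LeaseSketch();
        leases.createLease("scanner-1", 1000L);
        leases.cancelLease("scanner-1");
        leases.cancelLease("scanner-1"); // second cancel returns quietly
        System.out.println(leases.size());
    }
}
```

With this shape, callers that race with lease expiration do not need a try/catch around every cancel.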

New:

HConnection - an interface that describes the operations performed by
a connection implementation

HConnectionManager - manages connections for multiple HBase instances
and returns an object that implements HConnection from its static
method getConnection

HBaseAdmin - the HBase administrative methods refactored out of
HClient. Each HBaseAdmin object can control a single HBase
instance. To manipulate multiple instances, create multiple HBaseAdmin
objects. 

HTable - The data manipulation methods refactored out of HClient. Each
HTable object talks to a single table in a single HBase
instance. Create multiple HTable objects to use more than one table.

HRegionLocation - an inner class refactored out of HClient. Each
HRegionLocation has an HRegionInfo object and an HServerAddress
object.

HClient - totally re-implemented in terms of the new classes
above. HClient is now deprecated.



git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@561935 13f79535-47bb-0310-9956-ffa450edef68
2007-08-01 20:10:11 +00:00
Jim Kellerman a9acbeab08 HADOOP-1468 Add HBase batch update to reduce RPC overhead (restrict batches to a single row at a time)
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@560014 13f79535-47bb-0310-9956-ffa450edef68
2007-07-26 21:58:22 +00:00
Michael Stack 5b1bd1f8f2 HADOOP-1646 RegionServer OOME's under sustained, substantial loading by
10 concurrent clients


git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@559993 13f79535-47bb-0310-9956-ffa450edef68
2007-07-26 21:30:55 +00:00
Jim Kellerman 5f850fcee4 HADOOP-1516 HClient fails to readjust when ROOT or META redeployed on new region server
Detailed changes:

MiniHBaseCluster
- rewrite abortRegionServer, stopRegionServer - they now remove the
  server from the map of servers.
- rewrite waitOnRegionServer - now removes thread from map of threads

TestCleanRegionServerExit
- reduce Hadoop ipc client timeout and number of retries
- use rewritten stopRegionServer and waitOnRegionServer from MiniHBaseCluster
- add code to verify that failover worked
- moved testRegionServerAbort to separate test file

TestRegionServerAbort
- new test. Uses much the same code as TestCleanRegionServerExit but
  aborts the region server instead of shutting it down
  cleanly. Includes code to verify that failover worked.

hbase-site.xml (in src/contrib/hbase/src/test)
- reduce master lease timeout and time between lease timeout checks so
  that tests will run quicker.

HClient
- Major restructuring of code that determines what region server to
  contact for a specific region. The main method findServersForTable
  is now recursive so that it will find the meta and root regions if
  they have not already been located or will re-find them if they have
  been reassigned and the old server can no longer be contacted.
- re-ordered administrative and general purpose methods so they are no
  longer located in seemingly random order.
- re-ordered code in ClientScanner.loadRegions so that if the location
  of the region changes, it will actually try to connect to the new
  server rather than continually trying to use the connection to the
  old server.

HLog
- use HashMap<Text, SequenceFile.Writer> instead of 
  TreeMap<Text, SequenceFile.Writer> because the TreeMap would return
  a value for a key it did not have (it was the value of another
  key). I have observed this before when the key is Text, but could
  not create a simple test case that reproduced the problem.
- added some new DEBUG level logging
- removed call to rollWriter() from closeAndDelete(). We don't need to
  start a new writer if we are closing the log.

HLogKey
- cleaned up per HADOOP-1466 (I initially modified it to add some
  debug logging which was later removed, but when I was making the
  modifications I took the opportunity to clean up the file)
- changed toString() format

HMaster
- better handling of RemoteException
- modified BaseScanner
 - now knows if it is scanning the root or a meta region
 - scanRegion no longer returns a value
 - if scanning the root region, it counts the number of meta regions
   it finds and sets a new AtomicInteger, numberOfMetaRegions, when the
   scan is complete.
 - added abstract methods initialScan and maintenanceScan; this allows
   the run method to be implemented in the base class.
- boolean rootScanned is now volatile
- modified RootScanner
 - moved actual scan into private method for readability (scanRoot)
 - implementation of abstract methods just call scanRoot
- add constructor for inner static class MetaRegion
- use a BlockingQueue to queue up work for the MetaScanner
- clean up handling of an unexpected region server exit
- PendingOperation.process now returns a boolean so that HMaster.run
  can determine if the operation completed or needs to be retried later
- PendingOperation processing no longer does a wait inside the process
  method since this might cause a deadlock if the current operation is
  waiting for another operation that has yet to be processed
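
The retry behaviour described in the last two PendingOperation items can be sketched roughly as follows. This is an illustration under assumptions: the PendingOperation interface, the drain method and the plain Queue are hypothetical stand-ins, not the HMaster implementation.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class PendingOperationSketch {
    /** Hypothetical stand-in for HMaster's PendingOperation. */
    public interface PendingOperation {
        /** Returns true if the operation completed, false if it must be retried later. */
        boolean process();
    }

    /**
     * Processes queued operations without ever waiting inside process().
     * An operation that cannot complete yet is put back at the tail of the
     * queue so that the operations it depends on get a chance to run first.
     * Returns the number of process() calls made.
     */
    public static int drain(Queue<PendingOperation> queue, int maxAttempts) {
        int attempts = 0;
        while (attempts < maxAttempts) {
            PendingOperation op = queue.poll();
            if (op == null) {
                break; // nothing left to do
            }
            attempts++;
            if (!op.process()) {
                queue.add(op); // retry later instead of blocking here
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        boolean[] logSplit = {false};
        Queue<PendingOperation> queue = new ArrayDeque<>();
        // Queued first, but can only finish once the log has been split.
        queue.add(() -> logSplit[0]);
        // Queued second; completes immediately and unblocks the first.
        queue.add(() -> { logSplit[0] = true; return true; });
        System.out.println("process() calls: " + drain(queue, 10));
    }
}
```

Because the dependent operation returns false instead of waiting, the loop moves on to the operation it depends on and retries the first one afterwards.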

HMsg
- removed MSG_REGIONSERVER_STOP_IN_ARRAY, MSG_NEW_REGION
- added MSG_REPORT_SPLIT

HRegionServer
- changed reportSplit to contain old region and new regions
- use IP from default interface rather than host name
- abort calls HLog.close() instead of HLog.rollWriter()



git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@559819 13f79535-47bb-0310-9956-ffa450edef68
2007-07-26 14:15:17 +00:00
Michael Stack 43e253359a HADOOP-1637 Fix to HScanner to Support Filters, Add Filter Tests to
TestScanner2


git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@558897 13f79535-47bb-0310-9956-ffa450edef68
2007-07-23 23:33:05 +00:00
Michael Stack 377bf72458 HADOOP-1579 Add new WhileMatchRowFilter and StopRowFilter filters
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@558867 13f79535-47bb-0310-9956-ffa450edef68
2007-07-23 21:33:47 +00:00
Michael Stack 915add567f HADOOP-1606 Updated implementation of RowFilterSet, RowFilterInterface
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@558243 13f79535-47bb-0310-9956-ffa450edef68
2007-07-21 05:06:13 +00:00
Jim Kellerman bf798cca50 HADOOP-1615 Replacing thread notification-based queue with java.util.concurrent.BlockingQueue in HMaster, HRegionServer
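
A rough sketch of the general technique named in this commit: a worker thread blocks on BlockingQueue.take() instead of coordinating through wait()/notify(). The class, the STOP marker and the string messages are assumptions for illustration; the actual HMaster and HRegionServer queues are not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BlockingQueueSketch {
    // Hypothetical marker telling the worker to exit.
    private static final String STOP = "__stop__";

    /** Hands the messages to a worker thread and returns them in the order handled. */
    public static List<String> runOnce(List<String> messages) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> handled = new ArrayList<>();

        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    // take() blocks until an element is available, so no
                    // explicit synchronized/wait/notify code is needed.
                    String msg = queue.take();
                    if (STOP.equals(msg)) {
                        return;
                    }
                    handled.add(msg);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        for (String m : messages) {
            queue.add(m); // unbounded queue, so add() does not block
        }
        queue.add(STOP);

        try {
            worker.join(); // join makes the worker's writes visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(runOnce(Arrays.asList("open region", "close region")));
    }
}
```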
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@557118 13f79535-47bb-0310-9956-ffa450edef68
2007-07-18 02:26:03 +00:00
Michael Stack 6a64ae1542 HADOOP-1616 [hbase] Sporadic TestTable failures
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@557098 13f79535-47bb-0310-9956-ffa450edef68
2007-07-18 00:38:58 +00:00
Jim Kellerman cbb844d5f0 HADOOP-1468 Add HBase batch update to reduce RPC overhead
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@556754 13f79535-47bb-0310-9956-ffa450edef68
2007-07-16 22:19:59 +00:00
Jim Kellerman 5463de47b3 HADOOP-1614 [hbase] HClient does not protect itself from simultaneous updates
git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@556359 13f79535-47bb-0310-9956-ffa450edef68
2007-07-15 00:26:15 +00:00