HBase Change Log
Trunk (unreleased changes)
INCOMPATIBLE CHANGES
NEW FEATURES
HADOOP-1768 FS command using Hadoop FsShell operations
(Edward Yoon via Stack)
OPTIMIZATIONS
BUG FIXES
HADOOP-1527 Region server won't start because logdir exists
HADOOP-1723 If master asks region server to shut down, by-pass return of
shutdown message
HADOOP-1729 Recent renaming of META tables breaks hbase shell
HADOOP-1730 unexpected null value causes META scanner to exit (silently)
HADOOP-1747 On a cluster, on restart, regions multiply assigned
HADOOP-1776 Fix for sporadic compaction failures closing and moving
compaction result
HADOOP-1780 Regions are still being doubly assigned
HADOOP-1797 Fix NPEs in MetaScanner constructor
HADOOP-1799 Incorrect classpath in binary version of Hadoop
HADOOP-1805 Region server hang on exit
HADOOP-1785 TableInputFormat.TableRecordReader.next has a bug
(Ning Li via Stack)
HADOOP-1800 output should default to utf8 encoding
HADOOP-1814 TestCleanRegionServerExit fails too often on Hudson
IMPROVEMENTS
HADOOP-1737 Make HColumnDescriptor data members publicly settable
HADOOP-1746 Clean up findbugs warnings
HADOOP-1757 Bloomfilters: single argument constructor, use enum for bloom
filter types
HADOOP-1760 Use new MapWritable and SortedMapWritable classes from
org.apache.hadoop.io
HADOOP-1802 Startup scripts should wait until hdfs has cleared 'safe mode'
Below is the list of changes before 2007-08-18
1. HADOOP-1384. HBase omnibus patch. (jimk, Vuk Ercegovac, and Michael Stack)
2. HADOOP-1402. Fix javadoc warnings in hbase contrib. (Michael Stack)
3. HADOOP-1404. HBase command-line shutdown failing (Michael Stack)
4. HADOOP-1397. Replace custom hbase locking with
java.util.concurrent.locks.ReentrantLock (Michael Stack)
5. HADOOP-1403. HBase reliability - make master and region server more fault
tolerant.
6. HADOOP-1418. HBase miscellaneous: unit test for HClient, client to do
'Performance Evaluation', etc.
7. HADOOP-1420, HADOOP-1423. Findbugs changes, remove reference to removed
class HLocking.
8. HADOOP-1424. TestHBaseCluster fails with IllegalMonitorStateException. Fix
regression introduced by HADOOP-1397.
9. HADOOP-1426. Make hbase scripts executable + add test classes to CLASSPATH.
10. HADOOP-1430. HBase shutdown leaves regionservers up.
11. HADOOP-1392. Part1: includes create/delete table; enable/disable table;
add/remove column.
12. HADOOP-1392. Part2: includes table compaction by merging adjacent regions
that have shrunk in size.
13. HADOOP-1445 Support updates across region splits and compactions
14. HADOOP-1460 On shutdown IOException with complaint 'Cannot cancel lease
that is not held'
15. HADOOP-1421 Failover detection, split log files.
For the files modified, also clean up javadoc, class, field and method
visibility (HADOOP-1466)
16. HADOOP-1479 Fix NPE in HStore#get if store file only has keys < passed key.
17. HADOOP-1476 Distributed version of 'Performance Evaluation' script
18. HADOOP-1469 Asynchronous table creation
19. HADOOP-1415 Integrate BSD licensed bloom filter implementation.
20. HADOOP-1465 Add cluster stop/start scripts for hbase
21. HADOOP-1415 Provide configurable per-column bloom filters - part 2.
22. HADOOP-1498. Replace boxed types with primitives in many places.
23. HADOOP-1509. Made methods/inner classes in HRegionServer and HClient protected
instead of private for easier extension. Also made HRegion and HRegionInfo public too.
Added an hbase-default.xml property for specifying what HRegionInterface extension to use
for proxy server connection. (James Kennedy via Jim Kellerman)
24. HADOOP-1534. [hbase] Memcache scanner fails if start key not present
25. HADOOP-1537. Catch exceptions in testCleanRegionServerExit so we can see
what is failing.
26. HADOOP-1543 [hbase] Add HClient.tableExists
27. HADOOP-1519 [hbase] map/reduce interface for HBase. (Vuk Ercegovac and
Jim Kellerman)
28. HADOOP-1523 Hung region server waiting on write locks
29. HADOOP-1560 NPE in MiniHBaseCluster on Windows
30. HADOOP-1531 Add RowFilter to HRegion.HScanner
Adds a row filtering interface and two implementations: A page scanner,
and a regex row/column-data matcher. (James Kennedy via Stack)
31. HADOOP-1566 Key-making utility
32. HADOOP-1415 Provide configurable per-column bloom filters.
HADOOP-1466 Clean up visibility and javadoc issues in HBase.
33. HADOOP-1538 Provide capability for client specified time stamps in HBase
HADOOP-1466 Clean up visibility and javadoc issues in HBase.
34. HADOOP-1589 Exception handling in HBase is broken over client server connections
35. HADOOP-1375 a simple parser for hbase (Edward Yoon via Stack)
36. HADOOP-1600 Update license in HBase code
37. HADOOP-1589 Exception handling in HBase is broken over client server
connections
38. HADOOP-1574 Concurrent creates of a table named 'X' all succeed
39. HADOOP-1581 Un-openable tablename bug
40. HADOOP-1607 [shell] Clear screen command (Edward Yoon via Stack)
41. HADOOP-1614 [hbase] HClient does not protect itself from simultaneous updates
42. HADOOP-1468 Add HBase batch update to reduce RPC overhead
43. HADOOP-1616 Sporadic TestTable failures
44. HADOOP-1615 Replacing thread notification-based queue with
java.util.concurrent.BlockingQueue in HMaster, HRegionServer
45. HADOOP-1606 Updated implementation of RowFilterSet, RowFilterInterface
(Izaak Rubin via Stack)
46. HADOOP-1579 Add new WhileMatchRowFilter and StopRowFilter filters
(Izaak Rubin via Stack)
47. HADOOP-1637 Fix to HScanner to Support Filters, Add Filter Tests to
TestScanner2 (Izaak Rubin via Stack)
48. HADOOP-1516 HClient fails to readjust when ROOT or META redeployed on new
region server
49. HADOOP-1646 RegionServer OOME's under sustained, substantial loading by
10 concurrent clients
50. HADOOP-1468 Add HBase batch update to reduce RPC overhead (restrict batches
to a single row at a time)
51. HADOOP-1528 HClient for multiple tables (phase 1) (James Kennedy & JimK)
52. HADOOP-1528 HClient for multiple tables (phase 2) all HBase client side code
(except TestHClient and HBaseShell) have been converted to use the new client
side objects (HTable/HBaseAdmin/HConnection) instead of HClient.
53. HADOOP-1528 HClient for multiple tables - expose close table function
54. HADOOP-1466 Clean up warnings, visibility and javadoc issues in HBase.
55. HADOOP-1662 Make region splits faster
56. HADOOP-1678 On region split, master should designate which host should
serve daughter splits. Phase 1: Master balances load for new regions and
when a region server fails.
57. HADOOP-1678 On region split, master should designate which host should
serve daughter splits. Phase 2: Master assigns children of split region
instead of HRegionServer serving both children.
58. HADOOP-1710 All updates should be batch updates
59. HADOOP-1711 HTable API should use interfaces instead of concrete classes as
method parameters and return values
60. HADOOP-1644 Compactions should not block updates
60. HADOOP-1672 HBase Shell should use new client classes
(Edward Yoon via Stack).
61. HADOOP-1709 Make HRegionInterface more like that of HTable
HADOOP-1725 Client find of table regions should not include offlined, split parents