HBase Change Log
Trunk (unreleased changes)
INCOMPATIBLE CHANGES
NEW FEATURES
HADOOP-2061 Add new Base64 dialects
OPTIMIZATIONS
BUG FIXES
HADOOP-2059 In tests, exceptions in mini dfs shutdown should not fail test
(e.g. nightly #272)
IMPROVEMENTS
HADOOP-2401 Add convenience put method that takes writable
(Johan Oskarsson via Stack)
Branch 0.15 (unreleased changes)
INCOMPATIBLE CHANGES
HADOOP-1931 Hbase scripts take --ARG=ARG_VALUE when should be like hadoop
and do --ARG ARG_VALUE
NEW FEATURES
HADOOP-1768 FS command using Hadoop FsShell operations
(Edward Yoon via Stack)
HADOOP-1784 Delete: Fix scanners and gets so they work properly in presence
of deletes. Added a deleteAll to remove all cells equal to or
older than passed timestamp. Fixed compaction so deleted cells
do not make it out into compacted output. Ensure also that
versions > column max are dropped compacting.
HADOOP-1720 Addition of HQL (Hbase Query Language) support in Hbase Shell.
The old shell syntax has been replaced by HQL, a small SQL-like
set of operators, for creating, altering, dropping, inserting,
deleting, and selecting, etc., data in hbase.
(Inchul Song and Edward Yoon via Stack)
HADOOP-1913 Build a Lucene index on an HBase table
(Ning Li via Stack)
HADOOP-1957 Web UI with report on cluster state and basic browsing of tables
OPTIMIZATIONS
BUG FIXES
HADOOP-1527 Region server won't start because logdir exists
HADOOP-1723 If master asks region server to shut down, by-pass return of
shutdown message
HADOOP-1729 Recent renaming of META tables breaks hbase shell
HADOOP-1730 unexpected null value causes META scanner to exit (silently)
HADOOP-1747 On a cluster, on restart, regions multiply assigned
HADOOP-1776 Fix for sporadic compaction failures closing and moving
compaction result
HADOOP-1780 Regions are still being doubly assigned
HADOOP-1797 Fix NPEs in MetaScanner constructor
HADOOP-1799 Incorrect classpath in binary version of Hadoop
HADOOP-1805 Region server hang on exit
HADOOP-1785 TableInputFormat.TableRecordReader.next has a bug
(Ning Li via Stack)
HADOOP-1800 output should default to utf8 encoding
HADOOP-1801 When hdfs is yanked out from under hbase, hbase should go down gracefully
HADOOP-1813 OOME makes zombie of region server
HADOOP-1814 TestCleanRegionServerExit fails too often on Hudson
HADOOP-1820 Regionserver creates hlogs without bound
(reverted 2007/09/25) (Fixed 2007/09/30)
HADOOP-1821 Replace all String.getBytes() with String.getBytes("UTF-8")
HADOOP-1832 listTables() returns duplicate tables
HADOOP-1834 Scanners ignore timestamp passed on creation
HADOOP-1847 Many HBase tests do not fail well.
HADOOP-1847 Many HBase tests do not fail well. (phase 2)
HADOOP-1870 Once file system failure has been detected, don't check it again
and get on with shutting down the hbase cluster.
HADOOP-1888 NullPointerException in HMemcacheScanner (reprise)
HADOOP-1903 Possible data loss if Exception happens between snapshot and
flush to disk.
HADOOP-1920 Wrapper scripts broken when hadoop in one location and hbase in
another
HADOOP-1923, HADOOP-1924 a) tests fail sporadically because set up and tear
down is inconsistent b) TestDFSAbort failed in nightly #242
HADOOP-1929 Add hbase-default.xml to hbase jar
HADOOP-1941 StopRowFilter throws NPE when passed null row
HADOOP-1966 Make HBase unit tests more reliable in the Hudson environment.
HADOOP-1975 HBase tests failing with java.lang.NumberFormatException
HADOOP-1990 Regression test instability affects nightly and patch builds
HADOOP-1996 TestHStoreFile fails on windows if run multiple times
HADOOP-1937 When the master times out a region server's lease, it is too
aggressive in reclaiming the server's log.
HADOOP-2004 webapp hql formatting bugs
HADOOP-2011 Make hbase daemon scripts take args in same order as hadoop
daemon scripts
HADOOP-2017 TestRegionServerAbort failure in patch build #903 and
nightly #266
HADOOP-2029 TestLogRolling fails too often in patch and nightlies
HADOOP-2038 TestCleanRegionExit failed in patch build #927
IMPROVEMENTS
HADOOP-1737 Make HColumnDescriptor data members publicly settable
HADOOP-1746 Clean up findbugs warnings
HADOOP-1757 Bloomfilters: single argument constructor, use enum for bloom
filter types
HADOOP-1760 Use new MapWritable and SortedMapWritable classes from
org.apache.hadoop.io
HADOOP-1793 (Phase 1) Remove TestHClient (Phase2) remove HClient.
HADOOP-1794 Remove deprecated APIs
HADOOP-1802 Startup scripts should wait until hdfs has cleared 'safe mode'
HADOOP-1833 bin/stop_hbase.sh returns before it completes
(Izaak Rubin via Stack)
HADOOP-1835 Updated Documentation for HBase setup/installation
(Izaak Rubin via Stack)
HADOOP-1868 Make default configuration more responsive
HADOOP-1884 Remove useless debugging log messages from hbase.mapred
HADOOP-1856 Add Jar command to hbase shell using Hadoop RunJar util
(Edward Yoon via Stack)
HADOOP-1928 Have master pass the regionserver the filesystem to use
HADOOP-1789 Output formatting
HADOOP-1960 If a region server cannot talk to the master before its lease
times out, it should shut itself down
HADOOP-2035 Add logo to webapps
Below is the list of changes before 2007-08-18
1. HADOOP-1384. HBase omnibus patch. (jimk, Vuk Ercegovac, and Michael Stack)
2. HADOOP-1402. Fix javadoc warnings in hbase contrib. (Michael Stack)
3. HADOOP-1404. HBase command-line shutdown failing (Michael Stack)
4. HADOOP-1397. Replace custom hbase locking with
java.util.concurrent.locks.ReentrantLock (Michael Stack)
5. HADOOP-1403. HBase reliability - make master and region server more fault
tolerant.
6. HADOOP-1418. HBase miscellaneous: unit test for HClient, client to do
'Performance Evaluation', etc.
7. HADOOP-1420, HADOOP-1423. Findbugs changes, remove reference to removed
class HLocking.
8. HADOOP-1424. TestHBaseCluster fails with IllegalMonitorStateException. Fix
regression introduced by HADOOP-1397.
9. HADOOP-1426. Make hbase scripts executable + add test classes to CLASSPATH.
10. HADOOP-1430. HBase shutdown leaves regionservers up.
11. HADOOP-1392. Part1: includes create/delete table; enable/disable table;
add/remove column.
12. HADOOP-1392. Part2: includes table compaction by merging adjacent regions
that have shrunk in size.
13. HADOOP-1445 Support updates across region splits and compactions
14. HADOOP-1460 On shutdown IOException with complaint 'Cannot cancel lease
that is not held'
15. HADOOP-1421 Failover detection, split log files.
For the files modified, also clean up javadoc, class, field and method
visibility (HADOOP-1466)
16. HADOOP-1479 Fix NPE in HStore#get if store file only has keys < passed key.
17. HADOOP-1476 Distributed version of 'Performance Evaluation' script
18. HADOOP-1469 Asynchronous table creation
19. HADOOP-1415 Integrate BSD licensed bloom filter implementation.
20. HADOOP-1465 Add cluster stop/start scripts for hbase
21. HADOOP-1415 Provide configurable per-column bloom filters - part 2.
22. HADOOP-1498. Replace boxed types with primitives in many places.
23. HADOOP-1509. Made methods/inner classes in HRegionServer and HClient protected
instead of private for easier extension. Also made HRegion and HRegionInfo public too.
Added an hbase-default.xml property for specifying what HRegionInterface extension to use
for proxy server connection. (James Kennedy via Jim Kellerman)
24. HADOOP-1534. [hbase] Memcache scanner fails if start key not present
25. HADOOP-1537. Catch exceptions in testCleanRegionServerExit so we can see
what is failing.
26. HADOOP-1543 [hbase] Add HClient.tableExists
27. HADOOP-1519 [hbase] map/reduce interface for HBase. (Vuk Ercegovac and
Jim Kellerman)
28. HADOOP-1523 Hung region server waiting on write locks
29. HADOOP-1560 NPE in MiniHBaseCluster on Windows
30. HADOOP-1531 Add RowFilter to HRegion.HScanner
Adds a row filtering interface and two implementations: A page scanner,
and a regex row/column-data matcher. (James Kennedy via Stack)
31. HADOOP-1566 Key-making utility
32. HADOOP-1415 Provide configurable per-column bloom filters.
HADOOP-1466 Clean up visibility and javadoc issues in HBase.
33. HADOOP-1538 Provide capability for client specified time stamps in HBase
HADOOP-1466 Clean up visibility and javadoc issues in HBase.
34. HADOOP-1589 Exception handling in HBase is broken over client server connections
35. HADOOP-1375 a simple parser for hbase (Edward Yoon via Stack)
36. HADOOP-1600 Update license in HBase code
37. HADOOP-1589 Exception handling in HBase is broken over client server
38. HADOOP-1574 Concurrent creates of a table named 'X' all succeed
39. HADOOP-1581 Un-openable tablename bug
40. HADOOP-1607 [shell] Clear screen command (Edward Yoon via Stack)
41. HADOOP-1614 [hbase] HClient does not protect itself from simultaneous updates
42. HADOOP-1468 Add HBase batch update to reduce RPC overhead
43. HADOOP-1616 Sporadic TestTable failures
44. HADOOP-1615 Replacing thread notification-based queue with
java.util.concurrent.BlockingQueue in HMaster, HRegionServer
45. HADOOP-1606 Updated implementation of RowFilterSet, RowFilterInterface
(Izaak Rubin via Stack)
46. HADOOP-1579 Add new WhileMatchRowFilter and StopRowFilter filters
(Izaak Rubin via Stack)
47. HADOOP-1637 Fix to HScanner to Support Filters, Add Filter Tests to
TestScanner2 (Izaak Rubin via Stack)
48. HADOOP-1516 HClient fails to readjust when ROOT or META redeployed on new
region server
49. HADOOP-1646 RegionServer OOME's under sustained, substantial loading by
10 concurrent clients
50. HADOOP-1468 Add HBase batch update to reduce RPC overhead (restrict batches
to a single row at a time)
51. HADOOP-1528 HClient for multiple tables (phase 1) (James Kennedy & JimK)
52. HADOOP-1528 HClient for multiple tables (phase 2) all HBase client side code
(except TestHClient and HBaseShell) have been converted to use the new client
side objects (HTable/HBaseAdmin/HConnection) instead of HClient.
53. HADOOP-1528 HClient for multiple tables - expose close table function
54. HADOOP-1466 Clean up warnings, visibility and javadoc issues in HBase.
55. HADOOP-1662 Make region splits faster
56. HADOOP-1678 On region split, master should designate which host should
serve daughter splits. Phase 1: Master balances load for new regions and
when a region server fails.
57. HADOOP-1678 On region split, master should designate which host should
serve daughter splits. Phase 2: Master assigns children of split region
instead of HRegionServer serving both children.
58. HADOOP-1710 All updates should be batch updates
59. HADOOP-1711 HTable API should use interfaces instead of concrete classes as
method parameters and return values
60. HADOOP-1644 Compactions should not block updates
60. HADOOP-1672 HBase Shell should use new client classes
(Edward Yoon via Stack).
61. HADOOP-1709 Make HRegionInterface more like that of HTable
HADOOP-1725 Client find of table regions should not include offlined, split parents