HADOOP-1516 HClient fails to readjust when ROOT or META redeployed on new region server

Detailed changes:

MiniHBaseCluster
- rewrite abortRegionServer, stopRegionServer - they now remove the
  server from the map of servers.
- rewrite waitOnRegionServer - now removes thread from map of threads

TestCleanRegionServerExit
- reduce Hadoop ipc client timeout and number of retries
- use rewritten stopRegionServer and waitOnRegionServer from MiniHBaseCluster
- add code to verify that failover worked
- moved testRegionServerAbort to separate test file

TestRegionServerAbort
- new test. Uses much the same code as TestCleanRegionServerExit but
  aborts the region server instead of shutting it down
  cleanly. Includes code to verify that failover worked.

hbase-site.xml (in src/contrib/hbase/src/test)
- reduce master lease timeout and time between lease timeout checks so
  that tests will run quicker.

HClient
- Major restructing of code that determines what region server to
  contact for a specific region. The main method findServersForTable
  is now recursive so that it will find the meta and root regions if
  they have not already been located or will re-find them if they have
  been reassigned and the old server can no longer be contacted.
- re-ordered administrative and general purpose methods so they are no
  longer located in seemingly random order.
- re-ordered code in ClientScanner.loadRegions so that if the location
  of the region changes, it will actually try to connect to the new
  server rather than continually trying to use the connection to the
  old server.

HLog
- use HashMap<Text, SequenceFile.Writer> instead of 
  TreeMap<Text, SequenceFile.Writer> because the TreeMap would return
  a value for a key it did not have (it was the value of another
  key). I have observed this before when the key is Text, but could
  not create a simple test case that reproduced the problem.
- added some new DEBUG level logging
- removed call to rollWriter() from closeAndDelete(). We don't need to
  start a new writer if we are closing the log.

HLogKey
- cleaned up per HADOOP-1466 (I initially modified it to add some
  debug logging which was later removed, but when I was making the
  modifications I took the opportunity to clean up the file)
- changed toString() format

HMaster
- better handling of RemoteException
- modified BaseScanner
 - now knows if it is scanning the root or a meta region
 - scanRegion no longer returns a value
 - if scanning the root region, it counts the number of meta regions
   it finds and sets a new AtomicInteger, numberOfMetaRegions when the
   scan is complete.
 - added abstract methods initialScan and maintenanceScan this allowed
   run method to be implemented in the base class.
- boolean rootScanned is now volatile
- modified RootScanner
 - moved actual scan into private method for readability (scanRoot)
 - implementation of abstract methods just call scanRoot
- add constructor for inner static class MetaRegion
- use a BlockingQueue to queue up work for the MetaScanner
- clean up handling of an unexpected region server exit
- PendingOperation.process now returns a boolean so that HMaster.run
  can determine if the operation completed or needs to be retried later
- PendingOperation processing no longer does a wait inside the process
  method since this might cause a deadlock if the current operation is
  waiting for another operation that has yet to be processed

HMsg
- removed MSG_REGIONSERVER_STOP_IN_ARRAY, MSG_NEW_REGION
- added MSG_REPORT_SPLIT

HRegionServer
- changed reportSplit to contain old region and new regions
- use IP from default interface rather than host name
- abort calls HLog.close() instead of HLog.rollWriter()



git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@559819 13f79535-47bb-0310-9956-ffa450edef68
This commit is contained in:
Jim Kellerman 2007-07-26 14:15:17 +00:00
parent 43e253359a
commit 5f850fcee4
14 changed files with 1346 additions and 1074 deletions

View File

@ -73,3 +73,5 @@ Trunk (unreleased changes)
(Izaak Rubin via Stack) (Izaak Rubin via Stack)
47. HADOOP-1637 Fix to HScanner to Support Filters, Add Filter Tests to 47. HADOOP-1637 Fix to HScanner to Support Filters, Add Filter Tests to
TestScanner2 (Izaak Rubin via Stack) TestScanner2 (Izaak Rubin via Stack)
48. HADOOP-1516 HClient fails to readjust when ROOT or META redeployed on new
region server

File diff suppressed because it is too large Load Diff

View File

@ -97,12 +97,11 @@ public class HLog implements HConstants {
static void splitLog(Path rootDir, Path srcDir, FileSystem fs, static void splitLog(Path rootDir, Path srcDir, FileSystem fs,
Configuration conf) throws IOException { Configuration conf) throws IOException {
Path logfiles[] = fs.listPaths(srcDir); Path logfiles[] = fs.listPaths(srcDir);
if(LOG.isDebugEnabled()) { LOG.info("splitting " + logfiles.length + " log files in " +
LOG.debug("splitting " + logfiles.length + " log files in " +
srcDir.toString()); srcDir.toString());
}
TreeMap<Text, SequenceFile.Writer> logWriters = HashMap<Text, SequenceFile.Writer> logWriters =
new TreeMap<Text, SequenceFile.Writer>(); new HashMap<Text, SequenceFile.Writer>();
try { try {
for(int i = 0; i < logfiles.length; i++) { for(int i = 0; i < logfiles.length; i++) {
SequenceFile.Reader in = SequenceFile.Reader in =
@ -116,6 +115,9 @@ public class HLog implements HConstants {
if (w == null) { if (w == null) {
Path logfile = new Path(HStoreFile.getHRegionDir(rootDir, Path logfile = new Path(HStoreFile.getHRegionDir(rootDir,
regionName), HREGION_OLDLOGFILE_NAME); regionName), HREGION_OLDLOGFILE_NAME);
if (LOG.isDebugEnabled()) {
LOG.debug("getting new log file writer for path " + logfile);
}
w = SequenceFile.createWriter(fs, conf, logfile, HLogKey.class, w = SequenceFile.createWriter(fs, conf, logfile, HLogKey.class,
HLogEdit.class); HLogEdit.class);
logWriters.put(regionName, w); logWriters.put(regionName, w);
@ -140,9 +142,7 @@ public class HLog implements HConstants {
} }
} }
} }
if(LOG.isDebugEnabled()) { LOG.info("log file splitting completed for " + srcDir.toString());
LOG.debug("log file splitting completed for " + srcDir.toString());
}
} }
/** /**
@ -205,8 +205,6 @@ public class HLog implements HConstants {
// continue; // continue;
} }
} }
// Close the current writer (if any), and grab a new one. // Close the current writer (if any), and grab a new one.
if(writer != null) { if(writer != null) {
@ -280,7 +278,6 @@ public class HLog implements HConstants {
* @throws IOException * @throws IOException
*/ */
synchronized void closeAndDelete() throws IOException { synchronized void closeAndDelete() throws IOException {
rollWriter();
close(); close();
fs.delete(dir); fs.delete(dir);
} }

View File

@ -23,13 +23,13 @@ import org.apache.hadoop.io.*;
import java.io.*; import java.io.*;
/******************************************************************************* /**
* A Key for an entry in the change log. * A Key for an entry in the change log.
* *
* The log intermingles edits to many tables and rows, so each log entry * The log intermingles edits to many tables and rows, so each log entry
* identifies the appropriate table and row. Within a table and row, they're * identifies the appropriate table and row. Within a table and row, they're
* also sorted. * also sorted.
******************************************************************************/ */
public class HLogKey implements WritableComparable { public class HLogKey implements WritableComparable {
Text regionName = new Text(); Text regionName = new Text();
Text tablename = new Text(); Text tablename = new Text();
@ -79,16 +79,25 @@ public class HLogKey implements WritableComparable {
return logSeqNum; return logSeqNum;
} }
/**
* {@inheritDoc}
*/
@Override @Override
public String toString() { public String toString() {
return tablename + " " + regionName + " " + row + " " + logSeqNum; return tablename + "," + regionName + "," + row + "," + logSeqNum;
} }
/**
* {@inheritDoc}
*/
@Override @Override
public boolean equals(Object obj) { public boolean equals(Object obj) {
return compareTo(obj) == 0; return compareTo(obj) == 0;
} }
/**
* {@inheritDoc}
*/
@Override @Override
public int hashCode() { public int hashCode() {
int result = this.regionName.hashCode(); int result = this.regionName.hashCode();
@ -97,12 +106,12 @@ public class HLogKey implements WritableComparable {
return result; return result;
} }
////////////////////////////////////////////////////////////////////////////// //
// Comparable // Comparable
////////////////////////////////////////////////////////////////////////////// //
/* (non-Javadoc) /**
* @see java.lang.Comparable#compareTo(java.lang.Object) * {@inheritDoc}
*/ */
public int compareTo(Object o) { public int compareTo(Object o) {
HLogKey other = (HLogKey) o; HLogKey other = (HLogKey) o;
@ -124,12 +133,12 @@ public class HLogKey implements WritableComparable {
return result; return result;
} }
////////////////////////////////////////////////////////////////////////////// //
// Writable // Writable
////////////////////////////////////////////////////////////////////////////// //
/* (non-Javadoc) /**
* @see org.apache.hadoop.io.Writable#write(java.io.DataOutput) * {@inheritDoc}
*/ */
public void write(DataOutput out) throws IOException { public void write(DataOutput out) throws IOException {
this.regionName.write(out); this.regionName.write(out);
@ -138,8 +147,8 @@ public class HLogKey implements WritableComparable {
out.writeLong(logSeqNum); out.writeLong(logSeqNum);
} }
/* (non-Javadoc) /**
* @see org.apache.hadoop.io.Writable#readFields(java.io.DataInput) * {@inheritDoc}
*/ */
public void readFields(DataInput in) throws IOException { public void readFields(DataInput in) throws IOException {
this.regionName.readFields(in); this.regionName.readFields(in);

File diff suppressed because it is too large Load Diff

View File

@ -42,10 +42,7 @@ public class HMsg implements Writable {
/** Master tells region server to stop */ /** Master tells region server to stop */
public static final byte MSG_REGIONSERVER_STOP = 5; public static final byte MSG_REGIONSERVER_STOP = 5;
public static final HMsg [] MSG_REGIONSERVER_STOP_IN_ARRAY =
{new HMsg(HMsg.MSG_REGIONSERVER_STOP)};
/** Stop serving the specified region and don't report back that it's closed */ /** Stop serving the specified region and don't report back that it's closed */
public static final byte MSG_REGION_CLOSE_WITHOUT_REPORT = 6; public static final byte MSG_REGION_CLOSE_WITHOUT_REPORT = 6;
@ -57,10 +54,20 @@ public class HMsg implements Writable {
/** region server is no longer serving the specified region */ /** region server is no longer serving the specified region */
public static final byte MSG_REPORT_CLOSE = 101; public static final byte MSG_REPORT_CLOSE = 101;
/** region server is now serving a region produced by a region split */ /**
public static final byte MSG_NEW_REGION = 103; * region server split the region associated with this message.
*
* note that this message is immediately followed by two MSG_REPORT_OPEN
* messages, one for each of the new regions resulting from the split
*/
public static final byte MSG_REPORT_SPLIT = 103;
/** region server is shutting down */ /**
* region server is shutting down
*
* note that this message is followed by MSG_REPORT_CLOSE messages for each
* region the region server was serving.
*/
public static final byte MSG_REPORT_EXITING = 104; public static final byte MSG_REPORT_EXITING = 104;
byte msg; byte msg;
@ -108,6 +115,9 @@ public class HMsg implements Writable {
return info; return info;
} }
/**
* {@inheritDoc}
*/
@Override @Override
public String toString() { public String toString() {
StringBuilder message = new StringBuilder(); StringBuilder message = new StringBuilder();
@ -140,8 +150,8 @@ public class HMsg implements Writable {
message.append("MSG_REPORT_CLOSE : "); message.append("MSG_REPORT_CLOSE : ");
break; break;
case MSG_NEW_REGION: case MSG_REPORT_SPLIT:
message.append("MSG_NEW_REGION : "); message.append("MSG_REGION_SPLIT : ");
break; break;
case MSG_REPORT_EXITING: case MSG_REPORT_EXITING:
@ -162,16 +172,16 @@ public class HMsg implements Writable {
// Writable // Writable
////////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////////
/* (non-Javadoc) /**
* @see org.apache.hadoop.io.Writable#write(java.io.DataOutput) * {@inheritDoc}
*/ */
public void write(DataOutput out) throws IOException { public void write(DataOutput out) throws IOException {
out.writeByte(msg); out.writeByte(msg);
info.write(out); info.write(out);
} }
/* (non-Javadoc) /**
* @see org.apache.hadoop.io.Writable#readFields(java.io.DataInput) * {@inheritDoc}
*/ */
public void readFields(DataInput in) throws IOException { public void readFields(DataInput in) throws IOException {
this.msg = in.readByte(); this.msg = in.readByte();

View File

@ -407,6 +407,23 @@ public class HRegion implements HConstants {
* @throws IOException * @throws IOException
*/ */
public Vector<HStoreFile> close() throws IOException { public Vector<HStoreFile> close() throws IOException {
return close(false);
}
/**
* Close down this HRegion. Flush the cache unless abort parameter is true,
* Shut down each HStore, don't service any more calls.
*
* This method could take some time to execute, so don't call it from a
* time-sensitive thread.
*
* @param abort true if server is aborting (only during testing)
* @return Vector of all the storage files that the HRegion's component
* HStores make use of. It's a list of HStoreFile objects.
*
* @throws IOException
*/
Vector<HStoreFile> close(boolean abort) throws IOException {
lock.obtainWriteLock(); lock.obtainWriteLock();
try { try {
boolean shouldClose = false; boolean shouldClose = false;
@ -430,7 +447,11 @@ public class HRegion implements HConstants {
return null; return null;
} }
LOG.info("closing region " + this.regionInfo.regionName); LOG.info("closing region " + this.regionInfo.regionName);
Vector<HStoreFile> allHStoreFiles = internalFlushcache(); Vector<HStoreFile> allHStoreFiles = null;
if (!abort) {
// Don't flush the cache if we are aborting during a test.
allHStoreFiles = internalFlushcache();
}
for (HStore store: stores.values()) { for (HStore store: stores.values()) {
store.close(); store.close();
} }

View File

@ -194,7 +194,7 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
private void split(final HRegion region, final Text midKey) private void split(final HRegion region, final Text midKey)
throws IOException { throws IOException {
final Text oldRegion = region.getRegionName(); final HRegionInfo oldRegionInfo = region.getRegionInfo();
final HRegion[] newRegions = region.closeAndSplit(midKey, this); final HRegion[] newRegions = region.closeAndSplit(midKey, this);
// When a region is split, the META table needs to updated if we're // When a region is split, the META table needs to updated if we're
@ -204,9 +204,7 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
final Text tableToUpdate = final Text tableToUpdate =
region.getRegionInfo().tableDesc.getName().equals(META_TABLE_NAME) ? region.getRegionInfo().tableDesc.getName().equals(META_TABLE_NAME) ?
ROOT_TABLE_NAME : META_TABLE_NAME; ROOT_TABLE_NAME : META_TABLE_NAME;
if(LOG.isDebugEnabled()) { LOG.info("Updating " + tableToUpdate + " with region split info");
LOG.debug("Updating " + tableToUpdate + " with region split info");
}
// Remove old region from META // Remove old region from META
@ -249,11 +247,11 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
if (LOG.isDebugEnabled()) { if (LOG.isDebugEnabled()) {
LOG.debug("Reporting region split to master"); LOG.debug("Reporting region split to master");
} }
reportSplit(newRegions[0].getRegionInfo(), newRegions[1]. reportSplit(oldRegionInfo, newRegions[0].getRegionInfo(),
getRegionInfo()); newRegions[1].getRegionInfo());
LOG.info("region split, META update, and report to master all" + LOG.info("region split, META update, and report to master all" +
" successful. Old region=" + oldRegion + ", new regions: " + " successful. Old region=" + oldRegionInfo.getRegionName() +
newRegions[0].getRegionName() + ", " + ", new regions: " + newRegions[0].getRegionName() + ", " +
newRegions[1].getRegionName()); newRegions[1].getRegionName());
// Finally, start serving the new regions // Finally, start serving the new regions
@ -262,6 +260,7 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
try { try {
onlineRegions.put(newRegions[0].getRegionName(), newRegions[0]); onlineRegions.put(newRegions[0].getRegionName(), newRegions[0]);
onlineRegions.put(newRegions[1].getRegionName(), newRegions[1]); onlineRegions.put(newRegions[1].getRegionName(), newRegions[1]);
} finally { } finally {
lock.writeLock().unlock(); lock.writeLock().unlock();
} }
@ -461,23 +460,19 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
address.getPort(), conf.getInt("hbase.regionserver.handler.count", 10), address.getPort(), conf.getInt("hbase.regionserver.handler.count", 10),
false, conf); false, conf);
// Use configured nameserver & interface to get local hostname. // Use interface to get the 'real' IP for this host.
// 'serverInfo' is sent to master. Should have name of this host rather than // 'serverInfo' is sent to master. Should have the real IP of this host
// 'localhost' or 0.0.0.0 or 127.0.0.1 in it. // rather than 'localhost' or 0.0.0.0 or 127.0.0.1 in it.
String localHostname = DNS.getDefaultHost( String realIP = DNS.getDefaultIP(
conf.get("dfs.datanode.dns.interface","default"), conf.get("dfs.datanode.dns.interface","default"));
conf.get("dfs.datanode.dns.nameserver","default"));
InetSocketAddress hostnameAddress = new InetSocketAddress(localHostname,
server.getListenerAddress().getPort());
this.serverInfo = new HServerInfo(new HServerAddress(hostnameAddress),
this.rand.nextLong());
// Local file paths this.serverInfo = new HServerInfo(new HServerAddress(
String serverName = localHostname + "_" + new InetSocketAddress(realIP, server.getListenerAddress().getPort())),
this.serverInfo.getServerAddress().getPort(); this.rand.nextLong());
Path logdir = new Path(rootDir, "log" + "_" + realIP + "_" +
this.serverInfo.getServerAddress().getPort());
Path logdir = new Path(rootDir, "log" + "_" + serverName);
// Logging // Logging
this.fs = FileSystem.get(conf); this.fs = FileSystem.get(conf);
if(fs.exists(logdir)) { if(fs.exists(logdir)) {
@ -636,54 +631,48 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
} }
try { try {
HMsg msgs[] = hbaseMaster.regionServerReport(serverInfo, outboundArray); HMsg msgs[] =
hbaseMaster.regionServerReport(serverInfo, outboundArray);
lastMsg = System.currentTimeMillis(); lastMsg = System.currentTimeMillis();
// Queue up the HMaster's instruction stream for processing // Queue up the HMaster's instruction stream for processing
synchronized(toDo) {
boolean restart = false; boolean restart = false;
for(int i = 0; i < msgs.length && !stopRequested && !restart; i++) { for(int i = 0; i < msgs.length && !stopRequested && !restart; i++) {
switch(msgs[i].getMsg()) { switch(msgs[i].getMsg()) {
case HMsg.MSG_CALL_SERVER_STARTUP: case HMsg.MSG_CALL_SERVER_STARTUP:
if (LOG.isDebugEnabled()) {
LOG.debug("Got call server startup message");
}
closeAllRegions();
restart = true;
break;
case HMsg.MSG_REGIONSERVER_STOP:
if (LOG.isDebugEnabled()) {
LOG.debug("Got regionserver stop message");
}
stopRequested = true;
break;
default:
if (LOG.isDebugEnabled()) {
LOG.debug("Got default message");
}
try {
toDo.put(new ToDoEntry(msgs[i]));
} catch (InterruptedException e) {
throw new RuntimeException("Putting into msgQueue was interrupted.", e);
}
}
}
if(restart || stopRequested) {
toDo.clear();
break;
}
if(toDo.size() > 0) {
if (LOG.isDebugEnabled()) { if (LOG.isDebugEnabled()) {
LOG.debug("notify on todo"); LOG.debug("Got call server startup message");
}
closeAllRegions();
restart = true;
break;
case HMsg.MSG_REGIONSERVER_STOP:
if (LOG.isDebugEnabled()) {
LOG.debug("Got regionserver stop message");
}
stopRequested = true;
break;
default:
if (LOG.isDebugEnabled()) {
LOG.debug("Got default message");
}
try {
toDo.put(new ToDoEntry(msgs[i]));
} catch (InterruptedException e) {
throw new RuntimeException("Putting into msgQueue was interrupted.", e);
} }
toDo.notifyAll();
} }
} }
if(restart || stopRequested) {
toDo.clear();
break;
}
} catch (IOException e) { } catch (IOException e) {
if (e instanceof RemoteException) { if (e instanceof RemoteException) {
try { try {
@ -730,7 +719,7 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
if (abortRequested) { if (abortRequested) {
try { try {
log.rollWriter(); log.close();
} catch (IOException e) { } catch (IOException e) {
if (e instanceof RemoteException) { if (e instanceof RemoteException) {
try { try {
@ -742,10 +731,11 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
} }
LOG.warn(e); LOG.warn(e);
} }
closeAllRegions(); // Don't leave any open file handles
LOG.info("aborting server at: " + LOG.info("aborting server at: " +
serverInfo.getServerAddress().toString()); serverInfo.getServerAddress().toString());
} else { } else {
Vector<HRegion> closedRegions = closeAllRegions(); ArrayList<HRegion> closedRegions = closeAllRegions();
try { try {
log.closeAndDelete(); log.closeAndDelete();
} catch (IOException e) { } catch (IOException e) {
@ -815,10 +805,12 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
* updated the meta or root regions, and the master will pick that up on its * updated the meta or root regions, and the master will pick that up on its
* next rescan of the root or meta tables. * next rescan of the root or meta tables.
*/ */
void reportSplit(HRegionInfo newRegionA, HRegionInfo newRegionB) { void reportSplit(HRegionInfo oldRegion, HRegionInfo newRegionA,
HRegionInfo newRegionB) {
synchronized(outboundMsgs) { synchronized(outboundMsgs) {
outboundMsgs.add(new HMsg(HMsg.MSG_NEW_REGION, newRegionA)); outboundMsgs.add(new HMsg(HMsg.MSG_REPORT_SPLIT, oldRegion));
outboundMsgs.add(new HMsg(HMsg.MSG_NEW_REGION, newRegionB)); outboundMsgs.add(new HMsg(HMsg.MSG_REPORT_OPEN, newRegionA));
outboundMsgs.add(new HMsg(HMsg.MSG_REPORT_OPEN, newRegionB));
} }
} }
@ -859,9 +851,7 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
continue; continue;
} }
try { try {
if (LOG.isDebugEnabled()) { LOG.info(e.msg.toString());
LOG.debug(e.msg.toString());
}
switch(e.msg.getMsg()) { switch(e.msg.getMsg()) {
@ -942,8 +932,8 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
} }
/** Called either when the master tells us to restart or from stop() */ /** Called either when the master tells us to restart or from stop() */
Vector<HRegion> closeAllRegions() { ArrayList<HRegion> closeAllRegions() {
Vector<HRegion> regionsToClose = new Vector<HRegion>(); ArrayList<HRegion> regionsToClose = new ArrayList<HRegion>();
this.lock.writeLock().lock(); this.lock.writeLock().lock();
try { try {
regionsToClose.addAll(onlineRegions.values()); regionsToClose.addAll(onlineRegions.values());
@ -956,7 +946,7 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
LOG.debug("closing region " + region.getRegionName()); LOG.debug("closing region " + region.getRegionName());
} }
try { try {
region.close(); region.close(abortRequested);
LOG.debug("region closed " + region.getRegionName()); LOG.debug("region closed " + region.getRegionName());
} catch (IOException e) { } catch (IOException e) {
if (e instanceof RemoteException) { if (e instanceof RemoteException) {

View File

@ -265,10 +265,11 @@ class HStore implements HConstants {
|| !key.getRegionName().equals(this.regionName) || !key.getRegionName().equals(this.regionName)
|| !HStoreKey.extractFamily(column).equals(this.familyName)) { || !HStoreKey.extractFamily(column).equals(this.familyName)) {
if (LOG.isDebugEnabled()) { if (LOG.isDebugEnabled()) {
LOG.debug("Passing on edit " + key.getRegionName() + ", " LOG.debug("Passing on edit " + key.getRegionName() + ", " +
+ column.toString() + ": " + new String(val.getVal()) column.toString() + ": " +
+ ", my region: " + this.regionName + ", my column: " new String(val.getVal(), UTF8_ENCODING) +
+ this.familyName); ", my region: " + this.regionName + ", my column: " +
this.familyName);
} }
continue; continue;
} }

View File

@ -21,14 +21,17 @@ package org.apache.hadoop.hbase;
import java.io.IOException; import java.io.IOException;
public class RegionNotFoundException extends IOException { /** Thrown when a table can not be located */
public class TableNotFoundException extends IOException {
private static final long serialVersionUID = 993179627856392526L; private static final long serialVersionUID = 993179627856392526L;
public RegionNotFoundException() { /** default constructor */
public TableNotFoundException() {
super(); super();
} }
public RegionNotFoundException(String s) { /** @param s message */
public TableNotFoundException(String s) {
super(s); super(s);
} }
} }

View File

@ -52,10 +52,27 @@
</property> </property>
<property> <property>
<name>hbase.regionserver.handler.count</name> <name>hbase.regionserver.handler.count</name>
<value>3</value> <value>5</value>
<description>Count of RPC Server instances spun up on RegionServers <description>Count of RPC Server instances spun up on RegionServers
Same property is used by the HMaster for count of master handlers. Same property is used by the HMaster for count of master handlers.
Default is 10. Default is 10.
</description> </description>
</property> </property>
<property>
<name>hbase.master.lease.period</name>
<value>5000</value>
<description>Length of time the master will wait before timing out a region
server lease. Since region servers report in every second (see above), this
value has been reduced so that the master will notice a dead region server
sooner. The default is 30 seconds.
</description>
</property>
<property>
<name>hbase.master.lease.thread.wakefrequency</name>
<value>2500</value>
<description>The interval between checks for expired region server leases.
This value has been reduced due to the other reduced values above so that
the master will notice a dead region server sooner. The default is 15 seconds.
</description>
</property>
</configuration> </configuration>

View File

@ -196,17 +196,24 @@ public class MiniHBaseCluster implements HConstants {
return master.getMasterAddress(); return master.getMasterAddress();
} }
/**
* Cause a region server to exit without cleaning up
*
* @param serverNumber
*/
public void abortRegionServer(int serverNumber) {
HRegionServer server = this.regionServers.remove(serverNumber);
server.abort();
}
/** /**
* Shut down the specified region server cleanly * Shut down the specified region server cleanly
* *
* @param serverNumber * @param serverNumber
*/ */
public void stopRegionServer(int serverNumber) { public void stopRegionServer(int serverNumber) {
if (serverNumber >= regionServers.size()) { HRegionServer server = this.regionServers.remove(serverNumber);
throw new ArrayIndexOutOfBoundsException( server.stop();
"serverNumber > number of region servers");
}
this.regionServers.get(serverNumber).stop();
} }
/** /**
@ -215,30 +222,14 @@ public class MiniHBaseCluster implements HConstants {
* @param serverNumber * @param serverNumber
*/ */
public void waitOnRegionServer(int serverNumber) { public void waitOnRegionServer(int serverNumber) {
if (serverNumber >= regionServers.size()) { Thread regionServerThread = this.regionThreads.remove(serverNumber);
throw new ArrayIndexOutOfBoundsException(
"serverNumber > number of region servers");
}
try { try {
this.regionThreads.get(serverNumber).join(); regionServerThread.join();
} catch (InterruptedException e) { } catch (InterruptedException e) {
e.printStackTrace(); e.printStackTrace();
} }
} }
/**
* Cause a region server to exit without cleaning up
*
* @param serverNumber
*/
public void abortRegionServer(int serverNumber) {
if(serverNumber >= this.regionServers.size()) {
throw new ArrayIndexOutOfBoundsException(
"serverNumber > number of region servers");
}
this.regionServers.get(serverNumber).abort();
}
/** Shut down the HBase cluster */ /** Shut down the HBase cluster */
public void shutdown() { public void shutdown() {
LOG.info("Shutting down the HBase Cluster"); LOG.info("Shutting down the HBase Cluster");

View File

@ -20,64 +20,78 @@
package org.apache.hadoop.hbase; package org.apache.hadoop.hbase;
import java.io.IOException; import java.io.IOException;
import java.util.TreeMap;
import org.apache.hadoop.io.Text;
/** /**
* Tests region server failover when a region server exits. * Tests region server failover when a region server exits.
*/ */
public class TestCleanRegionServerExit extends HBaseClusterTestCase { public class TestCleanRegionServerExit extends HBaseClusterTestCase {
private HClient client; private HClient client;
/** constructor */
public TestCleanRegionServerExit() {
super();
conf.setInt("ipc.client.timeout", 5000); // reduce ipc client timeout
conf.setInt("ipc.client.connect.max.retries", 5); // and number of retries
conf.setInt("hbase.client.retries.number", 2); // reduce HBase retries
}
/**
* {@inheritDoc}
*/
@Override @Override
public void setUp() throws Exception { public void setUp() throws Exception {
super.setUp(); super.setUp();
this.client = new HClient(conf); this.client = new HClient(conf);
} }
public void testCleanRegionServerExit() /**
throws IOException, InterruptedException { * The test
try { * @throws IOException
// When the META table can be opened, the region servers are running */
this.client.openTable(HConstants.META_TABLE_NAME); public void testCleanRegionServerExit() throws IOException {
// Put something into the meta table.
this.client.createTable(new HTableDescriptor(getName()));
// Get current region server instance.
HRegionServer hsr = this.cluster.regionServers.get(0);
Thread hrst = this.cluster.regionThreads.get(0);
// Start up a new one to take over serving of root and meta after we shut
// down the current meta/root host.
this.cluster.startRegionServer();
// Now shutdown the region server and wait for it to go down.
hsr.stop();
hrst.join();
// The recalibration of the client is not working properly. FIX.
// After above is fixed, add in assertions that we can get data from
// newly located meta table.
} catch(Exception e) {
e.printStackTrace();
fail();
}
}
/* Comment out till recalibration of client is working properly.
public void testRegionServerAbort()
throws IOException, InterruptedException {
// When the META table can be opened, the region servers are running // When the META table can be opened, the region servers are running
this.client.openTable(HConstants.META_TABLE_NAME); this.client.openTable(HConstants.META_TABLE_NAME);
// Put something into the meta table. // Put something into the meta table.
this.client.createTable(new HTableDescriptor(getName())); String tableName = getName();
// Get current region server instance. HTableDescriptor desc = new HTableDescriptor(tableName);
HRegionServer hsr = this.cluster.regionServers.get(0); desc.addFamily(new HColumnDescriptor(HConstants.COLUMN_FAMILY.toString()));
Thread hrst = this.cluster.regionThreads.get(0); this.client.createTable(desc);
// Start up a new one to take over serving of root and meta after we shut // put some values in the table
// down the current meta/root host. this.client.openTable(new Text(tableName));
Text row = new Text("row1");
long lockid = client.startUpdate(row);
client.put(lockid, HConstants.COLUMN_FAMILY,
tableName.getBytes(HConstants.UTF8_ENCODING));
client.commit(lockid);
// Start up a new region server to take over serving of root and meta
// after we shut down the current meta/root host.
this.cluster.startRegionServer(); this.cluster.startRegionServer();
// Force a region server to exit "ungracefully" // Now shutdown the region server and wait for it to go down.
hsr.abort(); this.cluster.stopRegionServer(0);
hrst.join(); this.cluster.waitOnRegionServer(0);
// The recalibration of the client is not working properly. FIX.
// After above is fixed, add in assertions that we can get data from // Verify that the client can find the data after the region has been moved
// newly located meta table. // to a different server
HScannerInterface scanner =
client.obtainScanner(HConstants.COLUMN_FAMILY_ARRAY, new Text());
try {
HStoreKey key = new HStoreKey();
TreeMap<Text, byte[]> results = new TreeMap<Text, byte[]>();
while (scanner.next(key, results)) {
assertTrue(key.getRow().equals(row));
assertEquals(1, results.size());
byte[] bytes = results.get(HConstants.COLUMN_FAMILY);
assertNotNull(bytes);
assertTrue(tableName.equals(new String(bytes, HConstants.UTF8_ENCODING)));
}
System.out.println("Success!");
} finally {
scanner.close();
}
} }
*/
} }

View File

@ -0,0 +1,100 @@
/**
* Copyright 2007 The Apache Software Foundation
*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.hbase;
import java.io.IOException;
import java.util.TreeMap;
import org.apache.hadoop.io.Text;
import org.apache.log4j.Level;
import org.apache.log4j.Logger;
/**
* Tests region server failover when a region server exits.
*/
public class TestRegionServerAbort extends HBaseClusterTestCase {
private HClient client;
/** constructor */
public TestRegionServerAbort() {
super();
conf.setInt("ipc.client.timeout", 5000); // reduce client timeout
conf.setInt("ipc.client.connect.max.retries", 5); // and number of retries
conf.setInt("hbase.client.retries.number", 2); // reduce HBase retries
// Logger.getLogger(this.getClass().getPackage().getName()).setLevel(Level.DEBUG);
}
/**
* {@inheritDoc}
*/
@Override
public void setUp() throws Exception {
super.setUp();
this.client = new HClient(conf);
}
/**
* The test
* @throws IOException
*/
public void testRegionServerAbort() throws IOException {
// When the META table can be opened, the region servers are running
this.client.openTable(HConstants.META_TABLE_NAME);
// Put something into the meta table.
String tableName = getName();
HTableDescriptor desc = new HTableDescriptor(tableName);
desc.addFamily(new HColumnDescriptor(HConstants.COLUMN_FAMILY.toString()));
this.client.createTable(desc);
// put some values in the table
this.client.openTable(new Text(tableName));
Text row = new Text("row1");
long lockid = client.startUpdate(row);
client.put(lockid, HConstants.COLUMN_FAMILY,
tableName.getBytes(HConstants.UTF8_ENCODING));
client.commit(lockid);
// Start up a new region server to take over serving of root and meta
// after we shut down the current meta/root host.
this.cluster.startRegionServer();
// Now shutdown the region server and wait for it to go down.
this.cluster.abortRegionServer(0);
this.cluster.waitOnRegionServer(0);
// Verify that the client can find the data after the region has been moved
// to a different server
HScannerInterface scanner =
client.obtainScanner(HConstants.COLUMN_FAMILY_ARRAY, new Text());
try {
HStoreKey key = new HStoreKey();
TreeMap<Text, byte[]> results = new TreeMap<Text, byte[]>();
while (scanner.next(key, results)) {
assertTrue(key.getRow().equals(row));
assertEquals(1, results.size());
byte[] bytes = results.get(HConstants.COLUMN_FAMILY);
assertNotNull(bytes);
assertTrue(tableName.equals(new String(bytes, HConstants.UTF8_ENCODING)));
}
System.out.println("Success!");
} finally {
scanner.close();
}
}
}