HADOOP-1516 HClient fails to readjust when ROOT or META redeployed on new region server
Detailed changes:

MiniHBaseCluster
- rewrite abortRegionServer, stopRegionServer - they now remove the server from the map of servers
- rewrite waitOnRegionServer - now removes thread from map of threads

TestCleanRegionServerExit
- reduce Hadoop ipc client timeout and number of retries
- use rewritten stopRegionServer and waitOnRegionServer from MiniHBaseCluster
- add code to verify that failover worked
- moved testRegionServerAbort to separate test file

TestRegionServerAbort
- new test. Uses much the same code as TestCleanRegionServerExit but aborts the region server instead of shutting it down cleanly. Includes code to verify that failover worked.

hbase-site.xml (in src/contrib/hbase/src/test)
- reduce master lease timeout and time between lease timeout checks so that tests will run quicker

HClient
- Major restructuring of the code that determines which region server to contact for a specific region. The main method, findServersForTable, is now recursive so that it will find the meta and root regions if they have not already been located, or will re-find them if they have been reassigned and the old server can no longer be contacted.
- re-ordered administrative and general-purpose methods so they are no longer in seemingly random order
- re-ordered code in ClientScanner.loadRegions so that if the location of a region changes, it will actually try to connect to the new server rather than continually trying to use the connection to the old server

HLog
- use HashMap<Text, SequenceFile.Writer> instead of TreeMap<Text, SequenceFile.Writer> because the TreeMap would return a value for a key it did not have (it was the value of another key). I have observed this before when the key is Text, but could not create a simple test case that reproduces the problem.
- added some new DEBUG level logging
- removed call to rollWriter() from closeAndDelete(). We don't need to start a new writer if we are closing the log.

HLogKey
- cleaned up per HADOOP-1466 (I initially modified it to add some debug logging which was later removed, but while making the modifications I took the opportunity to clean up the file)
- changed toString() format

HMaster
- better handling of RemoteException
- modified BaseScanner - it now knows whether it is scanning the root region or a meta region; scanRegion no longer returns a value; if scanning the root region, it counts the number of meta regions it finds and sets a new AtomicInteger, numberOfMetaRegions, when the scan is complete; added abstract methods initialScan and maintenanceScan, which allowed the run method to be implemented in the base class; boolean rootScanned is now volatile
- modified RootScanner - moved the actual scan into a private method (scanRoot) for readability; implementations of the abstract methods just call scanRoot
- added a constructor for the inner static class MetaRegion
- use a BlockingQueue to queue up work for the MetaScanner
- cleaned up handling of an unexpected region server exit
- PendingOperation.process now returns a boolean so that HMaster.run can determine whether the operation completed or needs to be retried later
- PendingOperation processing no longer waits inside the process method, since this could cause a deadlock if the current operation is waiting for another operation that has yet to be processed

HMsg
- removed MSG_REGIONSERVER_STOP_IN_ARRAY, MSG_NEW_REGION
- added MSG_REPORT_SPLIT

HRegionServer
- changed reportSplit to contain the old region as well as the new regions
- use the IP from the default interface rather than the host name
- abort calls HLog.close() instead of HLog.rollWriter()

git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk/src/contrib/hbase@559819 13f79535-47bb-0310-9956-ffa450edef68
This commit is contained in:
parent
43e253359a
commit
5f850fcee4
@@ -73,3 +73,5 @@ Trunk (unreleased changes)
     (Izaak Rubin via Stack)
 47. HADOOP-1637 Fix to HScanner to Support Filters, Add Filter Tests to
     TestScanner2 (Izaak Rubin via Stack)
+48. HADOOP-1516 HClient fails to readjust when ROOT or META redeployed on new
+    region server

File diff suppressed because it is too large
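The HClient diff itself is suppressed above as too large, but the commit message describes its key idea: findServersForTable is recursive, because locating a user region requires locating META first, and locating META requires locating ROOT; a stale cached location is discarded and re-found by the same recursive call. A minimal standalone sketch of that lookup shape, with plain Strings and Maps standing in for HClient's real types and an illustrative `findServer` name (not HClient's actual API):

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch of the recursive region-location lookup described in the commit
 * message. All names here are illustrative stand-ins: `catalog` plays the
 * role of scanning the parent catalog region, `cache` the client's cached
 * locations.
 */
public class RegionLookupSketch {
  static final String ROOT = "-ROOT-";
  static final String META = ".META.";
  // parent catalog table of each table: user tables are found via META, META via ROOT
  static final Map<String, String> parent = new HashMap<>();
  // simulated cache of table -> server address
  static final Map<String, String> cache = new HashMap<>();
  // simulated "authoritative" answer a scan of the parent region would return
  static final Map<String, String> catalog = new HashMap<>();

  static String findServer(String table) {
    String cached = cache.get(table);
    if (cached != null) {
      return cached; // already located (may later prove stale; caller would evict and retry)
    }
    if (!table.equals(ROOT)) {
      // Recurse: make sure the parent (META or ROOT) is located first,
      // then "scan" it for this table's server.
      findServer(parent.getOrDefault(table, META));
    }
    String server = catalog.get(table);
    cache.put(table, server);
    return server;
  }

  public static void main(String[] args) {
    parent.put(META, ROOT);
    catalog.put(ROOT, "host1:60020");
    catalog.put(META, "host1:60020");
    catalog.put("mytable", "host2:60020");
    System.out.println(findServer("mytable"));
  }
}
```

On a reassignment, the real client would evict the stale entry and the same recursive call re-finds ROOT/META as needed; this sketch only shows the top-down recursion, not the retry/eviction handling.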
@@ -97,12 +97,11 @@ public class HLog implements HConstants {
   static void splitLog(Path rootDir, Path srcDir, FileSystem fs,
       Configuration conf) throws IOException {
     Path logfiles[] = fs.listPaths(srcDir);
-    if(LOG.isDebugEnabled()) {
-      LOG.debug("splitting " + logfiles.length + " log files in " +
-        srcDir.toString());
-    }
-    TreeMap<Text, SequenceFile.Writer> logWriters =
-      new TreeMap<Text, SequenceFile.Writer>();
+    LOG.info("splitting " + logfiles.length + " log files in " +
+      srcDir.toString());
+
+    HashMap<Text, SequenceFile.Writer> logWriters =
+      new HashMap<Text, SequenceFile.Writer>();
     try {
       for(int i = 0; i < logfiles.length; i++) {
         SequenceFile.Reader in =
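The hunk above swaps the per-region writer map from TreeMap to HashMap; each edit is routed to a lazily created writer keyed by its region name. A minimal standalone sketch of that keyed-writer routing, with plain Strings standing in for Text keys and StringBuilder standing in for SequenceFile.Writer (both stand-ins, not the real Hadoop types):

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch of the per-region writer map used by HLog.splitLog: edits are
 * routed to one writer per region, created on first use, exactly the
 * get-or-create pattern of the real HashMap<Text, SequenceFile.Writer>.
 */
public class SplitLogSketch {
  static final Map<String, StringBuilder> writers = new HashMap<>();

  /** Get (or lazily create) the writer for a region, then append the edit. */
  static void append(String regionName, String edit) {
    StringBuilder w = writers.get(regionName);
    if (w == null) {
      w = new StringBuilder(); // stands in for SequenceFile.createWriter(...)
      writers.put(regionName, w);
    }
    w.append(edit).append('\n');
  }

  public static void main(String[] args) {
    append("regionA", "edit-1");
    append("regionB", "edit-2");
    append("regionA", "edit-3");
    // Each region's edits end up only in that region's writer.
    System.out.println(writers.get("regionA").toString().trim());
  }
}
```

With a plain hash map, a lookup for an absent region name can only return null (triggering writer creation), which is the behavior the commit message says the TreeMap<Text, …> failed to provide reliably.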
@@ -116,6 +115,9 @@ public class HLog implements HConstants {
         if (w == null) {
           Path logfile = new Path(HStoreFile.getHRegionDir(rootDir,
             regionName), HREGION_OLDLOGFILE_NAME);
+          if (LOG.isDebugEnabled()) {
+            LOG.debug("getting new log file writer for path " + logfile);
+          }
           w = SequenceFile.createWriter(fs, conf, logfile, HLogKey.class,
             HLogEdit.class);
           logWriters.put(regionName, w);
@@ -140,9 +142,7 @@ public class HLog implements HConstants {
         }
       }
     }
-    if(LOG.isDebugEnabled()) {
-      LOG.debug("log file splitting completed for " + srcDir.toString());
-    }
+    LOG.info("log file splitting completed for " + srcDir.toString());
   }

   /**
@@ -205,8 +205,6 @@ public class HLog implements HConstants {
 //        continue;
         }
       }
-
-

       // Close the current writer (if any), and grab a new one.
       if(writer != null) {
@@ -280,7 +278,6 @@ public class HLog implements HConstants {
    * @throws IOException
    */
   synchronized void closeAndDelete() throws IOException {
-    rollWriter();
     close();
     fs.delete(dir);
   }
@@ -23,13 +23,13 @@ import org.apache.hadoop.io.*;

 import java.io.*;

-/*******************************************************************************
+/**
  * A Key for an entry in the change log.
  *
  * The log intermingles edits to many tables and rows, so each log entry
  * identifies the appropriate table and row. Within a table and row, they're
  * also sorted.
- ******************************************************************************/
+ */
 public class HLogKey implements WritableComparable {
   Text regionName = new Text();
   Text tablename = new Text();
@@ -79,16 +79,25 @@ public class HLogKey implements WritableComparable {
     return logSeqNum;
   }

+  /**
+   * {@inheritDoc}
+   */
   @Override
   public String toString() {
-    return tablename + " " + regionName + " " + row + " " + logSeqNum;
+    return tablename + "," + regionName + "," + row + "," + logSeqNum;
   }

+  /**
+   * {@inheritDoc}
+   */
   @Override
   public boolean equals(Object obj) {
     return compareTo(obj) == 0;
   }

+  /**
+   * {@inheritDoc}
+   */
   @Override
   public int hashCode() {
     int result = this.regionName.hashCode();
@@ -97,12 +106,12 @@ public class HLogKey implements WritableComparable {
     return result;
   }

-  //////////////////////////////////////////////////////////////////////////////
+  //
   // Comparable
-  //////////////////////////////////////////////////////////////////////////////
+  //

-  /* (non-Javadoc)
-   * @see java.lang.Comparable#compareTo(java.lang.Object)
+  /**
+   * {@inheritDoc}
    */
   public int compareTo(Object o) {
     HLogKey other = (HLogKey) o;
@@ -124,12 +133,12 @@ public class HLogKey implements WritableComparable {
     return result;
   }

-  //////////////////////////////////////////////////////////////////////////////
+  //
   // Writable
-  //////////////////////////////////////////////////////////////////////////////
+  //

-  /* (non-Javadoc)
-   * @see org.apache.hadoop.io.Writable#write(java.io.DataOutput)
+  /**
+   * {@inheritDoc}
    */
   public void write(DataOutput out) throws IOException {
     this.regionName.write(out);
@@ -138,8 +147,8 @@ public class HLogKey implements WritableComparable {
     out.writeLong(logSeqNum);
   }

-  /* (non-Javadoc)
-   * @see org.apache.hadoop.io.Writable#readFields(java.io.DataInput)
+  /**
+   * {@inheritDoc}
    */
   public void readFields(DataInput in) throws IOException {
     this.regionName.readFields(in);
File diff suppressed because it is too large
@@ -42,10 +42,7 @@ public class HMsg implements Writable {

   /** Master tells region server to stop */
   public static final byte MSG_REGIONSERVER_STOP = 5;
-
-  public static final HMsg [] MSG_REGIONSERVER_STOP_IN_ARRAY =
-    {new HMsg(HMsg.MSG_REGIONSERVER_STOP)};

   /** Stop serving the specified region and don't report back that it's closed */
   public static final byte MSG_REGION_CLOSE_WITHOUT_REPORT = 6;
@@ -57,10 +54,20 @@ public class HMsg implements Writable {
   /** region server is no longer serving the specified region */
   public static final byte MSG_REPORT_CLOSE = 101;

-  /** region server is now serving a region produced by a region split */
-  public static final byte MSG_NEW_REGION = 103;
+  /**
+   * region server split the region associated with this message.
+   *
+   * note that this message is immediately followed by two MSG_REPORT_OPEN
+   * messages, one for each of the new regions resulting from the split
+   */
+  public static final byte MSG_REPORT_SPLIT = 103;

-  /** region server is shutting down */
+  /**
+   * region server is shutting down
+   *
+   * note that this message is followed by MSG_REPORT_CLOSE messages for each
+   * region the region server was serving.
+   */
   public static final byte MSG_REPORT_EXITING = 104;

   byte msg;
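The new message constants above define a small protocol: a split is announced as one MSG_REPORT_SPLIT naming the old (parent) region, immediately followed by a MSG_REPORT_OPEN for each daughter region. A standalone sketch of that ordering, with String payloads standing in for HRegionInfo and only the two relevant message codes mirrored (MSG_REPORT_OPEN's exact byte value here is an assumption for illustration):

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of the outbound message sequence the new reportSplit queues:
 * MSG_REPORT_SPLIT for the parent region, then MSG_REPORT_OPEN for each
 * daughter, in that order, so the master can correlate them.
 */
public class SplitReportSketch {
  static final byte MSG_REPORT_OPEN = 100;  // assumed value for the sketch
  static final byte MSG_REPORT_SPLIT = 103; // matches the diff above

  static final class Msg {
    final byte code;
    final String region;
    Msg(byte code, String region) { this.code = code; this.region = region; }
  }

  /** Queue the three messages in the order the master expects. */
  static List<Msg> reportSplit(String oldRegion, String daughterA, String daughterB) {
    List<Msg> outbound = new ArrayList<>();
    outbound.add(new Msg(MSG_REPORT_SPLIT, oldRegion));
    outbound.add(new Msg(MSG_REPORT_OPEN, daughterA));
    outbound.add(new Msg(MSG_REPORT_OPEN, daughterB));
    return outbound;
  }

  public static void main(String[] args) {
    for (Msg m : reportSplit("table,old", "table,a", "table,b")) {
      System.out.println(m.code + " " + m.region);
    }
  }
}
```

Sending the parent region first lets the master remove the old META row before the two open reports add the daughters, which is why the real reportSplit (changed later in this commit) now takes the old region as its first argument.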
@@ -108,6 +115,9 @@ public class HMsg implements Writable {
     return info;
   }

+  /**
+   * {@inheritDoc}
+   */
   @Override
   public String toString() {
     StringBuilder message = new StringBuilder();
@@ -140,8 +150,8 @@ public class HMsg implements Writable {
       message.append("MSG_REPORT_CLOSE : ");
       break;

-    case MSG_NEW_REGION:
-      message.append("MSG_NEW_REGION : ");
+    case MSG_REPORT_SPLIT:
+      message.append("MSG_REGION_SPLIT : ");
       break;

     case MSG_REPORT_EXITING:
@@ -162,16 +172,16 @@ public class HMsg implements Writable {
   // Writable
   //////////////////////////////////////////////////////////////////////////////

-  /* (non-Javadoc)
-   * @see org.apache.hadoop.io.Writable#write(java.io.DataOutput)
+  /**
+   * {@inheritDoc}
    */
   public void write(DataOutput out) throws IOException {
     out.writeByte(msg);
     info.write(out);
   }

-  /* (non-Javadoc)
-   * @see org.apache.hadoop.io.Writable#readFields(java.io.DataInput)
+  /**
+   * {@inheritDoc}
    */
   public void readFields(DataInput in) throws IOException {
     this.msg = in.readByte();
@@ -407,6 +407,23 @@ public class HRegion implements HConstants {
    * @throws IOException
    */
   public Vector<HStoreFile> close() throws IOException {
+    return close(false);
+  }
+
+  /**
+   * Close down this HRegion. Flush the cache unless abort parameter is true,
+   * Shut down each HStore, don't service any more calls.
+   *
+   * This method could take some time to execute, so don't call it from a
+   * time-sensitive thread.
+   *
+   * @param abort true if server is aborting (only during testing)
+   * @return Vector of all the storage files that the HRegion's component
+   * HStores make use of. It's a list of HStoreFile objects.
+   *
+   * @throws IOException
+   */
+  Vector<HStoreFile> close(boolean abort) throws IOException {
     lock.obtainWriteLock();
     try {
       boolean shouldClose = false;
@@ -430,7 +447,11 @@ public class HRegion implements HConstants {
         return null;
       }
       LOG.info("closing region " + this.regionInfo.regionName);
-      Vector<HStoreFile> allHStoreFiles = internalFlushcache();
+      Vector<HStoreFile> allHStoreFiles = null;
+      if (!abort) {
+        // Don't flush the cache if we are aborting during a test.
+        allHStoreFiles = internalFlushcache();
+      }
       for (HStore store: stores.values()) {
         store.close();
       }
@@ -194,7 +194,7 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {

   private void split(final HRegion region, final Text midKey)
       throws IOException {
-    final Text oldRegion = region.getRegionName();
+    final HRegionInfo oldRegionInfo = region.getRegionInfo();
     final HRegion[] newRegions = region.closeAndSplit(midKey, this);

     // When a region is split, the META table needs to updated if we're
@@ -204,9 +204,7 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
     final Text tableToUpdate =
       region.getRegionInfo().tableDesc.getName().equals(META_TABLE_NAME) ?
         ROOT_TABLE_NAME : META_TABLE_NAME;
-    if(LOG.isDebugEnabled()) {
-      LOG.debug("Updating " + tableToUpdate + " with region split info");
-    }
+    LOG.info("Updating " + tableToUpdate + " with region split info");

     // Remove old region from META
@@ -249,11 +247,11 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
     if (LOG.isDebugEnabled()) {
       LOG.debug("Reporting region split to master");
     }
-    reportSplit(newRegions[0].getRegionInfo(), newRegions[1].
-        getRegionInfo());
+    reportSplit(oldRegionInfo, newRegions[0].getRegionInfo(),
+        newRegions[1].getRegionInfo());
     LOG.info("region split, META update, and report to master all" +
-        " successful. Old region=" + oldRegion + ", new regions: " +
-        newRegions[0].getRegionName() + ", " +
+        " successful. Old region=" + oldRegionInfo.getRegionName() +
+        ", new regions: " + newRegions[0].getRegionName() + ", " +
         newRegions[1].getRegionName());

     // Finally, start serving the new regions
@@ -262,6 +260,7 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
     try {
       onlineRegions.put(newRegions[0].getRegionName(), newRegions[0]);
       onlineRegions.put(newRegions[1].getRegionName(), newRegions[1]);
+
     } finally {
       lock.writeLock().unlock();
     }
@@ -461,23 +460,19 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
       address.getPort(), conf.getInt("hbase.regionserver.handler.count", 10),
       false, conf);

-    // Use configured nameserver & interface to get local hostname.
-    // 'serverInfo' is sent to master. Should have name of this host rather than
-    // 'localhost' or 0.0.0.0 or 127.0.0.1 in it.
-    String localHostname = DNS.getDefaultHost(
-        conf.get("dfs.datanode.dns.interface","default"),
-        conf.get("dfs.datanode.dns.nameserver","default"));
-    InetSocketAddress hostnameAddress = new InetSocketAddress(localHostname,
-        server.getListenerAddress().getPort());
-    this.serverInfo = new HServerInfo(new HServerAddress(hostnameAddress),
-        this.rand.nextLong());
+    // Use interface to get the 'real' IP for this host.
+    // 'serverInfo' is sent to master. Should have the real IP of this host
+    // rather than 'localhost' or 0.0.0.0 or 127.0.0.1 in it.
+    String realIP = DNS.getDefaultIP(
+        conf.get("dfs.datanode.dns.interface","default"));

-    // Local file paths
-    String serverName = localHostname + "_" +
-      this.serverInfo.getServerAddress().getPort();
+    this.serverInfo = new HServerInfo(new HServerAddress(
+        new InetSocketAddress(realIP, server.getListenerAddress().getPort())),
+        this.rand.nextLong());

-    Path logdir = new Path(rootDir, "log" + "_" + serverName);
+    Path logdir = new Path(rootDir, "log" + "_" + realIP + "_" +
+        this.serverInfo.getServerAddress().getPort());

     // Logging
     this.fs = FileSystem.get(conf);
     if(fs.exists(logdir)) {
@@ -636,54 +631,48 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
       }

       try {
-        HMsg msgs[] = hbaseMaster.regionServerReport(serverInfo, outboundArray);
+        HMsg msgs[] =
+          hbaseMaster.regionServerReport(serverInfo, outboundArray);
         lastMsg = System.currentTimeMillis();

-        // Queue up the HMaster's instruction stream for processing
-        synchronized(toDo) {
-          boolean restart = false;
-          for(int i = 0; i < msgs.length && !stopRequested && !restart; i++) {
-            switch(msgs[i].getMsg()) {
-
-            case HMsg.MSG_CALL_SERVER_STARTUP:
-              if (LOG.isDebugEnabled()) {
-                LOG.debug("Got call server startup message");
-              }
-              closeAllRegions();
-              restart = true;
-              break;
-
-            case HMsg.MSG_REGIONSERVER_STOP:
-              if (LOG.isDebugEnabled()) {
-                LOG.debug("Got regionserver stop message");
-              }
-              stopRequested = true;
-              break;
-
-            default:
-              if (LOG.isDebugEnabled()) {
-                LOG.debug("Got default message");
-              }
-              try {
-                toDo.put(new ToDoEntry(msgs[i]));
-              } catch (InterruptedException e) {
-                throw new RuntimeException("Putting into msgQueue was interrupted.", e);
-              }
-            }
-          }
-
-          if(restart || stopRequested) {
-            toDo.clear();
-            break;
-          }
-
-          if(toDo.size() > 0) {
-            if (LOG.isDebugEnabled()) {
-              LOG.debug("notify on todo");
-            }
-            toDo.notifyAll();
-          }
-        }
+        boolean restart = false;
+        for(int i = 0; i < msgs.length && !stopRequested && !restart; i++) {
+          switch(msgs[i].getMsg()) {
+
+          case HMsg.MSG_CALL_SERVER_STARTUP:
+            if (LOG.isDebugEnabled()) {
+              LOG.debug("Got call server startup message");
+            }
+            closeAllRegions();
+            restart = true;
+            break;
+
+          case HMsg.MSG_REGIONSERVER_STOP:
+            if (LOG.isDebugEnabled()) {
+              LOG.debug("Got regionserver stop message");
+            }
+            stopRequested = true;
+            break;
+
+          default:
+            if (LOG.isDebugEnabled()) {
+              LOG.debug("Got default message");
+            }
+            try {
+              toDo.put(new ToDoEntry(msgs[i]));
+            } catch (InterruptedException e) {
+              throw new RuntimeException("Putting into msgQueue was interrupted.", e);
+            }
+          }
+        }
+
+        if(restart || stopRequested) {
+          toDo.clear();
+          break;
+        }

       } catch (IOException e) {
         if (e instanceof RemoteException) {
           try {
@@ -730,7 +719,7 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {

     if (abortRequested) {
       try {
-        log.rollWriter();
+        log.close();
       } catch (IOException e) {
         if (e instanceof RemoteException) {
           try {
@@ -742,10 +731,11 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
         }
         LOG.warn(e);
       }
+      closeAllRegions(); // Don't leave any open file handles
       LOG.info("aborting server at: " +
           serverInfo.getServerAddress().toString());
     } else {
-      Vector<HRegion> closedRegions = closeAllRegions();
+      ArrayList<HRegion> closedRegions = closeAllRegions();
       try {
         log.closeAndDelete();
       } catch (IOException e) {
@@ -815,10 +805,12 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
    * updated the meta or root regions, and the master will pick that up on its
    * next rescan of the root or meta tables.
    */
-  void reportSplit(HRegionInfo newRegionA, HRegionInfo newRegionB) {
+  void reportSplit(HRegionInfo oldRegion, HRegionInfo newRegionA,
+      HRegionInfo newRegionB) {
     synchronized(outboundMsgs) {
-      outboundMsgs.add(new HMsg(HMsg.MSG_NEW_REGION, newRegionA));
-      outboundMsgs.add(new HMsg(HMsg.MSG_NEW_REGION, newRegionB));
+      outboundMsgs.add(new HMsg(HMsg.MSG_REPORT_SPLIT, oldRegion));
+      outboundMsgs.add(new HMsg(HMsg.MSG_REPORT_OPEN, newRegionA));
+      outboundMsgs.add(new HMsg(HMsg.MSG_REPORT_OPEN, newRegionB));
     }
   }
@@ -859,9 +851,7 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
         continue;
       }
       try {
-        if (LOG.isDebugEnabled()) {
-          LOG.debug(e.msg.toString());
-        }
+        LOG.info(e.msg.toString());

         switch(e.msg.getMsg()) {
@@ -942,8 +932,8 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
   }

   /** Called either when the master tells us to restart or from stop() */
-  Vector<HRegion> closeAllRegions() {
-    Vector<HRegion> regionsToClose = new Vector<HRegion>();
+  ArrayList<HRegion> closeAllRegions() {
+    ArrayList<HRegion> regionsToClose = new ArrayList<HRegion>();
     this.lock.writeLock().lock();
     try {
       regionsToClose.addAll(onlineRegions.values());
@@ -956,7 +946,7 @@ public class HRegionServer implements HConstants, HRegionInterface, Runnable {
         LOG.debug("closing region " + region.getRegionName());
       }
       try {
-        region.close();
+        region.close(abortRequested);
         LOG.debug("region closed " + region.getRegionName());
       } catch (IOException e) {
         if (e instanceof RemoteException) {
@@ -265,10 +265,11 @@ class HStore implements HConstants {
           || !key.getRegionName().equals(this.regionName)
           || !HStoreKey.extractFamily(column).equals(this.familyName)) {
         if (LOG.isDebugEnabled()) {
-          LOG.debug("Passing on edit " + key.getRegionName() + ", "
-              + column.toString() + ": " + new String(val.getVal())
-              + ", my region: " + this.regionName + ", my column: "
-              + this.familyName);
+          LOG.debug("Passing on edit " + key.getRegionName() + ", " +
+              column.toString() + ": " +
+              new String(val.getVal(), UTF8_ENCODING) +
+              ", my region: " + this.regionName + ", my column: " +
+              this.familyName);
         }
         continue;
       }
@@ -21,14 +21,17 @@ package org.apache.hadoop.hbase;

 import java.io.IOException;

-public class RegionNotFoundException extends IOException {
+/** Thrown when a table can not be located */
+public class TableNotFoundException extends IOException {
   private static final long serialVersionUID = 993179627856392526L;

-  public RegionNotFoundException() {
+  /** default constructor */
+  public TableNotFoundException() {
     super();
   }

-  public RegionNotFoundException(String s) {
+  /** @param s message */
+  public TableNotFoundException(String s) {
     super(s);
   }
 }
@@ -52,10 +52,27 @@
   </property>
   <property>
     <name>hbase.regionserver.handler.count</name>
-    <value>3</value>
+    <value>5</value>
     <description>Count of RPC Server instances spun up on RegionServers
     Same property is used by the HMaster for count of master handlers.
     Default is 10.
     </description>
   </property>
+  <property>
+    <name>hbase.master.lease.period</name>
+    <value>5000</value>
+    <description>Length of time the master will wait before timing out a region
+    server lease. Since region servers report in every second (see above), this
+    value has been reduced so that the master will notice a dead region server
+    sooner. The default is 30 seconds.
+    </description>
+  </property>
+  <property>
+    <name>hbase.master.lease.thread.wakefrequency</name>
+    <value>2500</value>
+    <description>The interval between checks for expired region server leases.
+    This value has been reduced due to the other reduced values above so that
+    the master will notice a dead region server sooner. The default is 15 seconds.
+    </description>
+  </property>
 </configuration>
@@ -196,17 +196,24 @@ public class MiniHBaseCluster implements HConstants {
     return master.getMasterAddress();
   }

+  /**
+   * Cause a region server to exit without cleaning up
+   *
+   * @param serverNumber
+   */
+  public void abortRegionServer(int serverNumber) {
+    HRegionServer server = this.regionServers.remove(serverNumber);
+    server.abort();
+  }
+
   /**
    * Shut down the specified region server cleanly
    *
    * @param serverNumber
    */
   public void stopRegionServer(int serverNumber) {
-    if (serverNumber >= regionServers.size()) {
-      throw new ArrayIndexOutOfBoundsException(
-        "serverNumber > number of region servers");
-    }
-    this.regionServers.get(serverNumber).stop();
+    HRegionServer server = this.regionServers.remove(serverNumber);
+    server.stop();
   }

   /**
@@ -215,30 +222,14 @@ public class MiniHBaseCluster implements HConstants {
    * @param serverNumber
    */
   public void waitOnRegionServer(int serverNumber) {
-    if (serverNumber >= regionServers.size()) {
-      throw new ArrayIndexOutOfBoundsException(
-        "serverNumber > number of region servers");
-    }
+    Thread regionServerThread = this.regionThreads.remove(serverNumber);
     try {
-      this.regionThreads.get(serverNumber).join();
+      regionServerThread.join();
     } catch (InterruptedException e) {
      e.printStackTrace();
     }
   }

-  /**
-   * Cause a region server to exit without cleaning up
-   *
-   * @param serverNumber
-   */
-  public void abortRegionServer(int serverNumber) {
-    if(serverNumber >= this.regionServers.size()) {
-      throw new ArrayIndexOutOfBoundsException(
-        "serverNumber > number of region servers");
-    }
-    this.regionServers.get(serverNumber).abort();
-  }
-
   /** Shut down the HBase cluster */
   public void shutdown() {
     LOG.info("Shutting down the HBase Cluster");
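The MiniHBaseCluster hunks above change stop/abort/wait from `get(serverNumber)` to `remove(serverNumber)`: a stopped server (and its thread) leaves the list, so a later call with the same index addresses the next live server rather than the dead one. A standalone sketch of that remove-then-join pattern, with a plain Thread list standing in for the cluster's region server threads (names here are illustrative, not MiniHBaseCluster's real fields):

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of the rewritten MiniHBaseCluster helpers: waiting on a server
 * removes its thread from the list first, then joins it, so the remaining
 * indices always refer to servers that are still tracked.
 */
public class ClusterShutdownSketch {
  final List<Thread> regionThreads = new ArrayList<>();

  void startRegionServer(Runnable work) {
    Thread t = new Thread(work);
    t.start();
    regionThreads.add(t);
  }

  /** Remove the thread from the list, then join it. */
  void waitOnRegionServer(int serverNumber) {
    Thread t = regionThreads.remove(serverNumber);
    try {
      t.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    ClusterShutdownSketch cluster = new ClusterShutdownSketch();
    cluster.startRegionServer(() -> { });
    cluster.startRegionServer(() -> { });
    cluster.waitOnRegionServer(0);
    // After removal, index 0 now names what used to be server 1.
    System.out.println(cluster.regionThreads.size());
  }
}
```

Note the side effect the tests in this commit rely on: after `stopRegionServer(0)` / `waitOnRegionServer(0)`, index 0 refers to the replacement server that was started before the shutdown.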
@@ -20,64 +20,78 @@
 package org.apache.hadoop.hbase;

 import java.io.IOException;
+import java.util.TreeMap;
+
+import org.apache.hadoop.io.Text;

 /**
  * Tests region server failover when a region server exits.
  */
 public class TestCleanRegionServerExit extends HBaseClusterTestCase {
   private HClient client;

+  /** constructor */
   public TestCleanRegionServerExit() {
+    super();
     conf.setInt("ipc.client.timeout", 5000);              // reduce ipc client timeout
     conf.setInt("ipc.client.connect.max.retries", 5);     // and number of retries
+    conf.setInt("hbase.client.retries.number", 2);        // reduce HBase retries
   }

+  /**
+   * {@inheritDoc}
+   */
   @Override
   public void setUp() throws Exception {
     super.setUp();
     this.client = new HClient(conf);
   }

-  public void testCleanRegionServerExit()
-  throws IOException, InterruptedException {
-    try {
-      // When the META table can be opened, the region servers are running
-      this.client.openTable(HConstants.META_TABLE_NAME);
-      // Put something into the meta table.
-      this.client.createTable(new HTableDescriptor(getName()));
-      // Get current region server instance.
-      HRegionServer hsr = this.cluster.regionServers.get(0);
-      Thread hrst = this.cluster.regionThreads.get(0);
-      // Start up a new one to take over serving of root and meta after we shut
-      // down the current meta/root host.
-      this.cluster.startRegionServer();
-      // Now shutdown the region server and wait for it to go down.
-      hsr.stop();
-      hrst.join();
-      // The recalibration of the client is not working properly. FIX.
-      // After above is fixed, add in assertions that we can get data from
-      // newly located meta table.
-    } catch(Exception e) {
-      e.printStackTrace();
-      fail();
-    }
-  }
-
-  /* Comment out till recalibration of client is working properly.
-
-  public void testRegionServerAbort()
-  throws IOException, InterruptedException {
-    // Force a region server to exit "ungracefully"
-    hsr.abort();
-    hrst.join();
-  }
-  */
+  /**
+   * The test
+   * @throws IOException
+   */
+  public void testCleanRegionServerExit() throws IOException {
+    // When the META table can be opened, the region servers are running
+    this.client.openTable(HConstants.META_TABLE_NAME);
+    // Put something into the meta table.
+    String tableName = getName();
+    HTableDescriptor desc = new HTableDescriptor(tableName);
+    desc.addFamily(new HColumnDescriptor(HConstants.COLUMN_FAMILY.toString()));
+    this.client.createTable(desc);
+    // put some values in the table
+    this.client.openTable(new Text(tableName));
+    Text row = new Text("row1");
+    long lockid = client.startUpdate(row);
+    client.put(lockid, HConstants.COLUMN_FAMILY,
+        tableName.getBytes(HConstants.UTF8_ENCODING));
+    client.commit(lockid);
+    // Start up a new region server to take over serving of root and meta
+    // after we shut down the current meta/root host.
+    this.cluster.startRegionServer();
+    // Now shutdown the region server and wait for it to go down.
+    this.cluster.stopRegionServer(0);
+    this.cluster.waitOnRegionServer(0);
+
+    // Verify that the client can find the data after the region has been moved
+    // to a different server
+
+    HScannerInterface scanner =
+      client.obtainScanner(HConstants.COLUMN_FAMILY_ARRAY, new Text());
+
+    try {
+      HStoreKey key = new HStoreKey();
+      TreeMap<Text, byte[]> results = new TreeMap<Text, byte[]>();
+      while (scanner.next(key, results)) {
+        assertTrue(key.getRow().equals(row));
+        assertEquals(1, results.size());
+        byte[] bytes = results.get(HConstants.COLUMN_FAMILY);
+        assertNotNull(bytes);
+        assertTrue(tableName.equals(new String(bytes, HConstants.UTF8_ENCODING)));
+      }
+      System.out.println("Success!");
+    } finally {
+      scanner.close();
+    }
+  }
 }
@@ -0,0 +1,100 @@
+/**
+ * Copyright 2007 The Apache Software Foundation
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase;
+
+import java.io.IOException;
+import java.util.TreeMap;
+
+import org.apache.hadoop.io.Text;
+import org.apache.log4j.Level;
+import org.apache.log4j.Logger;
+
+/**
+ * Tests region server failover when a region server exits.
+ */
+public class TestRegionServerAbort extends HBaseClusterTestCase {
+  private HClient client;
+
+  /** constructor */
+  public TestRegionServerAbort() {
+    super();
+    conf.setInt("ipc.client.timeout", 5000);          // reduce client timeout
+    conf.setInt("ipc.client.connect.max.retries", 5); // and number of retries
+    conf.setInt("hbase.client.retries.number", 2);    // reduce HBase retries
+//    Logger.getLogger(this.getClass().getPackage().getName()).setLevel(Level.DEBUG);
+  }
+
+  /**
+   * {@inheritDoc}
+   */
+  @Override
+  public void setUp() throws Exception {
+    super.setUp();
+    this.client = new HClient(conf);
+  }
+
+  /**
+   * The test
+   * @throws IOException
+   */
+  public void testRegionServerAbort() throws IOException {
+    // When the META table can be opened, the region servers are running
+    this.client.openTable(HConstants.META_TABLE_NAME);
+    // Put something into the meta table.
+    String tableName = getName();
+    HTableDescriptor desc = new HTableDescriptor(tableName);
+    desc.addFamily(new HColumnDescriptor(HConstants.COLUMN_FAMILY.toString()));
+    this.client.createTable(desc);
+    // put some values in the table
+    this.client.openTable(new Text(tableName));
+    Text row = new Text("row1");
+    long lockid = client.startUpdate(row);
+    client.put(lockid, HConstants.COLUMN_FAMILY,
+        tableName.getBytes(HConstants.UTF8_ENCODING));
+    client.commit(lockid);
+    // Start up a new region server to take over serving of root and meta
+    // after we shut down the current meta/root host.
+    this.cluster.startRegionServer();
+    // Now shutdown the region server and wait for it to go down.
+    this.cluster.abortRegionServer(0);
+    this.cluster.waitOnRegionServer(0);
+
+    // Verify that the client can find the data after the region has been moved
+    // to a different server
+
+    HScannerInterface scanner =
+      client.obtainScanner(HConstants.COLUMN_FAMILY_ARRAY, new Text());
+
+    try {
+      HStoreKey key = new HStoreKey();
+      TreeMap<Text, byte[]> results = new TreeMap<Text, byte[]>();
+      while (scanner.next(key, results)) {
+        assertTrue(key.getRow().equals(row));
+        assertEquals(1, results.size());
+        byte[] bytes = results.get(HConstants.COLUMN_FAMILY);
+        assertNotNull(bytes);
+        assertTrue(tableName.equals(new String(bytes, HConstants.UTF8_ENCODING)));
+      }
+      System.out.println("Success!");
+    } finally {
+      scanner.close();
+    }
+  }
+}