From e04919de093e44489d8afdd268e4d83bfb8c53f4 Mon Sep 17 00:00:00 2001
From: Michael Stack
Date: Tue, 17 Jan 2012 05:15:44 +0000
Subject: [PATCH] Update version on hadoop versions to include note on hadoop 1.0.0

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1232308 13f79535-47bb-0310-9956-ffa450edef68
---
 src/docbkx/configuration.xml | 53 +++++++++++++++++++++---------------
 1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/src/docbkx/configuration.xml b/src/docbkx/configuration.xml
index 13806440c2a..6ae0af40c4f 100644
--- a/src/docbkx/configuration.xml
+++ b/src/docbkx/configuration.xml
@@ -212,35 +212,44 @@ to ensure well-formedness of your document after an edit session.
 xlink:href="http://hadoop.apache.org">Hadoop Hadoop
+ Please read all of this section
+ Please read this section to the end. Up front we
+ wade through the weeds of Hadoop versions. Later we talk of what you must do in HBase
+ to make it work w/ a particular Hadoop version.
+
-
- On Hadoop Versions?
+
+ HBase will lose data unless it is running on an HDFS that has a durable
+ sync implementation. Hadoop 0.20.2, Hadoop 0.20.203.0, and Hadoop 0.20.204.0
+ DO NOT have this attribute.
+ Currently only Hadoop versions 0.20.205.x or any release in excess of this
+ version -- this includes hadoop 1.0.0 -- have a working, durable sync
+
+ On Hadoop Versions
 The Cloudera blog post An update on Apache Hadoop 1.0 by Charles Zedlweski has a nice exposition on how all the Hadoop versions relate. Its worth checking out if you are having trouble making sense of the Hadoop version morass.
-
-
- This version of HBase will only run on Hadoop
- 0.20.x. It will not run on hadoop 0.21.x (but may run on 0.22.x/0.23.x).
- HBase will lose data unless it is running on an HDFS that has a durable
- sync. Hadoop 0.20.2, Hadoop 0.20.203.0, and Hadoop 0.20.204.0
- DO NOT have this attribute.
- Currently only Hadoop versions 0.20.205.x or any release in excess of this
- version has a durable sync. You have to explicitly enable it though by
- setting dfs.support.append equal to true on both
- the client side -- in hbase-site.xml though it should
- be on in your base-default.xml file -- and on the
- serverside in hdfs-site.xml (You will have to restart
- your cluster after setting this configuration). Ignore the chicken-little
- comment you'll find in the hdfs-site.xml in the
- description for this configuration; it says it is not enabled because there
+ . Sync has to be explicitly enabled by setting
+ dfs.support.append equal
+ to true on both the client side -- in hbase-site.xml
+ -- and on the serverside in hdfs-site.xml (The sync
+ facility HBase needs is a subset of the append code path).
+
+ <property>
+ <name>dfs.support.append</name>
+ <value>true</value>
+ </property>
+
+ You will have to restart your cluster after making this edit. Ignore the chicken-little
+ comment you'll find in the hdfs-default.xml in the
+ description for the dfs.support.append configuration; it says it is not enabled because there
 are ... bugs in the 'append code' and is not supported in any production
- cluster. because it is not true (I'm sure there are bugs but the
- append code has been running in production at large scale deploys and is on
- by default in the offerings of hadoop by commercial vendors)
+ cluster.. This comment is stale, from another era, and while I'm sure there
+ are bugs, the sync/append code has been running
+ in production at large scale deploys and is on
+ by default in the offerings of hadoop by commercial vendors
 Until recently only the branch-0.20-append branch had a working sync but no official release was ever made from this branch.