HBASE-9406 Document 0.96 migration

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1519155 13f79535-47bb-0310-9956-ffa450edef68
Michael Stack 2013-08-31 04:51:34 +00:00
parent c41e90e54b
commit 0f37448507
1 changed file with 119 additions and 44 deletions


@@ -80,59 +80,134 @@
</para>
</section>
</section>
<section xml:id="upgrade0.96">
<title>Upgrading from 0.94.x to 0.96.x</title>
<subtitle>The Singularity</subtitle>
<para>You will have to stop your old 0.94.x cluster completely to upgrade. If you are replicating
between clusters, both clusters will have to go down to upgrade. Make sure it is a clean shutdown.
The fewer WAL files around, the faster the upgrade will run (the upgrade will split any log files it
finds in the filesystem as part of the upgrade process). All clients must be upgraded to 0.96 too.
</para>
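<para>For example, assuming the stock scripts and a standard layout, a clean shutdown of
the 0.94.x cluster is just the usual stop script, run from the 0.94 install:
<programlisting>$ ./bin/stop-hbase.sh</programlisting>
Wait for it to finish on all nodes before you begin the upgrade.
</para>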
<para>The API has changed. You will need to recompile your code against 0.96 and you may need to
adjust applications to go against new APIs (TODO: List of changes).
</para>
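<para>For example, if you build with Maven, recompiling against 0.96 is a matter of bumping
your HBase dependency and rebuilding; the coordinates below are illustrative (pick the
hadoop1 or hadoop2 flavored artifacts matching your HDFS):
<programlisting># after pointing your pom's HBase dependency at the 0.96 artifacts,
# e.g. org.apache.hbase:hbase-client version 0.96.0-hadoop1 (or 0.96.0-hadoop2)
$ mvn clean package</programlisting>
</para>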
<section>
<title>Executing the 0.96 Upgrade</title>
<note>
<para>HDFS and ZooKeeper should be up and running during the upgrade process.</para>
</note>
<para>hbase-0.96.0 comes with an upgrade script. Run
<programlisting>$ bin/hbase upgrade</programlisting> to see its usage.
The script has two main modes: -check and -execute.
</para>
<section><title>check</title>
<para>The <emphasis>check</emphasis> step is run against a running 0.94 cluster.
Run it from a downloaded 0.96.x binary. The <emphasis>check</emphasis> step
looks for the presence of <filename>HFileV1</filename> files. These are
unsupported in hbase-0.96.0. To purge them -- have them rewritten as HFileV2 --
you must run a compaction.
</para>
<para>The <emphasis>check</emphasis> step prints stats at the end of its run
(grep for “Result:” in the log), listing the absolute paths of the tables it scanned,
any HFileV1 files found, the regions containing said files (the regions we
need to major compact to purge the HFileV1s), and any corrupted files found.
A corrupt file is one that is unreadable, so its major version is undetermined
(neither HFileV1 nor HFileV2).
</para>
<para>To run the check step, run <programlisting>$ bin/hbase upgrade -check</programlisting>.
Here is sample output:
<computeroutput>
Tables Processed:
hdfs://localhost:41020/myHBase/.META.
hdfs://localhost:41020/myHBase/usertable
hdfs://localhost:41020/myHBase/TestTable
hdfs://localhost:41020/myHBase/t
Count of HFileV1: 2
HFileV1:
hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812/family/249450144068442524
hdfs://localhost:41020/myHBase/usertable/ecdd3eaee2d2fcf8184ac025555bb2af/family/249450144068442512
Count of corrupted files: 1
Corrupted Files:
hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812/family/1
Count of Regions with HFileV1: 2
Regions to Major Compact:
hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812
hdfs://localhost:41020/myHBase/usertable/ecdd3eaee2d2fcf8184ac025555bb2af
There are some HFileV1, or corrupt files (files with incorrect major version)
</computeroutput>
In the above sample output, there are two HFileV1 files in two regions, and one corrupt file.
Corrupt files should probably be removed. The regions that have HFileV1s need to be major
compacted. To major compact, start up the hbase shell and review how to compact an individual
region. After the major compaction is done, rerun the check step and the HFileV1s should be
gone, replaced by HFileV2 instances.
</para>
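<para>For example, the compactions could be kicked off from the shell along these lines.
The table name matches the sample output above; the region name is a hypothetical
placeholder, so substitute the region names your own check run reports:
<programlisting>$ bin/hbase shell
hbase> # major compact the whole table
hbase> major_compact 'usertable'
hbase> # or major compact an individual region by its full region name
hbase> major_compact 'usertable,,1369932387999.fa02dac1f38d03577bd0f7e666f12812.'</programlisting>
</para>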
<para>By default, the check step scans the hbase root directory (defined as hbase.rootdir in the configuration).
To scan a specific directory only, pass the <emphasis>-dir</emphasis> option.
<programlisting>$ bin/hbase upgrade -check -dir /myHBase/testTable</programlisting>
The above command would detect HFileV1s in the /myHBase/testTable directory.
</para>
<para>
Once the check step reports all the HFileV1 files have been rewritten, it is safe to proceed with the
upgrade.
</para>
</section>
<section><title>execute</title>
<para>After the check step shows the cluster is free of HFileV1, it is safe to proceed with the upgrade.
Next is the <emphasis>execute</emphasis> step. You must <emphasis>SHUT DOWN YOUR 0.94.x CLUSTER</emphasis>
before you can run the <emphasis>execute</emphasis> step. The execute step will not run if it
detects running HBase masters or regionservers.
<note>
<para>HDFS and ZooKeeper should be up and running during the upgrade process.
If zookeeper is managed by HBase, then you can start zookeeper so it is available to the upgrade
by running <programlisting>$ ./hbase/bin/hbase-daemon.sh start zookeeper</programlisting>
</para></note>
</para>
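<para>A quick way to confirm no HBase daemons are still up on a given node is jps; this is
a suggested sanity check, not part of the upgrade tooling:
<programlisting>$ jps | grep -E 'HMaster|HRegionServer'
# no output means no master or regionserver JVMs remain on this node</programlisting>
</para>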
<para>
The <emphasis>execute</emphasis> upgrade step is made of three substeps.
<itemizedlist>
<listitem> <para>Namespaces: HBase 0.96.0 has support for namespaces. The upgrade needs to reorder directories in the filesystem for namespaces to work (see the sketch after this list).</para> </listitem>
<listitem> <para>ZNodes: All znodes are purged so that new ones can be written in their place using a new protobuf'ed format, and a few are migrated in place: e.g. the replication and table state znodes.</para> </listitem>
<listitem> <para>WAL Log Splitting: If the 0.94.x cluster shutdown was not clean, we'll split WAL logs as part of migration before
we start up on 0.96.0. This WAL splitting runs slower than the native distributed WAL splitting because it all happens inside the
single upgrade process (so try to get a clean shutdown of the 0.94.x cluster if you can).
</para> </listitem>
</itemizedlist>
</para>
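<para>As a rough illustration of the Namespaces substep, a table directory moves from
directly under the hbase root into a per-namespace layout. The paths follow the sample
output below and the listings are abbreviated and illustrative only:
<programlisting># 0.94.x layout: table directories sit directly under the hbase root
$ hadoop fs -ls /myHBase
/myHBase/testTable
# 0.96 layout after the upgrade: tables live under their namespace
$ hadoop fs -ls /myHBase/.data/default
/myHBase/.data/default/testTable</programlisting>
</para>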
<para>
To run the <emphasis>execute</emphasis> step, first make sure you have copied the hbase-0.96.0
binaries everywhere, to the servers and to the clients. Make sure the 0.94.x cluster is down.
Then do as follows:
<programlisting>$ bin/hbase upgrade -execute</programlisting>
Here is some sample output:
<computeroutput>
Starting Namespace upgrade
Created version file at hdfs://localhost:41020/myHBase with version=7
Migrating table testTable to hdfs://localhost:41020/myHBase/.data/default/testTable
…..
Created version file at hdfs://localhost:41020/myHBase with version=8
Successfully completed NameSpace upgrade.
Starting Znode upgrade
….
Successfully completed Znode upgrade
Starting Log splitting
Successfully completed Log splitting
</computeroutput>
</para>
<para>
If the output from the execute step looks good, start hbase-0.96.0.
</para>
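<para>A minimal start-and-verify sequence might look like the following; the shell status
command simply confirms the master and regionservers have checked in:
<programlisting>$ ./bin/start-hbase.sh
$ ./bin/hbase shell
hbase> status</programlisting>
</para>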
</section>
</section>
</section>
<section xml:id="upgrade0.94">
<title>Upgrading from 0.92.x to 0.94.x</title>
<para>We used to think that 0.92 and 0.94 were interface compatible and that you can do a