<p><i>HDFS rolling upgrade</i> allows upgrading individual HDFS daemons. For example, the datanodes can be upgraded independently of the namenodes. A namenode can be upgraded independently of the other namenodes. The namenodes can be upgraded independently of datanodes and journal nodes.</p></section><section>
<h2><a name="Upgrade"></a>Upgrade</h2>
<p>In Hadoop v2, HDFS supports highly available (HA) namenode services and wire compatibility. These two capabilities make it feasible to upgrade HDFS without incurring HDFS downtime. To upgrade an HDFS cluster without downtime, the cluster must be set up with HA.</p>
<p>A feature that is newly enabled in the new software release may not work with the old software release after the upgrade. In such cases, perform the upgrade using the following steps.</p>
<ol style="list-style-type: decimal">
<li>Disable new feature.</li>
<li>Upgrade the cluster.</li>
<li>Enable the new feature.</li>
</ol>
<p>Note that rolling upgrade is supported only from Hadoop-2.4.0 onwards.</p><section>
<h3><a name="Upgrade_without_Downtime"></a>Upgrade without Downtime</h3>
<p>In an HA cluster, there are two or more <i>NameNodes (NNs)</i>, many <i>DataNodes (DNs)</i>, a few <i>JournalNodes (JNs)</i> and a few <i>ZooKeeperNodes (ZKNs)</i>. <i>JNs</i> are relatively stable and, in most cases, do not require an upgrade when upgrading HDFS. The rolling upgrade procedure described here covers only <i>NNs</i> and <i>DNs</i>, not <i>JNs</i> or <i>ZKNs</i>. Upgrading <i>JNs</i> and <i>ZKNs</i> may incur cluster downtime.</p><section>
<p>Suppose there are two namenodes <i>NN1</i> and <i>NN2</i>, where <i>NN1</i> and <i>NN2</i> are respectively in active and standby states. The following are the steps for upgrading an HA cluster:</p>
<ol style="list-style-type: decimal">
<li>Prepare Rolling Upgrade
<ol style="list-style-type: decimal">
<li>Run “<a href="#dfsadmin_-rollingUpgrade"><code>hdfs dfsadmin -rollingUpgrade prepare</code></a>” to create a fsimage for rollback.</li>
<li>Run “<a href="#dfsadmin_-rollingUpgrade"><code>hdfs dfsadmin -rollingUpgrade query</code></a>” to check the status of the rollback image. Wait and re-run the command until the “<code>Proceed with rolling upgrade</code>” message is shown.</li>
</ol>
</li>
<li>Upgrade Active and Standby <i>NNs</i>
<ol style="list-style-type: decimal">
<li>Shut down and upgrade <i>NN2</i>.</li>
<li>Start <i>NN2</i> as standby with the “<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade started</code></a>” option.</li>
<li>Fail over from <i>NN1</i> to <i>NN2</i> so that <i>NN2</i> becomes active and <i>NN1</i> becomes standby.</li>
<li>Shut down and upgrade <i>NN1</i>.</li>
<li>Start <i>NN1</i> as standby with the “<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade started</code></a>” option.</li>
</ol>
</li>
<li>Upgrade <i>DNs</i>
<ol style="list-style-type: decimal">
<li>Choose a small subset of datanodes (e.g. all datanodes under a particular rack).
<ol style="list-style-type: decimal">
<li>Run “<a href="#dfsadmin_-shutdownDatanode"><code>hdfs dfsadmin -shutdownDatanode &lt;DATANODE_HOST:IPC_PORT&gt; upgrade</code></a>” to shut down one of the chosen datanodes.</li>
<li>Run “<a href="#dfsadmin_-getDatanodeInfo"><code>hdfs dfsadmin -getDatanodeInfo &lt;DATANODE_HOST:IPC_PORT&gt;</code></a>” to check and wait for the datanode to shut down.</li>
<li>Upgrade and restart the datanode.</li>
<li>Perform the above steps for all the chosen datanodes in the subset in parallel.</li>
</ol>
</li>
<li>Repeat the above steps until all datanodes in the cluster are upgraded.</li>
</ol>
</li>
<li>Finalize Rolling Upgrade
<ol style="list-style-type: decimal">
<li>Run “<a href="#dfsadmin_-rollingUpgrade"><code>hdfs dfsadmin -rollingUpgrade finalize</code></a>” to finalize the rolling upgrade.</li>
</ol>
</li>
</ol>
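<p>Step 1 (Prepare Rolling Upgrade) can be scripted roughly as follows. This is a minimal sketch, not part of HDFS: the function name and the <code>POLL_SECONDS</code> and <code>MAX_POLLS</code> variables are illustrative assumptions, and the only HDFS behavior relied on is the two <code>dfsadmin -rollingUpgrade</code> commands and the “<code>Proceed with rolling upgrade</code>” message described above.</p>

```shell
#!/bin/sh
# Sketch of "Prepare Rolling Upgrade".
# POLL_SECONDS and MAX_POLLS are illustrative names, not HDFS settings.
POLL_SECONDS="${POLL_SECONDS:-30}"
MAX_POLLS="${MAX_POLLS:-120}"

prepare_rolling_upgrade() {
  # Ask the namenode to create the fsimage used for rollback.
  hdfs dfsadmin -rollingUpgrade prepare || return 1

  # Re-run the query until the namenode reports that it is safe to proceed.
  attempt=1
  while [ "$attempt" -le "$MAX_POLLS" ]; do
    if hdfs dfsadmin -rollingUpgrade query | grep -q "Proceed with rolling upgrade"; then
      echo "rollback image ready after $attempt query attempt(s)"
      return 0
    fi
    sleep "$POLL_SECONDS"
    attempt=$((attempt + 1))
  done

  echo "rollback image not ready after $MAX_POLLS attempts" >&2
  return 1
}
```

<p>Calling <code>prepare_rolling_upgrade</code> returns a non-zero status if the prepare command fails or the message never appears, so a wrapper script can stop before any daemon is shut down.</p>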
<p>In a federated cluster, there are multiple namespaces and a pair of active and standby <i>NNs</i> for each namespace. The procedure for upgrading a federated cluster is similar to upgrading a non-federated cluster except that Step 1 and Step 4 are performed on each namespace and Step 2 is performed on each pair of active and standby <i>NNs</i>, i.e.</p>
<ol style="list-style-type: decimal">
<li>Prepare Rolling Upgrade for Each Namespace</li>
<li>Upgrade Active and Standby <i>NN</i> pairs for Each Namespace</li>
<li>Upgrade <i>DNs</i></li>
<li>Finalize Rolling Upgrade for Each Namespace</li>
</ol></section></section><section>
<h3><a name="Upgrade_with_Downtime"></a>Upgrade with Downtime</h3>
<p>For non-HA clusters, it is impossible to upgrade HDFS without downtime since it requires restarting the namenodes. However, datanodes can still be upgraded in a rolling manner.</p><section>
<p>In a non-HA cluster, there is one <i>NameNode (NN)</i>, one <i>SecondaryNameNode (SNN)</i> and many <i>DataNodes (DNs)</i>. The procedure for upgrading a non-HA cluster is similar to upgrading an HA cluster except that Step 2 “Upgrade Active and Standby <i>NNs</i>” is replaced by the following:</p>
<ul>
<li>Upgrade <i>NN</i> and <i>SNN</i>
<ol style="list-style-type: decimal">
<li>Shut down <i>SNN</i>.</li>
<li>Shut down and upgrade <i>NN</i>.</li>
<li>Start <i>NN</i> with the “<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade started</code></a>” option.</li>
<li>Upgrade and restart <i>SNN</i>.</li>
</ol>
</li>
</ul></section></section></section><section>
<h2><a name="Downgrade_and_Rollback"></a>Downgrade and Rollback</h2>
<p>When the upgraded release is undesirable or, in some unlikely case, the upgrade fails (due to bugs in the newer release), administrators may choose to downgrade HDFS back to the pre-upgrade release, or roll back HDFS to the pre-upgrade release and the pre-upgrade state.</p>
<p>Note that downgrade can be done in a rolling fashion but rollback cannot. Rollback requires cluster downtime.</p>
<p>Note also that downgrade and rollback are possible only after a rolling upgrade is started and before the upgrade is terminated. An upgrade can be terminated by either finalize, downgrade or rollback. Therefore, it may not be possible to perform rollback after finalize or downgrade, or to perform downgrade after finalize.</p></section><section>
<h2><a name="Downgrade"></a>Downgrade</h2>
<p><i>Downgrade</i> restores the software back to the pre-upgrade release and preserves the user data. Suppose time <i>T</i> is the rolling upgrade start time and the upgrade is terminated by downgrade. Then, the files created before or after <i>T</i> remain available in HDFS. The files deleted before or after <i>T</i> remain deleted in HDFS.</p>
<p>A newer release is downgradable to the pre-upgrade release only if both the namenode layout version and the datanode layout version are not changed between these two releases.</p>
<p>In an HA cluster, when a rolling upgrade from an old software release to a new software release is in progress, it is possible to downgrade, in a rolling fashion, the upgraded machines back to the old software release. As before, suppose <i>NN1</i> and <i>NN2</i> are respectively in active and standby states. Below are the steps for rolling downgrade without downtime:</p>
<ol style="list-style-type: decimal">
<li>Downgrade <i>DNs</i>
<ol style="list-style-type: decimal">
<li>Choose a small subset of datanodes (e.g. all datanodes under a particular rack).
<ol style="list-style-type: decimal">
<li>Run “<a href="#dfsadmin_-shutdownDatanode"><code>hdfs dfsadmin -shutdownDatanode &lt;DATANODE_HOST:IPC_PORT&gt; upgrade</code></a>” to shut down one of the chosen datanodes.</li>
<li>Run “<a href="#dfsadmin_-getDatanodeInfo"><code>hdfs dfsadmin -getDatanodeInfo &lt;DATANODE_HOST:IPC_PORT&gt;</code></a>” to check and wait for the datanode to shut down.</li>
<li>Downgrade and restart the datanode.</li>
<li>Perform the above steps for all the chosen datanodes in the subset in parallel.</li>
</ol>
</li>
<li>Repeat the above steps until all upgraded datanodes in the cluster are downgraded.</li>
</ol>
</li>
<li>Downgrade Active and Standby <i>NNs</i>
<ol style="list-style-type: decimal">
<li>Shut down and downgrade <i>NN2</i>.</li>
<li>Start <i>NN2</i> as standby normally.</li>
<li>Fail over from <i>NN1</i> to <i>NN2</i> so that <i>NN2</i> becomes active and <i>NN1</i> becomes standby.</li>
<li>Shut down and downgrade <i>NN1</i>.</li>
<li>Start <i>NN1</i> as standby normally.</li>
</ol>
</li>
<li>Finalize Rolling Downgrade
<ol style="list-style-type: decimal">
<li>Run “<a href="#dfsadmin_-rollingUpgrade"><code>hdfs dfsadmin -rollingUpgrade finalize</code></a>” to finalize the rolling downgrade.</li>
</ol>
</li>
</ol>
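<p>The per-datanode part of step 1 above (submit the shutdown request, then wait for the datanode to stop) can be sketched as a small helper. The function name and the <code>DN_POLL_SECONDS</code> and <code>DN_MAX_POLLS</code> variables are illustrative, not HDFS settings. The sketch also assumes that <code>hdfs dfsadmin -getDatanodeInfo</code> exits with a non-zero status once the datanode can no longer be reached; verify this on your release before relying on it.</p>

```shell
#!/bin/sh
# Sketch: shut down one datanode and wait until it stops responding.
# DN_POLL_SECONDS and DN_MAX_POLLS are illustrative names, not HDFS settings.
DN_POLL_SECONDS="${DN_POLL_SECONDS:-5}"
DN_MAX_POLLS="${DN_MAX_POLLS:-60}"

# Usage: shutdown_datanode_and_wait DATANODE_HOST:IPC_PORT
shutdown_datanode_and_wait() {
  dn_addr="$1"

  # Submit the shutdown request; "upgrade" advises clients to wait for the restart.
  hdfs dfsadmin -shutdownDatanode "$dn_addr" upgrade || return 1

  # The request returns before the shutdown completes, so poll the datanode.
  # Assumption: -getDatanodeInfo fails (non-zero exit) when the datanode is down.
  polls=0
  while [ "$polls" -lt "$DN_MAX_POLLS" ]; do
    if ! hdfs dfsadmin -getDatanodeInfo "$dn_addr" >/dev/null 2>&1; then
      echo "datanode $dn_addr is down"
      return 0
    fi
    polls=$((polls + 1))
    sleep "$DN_POLL_SECONDS"
  done

  echo "datanode $dn_addr still running after $DN_MAX_POLLS checks" >&2
  return 1
}
```

<p>Once the helper returns successfully, the software on that datanode can be downgraded (or upgraded) and the datanode restarted. Running the helper for each datanode in the chosen subset in parallel matches the procedure above.</p>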
<p>Note that the datanodes must be downgraded before the namenodes because protocols may be changed in a backward-compatible manner but not in a forward-compatible one, i.e. old datanodes can talk to the new namenodes but not vice versa.</p></section><section>
<h2><a name="Rollback"></a>Rollback</h2>
<p><i>Rollback</i> restores the software back to the pre-upgrade release and also reverts the user data back to the pre-upgrade state. Suppose time <i>T</i> is the rolling upgrade start time and the upgrade is terminated by rollback. The files created before <i>T</i> remain available in HDFS but the files created after <i>T</i> become unavailable. The files deleted before <i>T</i> remain deleted in HDFS but the files deleted after <i>T</i> are restored.</p>
<p>Rollback from a newer release to the pre-upgrade release is always supported. However, it cannot be done in a rolling fashion. It requires cluster downtime. Suppose <i>NN1</i> and <i>NN2</i> are respectively in active and standby states. Below are the steps for rollback:</p>
<ul>
<li>Rollback HDFS
<ol style="list-style-type: decimal">
<li>Shut down all <i>NNs</i> and <i>DNs</i>.</li>
<li>Restore the pre-upgrade release on all machines.</li>
<li>Start <i>NN1</i> as Active with the “<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade rollback</code></a>” option.</li>
<li>Run “<code>-bootstrapStandby</code>” on <i>NN2</i> and start it normally as standby.</li>
<li>Start <i>DNs</i> with the “<code>-rollback</code>” option.</li>
</ol>
</li>
</ul></section><section>
<h2><a name="Commands_and_Startup_Options_for_Rolling_Upgrade"></a>Commands and Startup Options for Rolling Upgrade</h2><section>
<p>Get the information about the given datanode. This command can be used for checking if a datanode is alive like the Unix <code>ping</code> command.</p></section><section>
<p>Submit a shutdown request for the given datanode. If the optional <code>upgrade</code> argument is specified, clients accessing the datanode will be advised to wait for it to restart and the fast start-up mode will be enabled. When the restart does not happen in time, clients will time out and ignore the datanode. In that case, the fast start-up mode will also be disabled.</p>
<p>Note that the command does not wait for the datanode shutdown to complete. The “<a href="#dfsadmin_-getDatanodeInfo"><code>dfsadmin -getDatanodeInfo</code></a>” command can be used for checking if the datanode shutdown is completed.</p></section></section><section>
<p>When a rolling upgrade is in progress, the <code>-rollingUpgrade</code> namenode startup option is used to specify various rolling upgrade options.</p>
<ul>
<li>Options:
<table border="0" class="bodyTable">
<thead></thead><tbody>
<tr class="a">
<td><code>rollback</code></td>
<td> Restores the namenode back to the pre-upgrade release but also reverts the user data back to the pre-upgrade state. </td></tr>
<tr class="b">
<td><code>started</code></td>
<td> Specifies a rolling upgrade already started so that the namenode should allow image directories with different layout versions during startup. </td></tr>
</tbody>
</table>
</li>
</ul>
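<p>As a small illustration of the two options in the table, the helper below builds the namenode arguments and rejects anything other than <code>started</code> or <code>rollback</code>. The function name is hypothetical, and the sketch only prints the arguments. How the namenode daemon is actually launched (foreground command or a daemon wrapper script) depends on the Hadoop release and is not covered here.</p>

```shell
#!/bin/sh
# Sketch: build the namenode arguments for a rolling upgrade startup option.
# namenode_rolling_args is a hypothetical helper name, not an HDFS command.
# Usage: namenode_rolling_args started|rollback
namenode_rolling_args() {
  case "$1" in
    started|rollback)
      # Only the two options listed in the table are accepted.
      echo "namenode -rollingUpgrade $1"
      ;;
    *)
      echo "expected 'started' or 'rollback', got: $1" >&2
      return 1
      ;;
  esac
}
```

<p>For example, <code>hdfs $(namenode_rolling_args started)</code> expands to <code>hdfs namenode -rollingUpgrade started</code>, which runs the namenode in the foreground.</p>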
<p><b>WARN: the <code>downgrade</code> option is obsolete.</b> It is not necessary to start the namenode with the <code>downgrade</code> option explicitly.</p></section></section></section>