diff --git a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
index 75e511a9d54..3719501c7ca 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
+++ b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
@@ -258,6 +258,8 @@ Release 2.6.0 - UNRELEASED
     HDFS-7093. Add config key to restrict setStoragePolicy. (Arpit Agarwal)
 
+    HDFS-6519. Document oiv_legacy command (Akira AJISAKA via aw)
+
   OPTIMIZATIONS
 
     HDFS-6690. Deduplicate xattr names in memory. (wang)
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml b/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
index b17a664ae7a..4c9a4bfafdf 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
@@ -2140,4 +2140,15 @@
+<property>
+  <name>dfs.namenode.legacy-oiv-image.dir</name>
+  <value></value>
+  <description>Determines where to save the namespace in the old fsimage format
+    during checkpointing by standby NameNode or SecondaryNameNode. Users can
+    dump the contents of the old format fsimage with the oiv_legacy command.
+    If the value is not specified, the old format fsimage is not saved during
+    checkpointing.
+  </description>
+</property>
+
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsImageViewer.apt.vm b/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsImageViewer.apt.vm
index 9a9946e7736..3b842265f7c 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsImageViewer.apt.vm
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsImageViewer.apt.vm
@@ -28,7 +28,7 @@ Offline Image Viewer Guide
   namespace. The tool is able to process very large image files relatively
   quickly. The tool handles the layout formats that were included with Hadoop
   versions 2.4 and up. If you want to handle older layout formats, you can
-  use the Offline Image Viewer of Hadoop 2.3.
+  use the Offline Image Viewer of Hadoop 2.3 or {{oiv_legacy Command}}.
   If the tool is not able to process an image file, it will exit cleanly.
   The Offline Image Viewer does not require a Hadoop cluster to be running; it
   is entirely offline in its operation.
@@ -188,3 +188,60 @@ Offline Image Viewer Guide
   about the hdfs namespace. This information can then be used to explore
   file system usage patterns or find specific files that match arbitrary
   criteria, along with other types of namespace analysis.
+
+* oiv_legacy Command
+
+  Due to the internal layout changes introduced by the ProtocolBuffer-based
+  fsimage ({{{https://issues.apache.org/jira/browse/HDFS-5698}HDFS-5698}}),
+  OfflineImageViewer consumes an excessive amount of memory and lacks some
+  functions, such as the Indented and Delimited processors. To process an
+  image with less memory or to use these processors, use the
+  <<<oiv_legacy>>> command (same as <<<oiv>>> in Hadoop 2.3).
+
+** Usage
+
+  1. Set <<<dfs.namenode.legacy-oiv-image.dir>>> to an appropriate directory
+     to make the standby NameNode or SecondaryNameNode save its namespace in
+     the old fsimage format during checkpointing.
+
+  2. Run the <<<oiv_legacy>>> command against the old format fsimage.
+
+----
+  bash$ bin/hdfs oiv_legacy -i fsimage_old -o output
+----
+
+** Options
+
+*-----------------------:-----------------------------------+
+| <<Flag>>              | <<Description>>                   |
+*-----------------------:-----------------------------------+
+| <<<-i>>>\|<<<--inputFile>>> | Specify the input fsimage file to
+|                       | process. Required.
+*-----------------------:-----------------------------------+
+| <<<-o>>>\|<<<--outputFile>>> | Specify the output filename, if
+|                       | the specified output processor generates one. If the
+|                       | specified file already exists, it is silently
+|                       | overwritten. Required.
+*-----------------------:-----------------------------------+
+| <<<-p>>>\|<<<--processor>>> | Specify the image processor to
+|                       | apply against the image file. Valid options are
+|                       | Ls (default), XML, Delimited, Indented, and
+|                       | FileDistribution.
+*-----------------------:-----------------------------------+
+| <<<-skipBlocks>>>     | Do not enumerate individual blocks within files. This
+|                       | may save processing time and output file space on
+|                       | namespaces with very large files. The Ls processor
+|                       | reads the blocks to correctly determine file sizes
+|                       | and ignores this option.
+*-----------------------:-----------------------------------+
+| <<<-printToScreen>>>  | Pipe output of processor to console as well as
+|                       | specified file. On extremely large namespaces, this
+|                       | may increase processing time by an order of
+|                       | magnitude.
+*-----------------------:-----------------------------------+
+| <<<-delimiter>>> <arg> | When used in conjunction with the Delimited
+|                       | processor, replaces the default tab delimiter with
+|                       | the string specified by <arg>.
+*-----------------------:-----------------------------------+
+| <<<-h>>>\|<<<--help>>>| Display the tool usage and help information and exit.
+*-----------------------:-----------------------------------+
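
Reviewer note: the Options table added by this patch can be exercised with a small wrapper. The following is a minimal Python sketch, not part of the patch, that only assembles the argument list for one `oiv_legacy` run and executes nothing. The helper name `build_oiv_legacy_command`, the `hdfs_bin` default of `bin/hdfs` (taken from the usage example), and the decision to reject `-delimiter` without the Delimited processor are assumptions made for illustration; the flags and processor names are the ones listed in the table.

```python
"""Hypothetical helper: assemble an `hdfs oiv_legacy` command line."""
import shlex

# Processor names as listed in the Options table of the patch.
VALID_PROCESSORS = ("Ls", "XML", "Delimited", "Indented", "FileDistribution")


def build_oiv_legacy_command(input_file, output_file, processor=None,
                             delimiter=None, skip_blocks=False,
                             print_to_screen=False, hdfs_bin="bin/hdfs"):
    """Return the argv list for one oiv_legacy run (nothing is executed)."""
    if processor is not None and processor not in VALID_PROCESSORS:
        raise ValueError(f"unknown processor: {processor!r}")
    # Assumption of this sketch: treat a stray -delimiter as a caller error,
    # since the table documents it only together with the Delimited processor.
    if delimiter is not None and processor != "Delimited":
        raise ValueError("-delimiter only applies to the Delimited processor")

    # -i and -o are both marked "Required" in the table.
    argv = [hdfs_bin, "oiv_legacy", "-i", input_file, "-o", output_file]
    if processor is not None:
        argv += ["-p", processor]
    if delimiter is not None:
        argv += ["-delimiter", delimiter]
    if skip_blocks:
        argv.append("-skipBlocks")
    if print_to_screen:
        argv.append("-printToScreen")
    return argv


if __name__ == "__main__":
    cmd = build_oiv_legacy_command("fsimage_old", "output",
                                   processor="Delimited", delimiter=",")
    # Print a shell-quoted form of the command for inspection.
    print(shlex.join(cmd))
```

Called with only an input and an output file, the helper yields the same arguments as the `bash$` example in the Usage section; omitting `-p` leaves the tool on its documented default processor, Ls.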