HDFS-10847. Complete the document for FileDistribution processor in OfflineImageViewer. Contributed by Yiqun Lin.

This commit is contained in:
Akira Ajisaka 2016-09-09 01:34:34 +09:00
parent 2d1bf53c7e
commit 20a20c2f6e
2 changed files with 12 additions and 3 deletions

View File

@ -261,6 +261,9 @@ Usage: `hdfs oiv_legacy [OPTIONS] -i INPUT_FILE -o OUTPUT_FILE`
| COMMAND\_OPTION | Description |
|:---- |:---- |
| `-p`\|`--processor` *processor* | Specify the image processor to apply against the image file. Valid options are Ls (default), XML, Delimited, Indented, and FileDistribution. |
| `-maxSize` *size* | Specify the range [0, maxSize] of file sizes to be analyzed in bytes (128GB by default). This option is used with FileDistribution processor. |
| `-step` *size* | Specify the granularity of the distribution in bytes (2MB by default). This option is used with FileDistribution processor. |
| `-format` | Format the output result in a human-readable fashion rather than a number of bytes. (false by default). This option is used with FileDistribution processor. |
| `-skipBlocks` | Do not enumerate individual blocks within files. This may save processing time and outfile file space on namespaces with very large files. The Ls processor reads the blocks to correctly determine file sizes and ignores this option. |
| `-printToScreen` | Pipe output of processor to console as well as specified file. On extremely large namespaces, this may increase processing time by an order of magnitude. |
| `-delimiter` *arg* | When used in conjunction with the Delimited processor, replaces the default tab delimiter with the string specified by *arg*. |

View File

@ -50,10 +50,13 @@ The Offline Image Viewer provides several output processors:
..., s[n-1], maxSize], and the processor calculates how many files
in the system fall into each segment [s[i-1], s[i]). Note that
files larger than maxSize always fall into the very last segment.
The output file is formatted as a tab separated two column table:
Size and NumFiles. Where Size represents the start of the segment,
By default, the output file is formatted as a tab separated two column
table: Size and NumFiles. Where Size represents the start of the segment,
and numFiles is the number of files form the image which size falls
in this segment.
in this segment. By specifying the option -format, the output file will be
formatted in a human-readable fashion rather than a number of bytes that
showed in Size column. In addition, the Size column will be changed to the
Size Range column.
4. Delimited (experimental): Generate a text file with all of the elements
common to both inodes and inodes-under-construction, separated by a
@ -182,6 +185,9 @@ Due to the internal layout changes introduced by the ProtocolBuffer-based fsimag
| `-i`\|`--inputFile` *input file* | Specify the input fsimage file to process. Required. |
| `-o`\|`--outputFile` *output file* | Specify the output filename, if the specified output processor generates one. If the specified file already exists, it is silently overwritten. Required. |
| `-p`\|`--processor` *processor* | Specify the image processor to apply against the image file. Valid options are Ls (default), XML, Delimited, Indented, and FileDistribution. |
| `-maxSize` *size* | Specify the range [0, maxSize] of file sizes to be analyzed in bytes (128GB by default). This option is used with FileDistribution processor. |
| `-step` *size* | Specify the granularity of the distribution in bytes (2MB by default). This option is used with FileDistribution processor. |
| `-format` | Format the output result in a human-readable fashion rather than a number of bytes. (false by default). This option is used with FileDistribution processor. |
| `-skipBlocks` | Do not enumerate individual blocks within files. This may save processing time and outfile file space on namespaces with very large files. The Ls processor reads the blocks to correctly determine file sizes and ignores this option. |
| `-printToScreen` | Pipe output of processor to console as well as specified file. On extremely large namespaces, this may increase processing time by an order of magnitude. |
| `-delimiter` *arg* | When used in conjunction with the Delimited processor, replaces the default tab delimiter with the string specified by *arg*. |