HBASE-21737 Fix typos in "Appendix A: HFile format" section in the doc

Signed-off-by: Sean Busbey <busbey@apache.org>
Sakthi 2019-01-17 16:57:31 -08:00 committed by Sean Busbey
parent 44dc872b7b
commit 3d23490e88
1 changed files with 16 additions and 16 deletions


@ -106,11 +106,11 @@ In the version 2 every block in the data section contains the following fields:
.. BLOOM_CHUNK Bloom filter chunks .. BLOOM_CHUNK Bloom filter chunks
.. META meta blocks (not used for Bloom filters in version 2 anymore) .. META meta blocks (not used for Bloom filters in version 2 anymore)
.. INTERMEDIATE_INDEX intermediate-level index blocks in a multi-level blockindex .. INTERMEDIATE_INDEX intermediate-level index blocks in a multi-level blockindex
.. ROOT_INDEX root>level index blocks in a multi>level block index .. ROOT_INDEX root-level index blocks in a multi-level block index
.. FILE_INFO the ``file info'' block, a small key>value map of metadata .. FILE_INFO the ''file info'' block, a small key-value map of metadata
.. BLOOM_META a Bloom filter metadata block in the load>on>open section .. BLOOM_META a Bloom filter metadata block in the load-on-open section
.. TRAILER a fixed>size file trailer. .. TRAILER a fixed-size file trailer.
As opposed to the above, this is not an HFile v2 block but a fixed>size (for each HFile version) data structure As opposed to the above, this is not an HFile v2 block but a fixed-size (for each HFile version) data structure
.. INDEX_V1 this block type is only used for legacy HFile v1 block .. INDEX_V1 this block type is only used for legacy HFile v1 block
. Compressed size of the block's data, not including the header (int). . Compressed size of the block's data, not including the header (int).
+ +
@ -127,7 +127,7 @@ The above format of blocks is used in the following HFile sections:
Scanned block section:: Scanned block section::
The section is named so because it contains all data blocks that need to be read when an HFile is scanned sequentially. The section is named so because it contains all data blocks that need to be read when an HFile is scanned sequentially.
Also contains leaf block index and Bloom chunk blocks. Also contains Leaf index blocks and Bloom chunk blocks.
Non-scanned block section:: Non-scanned block section::
This section still contains unified-format v2 blocks but it does not have to be read when doing a sequential scan. This section still contains unified-format v2 blocks but it does not have to be read when doing a sequential scan.
This section contains "meta" blocks and intermediate-level index blocks. This section contains "meta" blocks and intermediate-level index blocks.
@ -140,10 +140,10 @@ There are three types of block indexes in HFile version 2, stored in two differe
. Data index -- version 2 multi-level block index, consisting of: . Data index -- version 2 multi-level block index, consisting of:
.. Version 2 root index, stored in the data block index section of the file .. Version 2 root index, stored in the data block index section of the file
.. Optionally, version 2 intermediate levels, stored in the non%root format in the data index section of the file. Intermediate levels can only be present if leaf level blocks are present .. Optionally, version 2 intermediate levels, stored in the non-root format in the data index section of the file. Intermediate levels can only be present if leaf level blocks are present
.. Optionally, version 2 leaf levels, stored in the non%root format inline with data blocks .. Optionally, version 2 leaf levels, stored in the non-root format inline with data blocks
. Meta index -- version 2 root index format only, stored in the meta index section of the file . Meta index -- version 2 root index format only, stored in the meta index section of the file
. Bloom index -- version 2 root index format only, stored in the ``load-on-open'' section as part of Bloom filter metadata. . Bloom index -- version 2 root index format only, stored in the ''load-on-open'' section as part of Bloom filter metadata.
==== Root block index format in version 2 ==== Root block index format in version 2
@ -156,7 +156,7 @@ A version 2 root index block is a sequence of entries of the following format, s
. Offset (long) . Offset (long)
+ +
This offset may point to a data block or to a deeper>level index block. This offset may point to a data block or to a deeper-level index block.
. On-disk size (int) . On-disk size (int)
. Key (a serialized byte array stored using Bytes.writeByteArray) . Key (a serialized byte array stored using Bytes.writeByteArray)
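The root index entry layout listed in this hunk (offset, on-disk size, key) can be sketched as a small parser. This is a minimal illustration, not HBase's reader: `parse_root_index_entry` is a hypothetical name, the big-endian byte order is assumed from Java's `DataOutput` convention, and the key is assumed to carry a Hadoop-style variable-length integer length prefix. Only the single-byte form of that prefix (key lengths up to 127) is handled here.

```python
import struct

# Fixed part of an entry: 8-byte offset (long) + 4-byte on-disk size (int).
_FIXED = struct.Struct(">qi")


def parse_root_index_entry(buf, pos=0):
    """Parse one root index entry starting at `pos`.

    Returns ((offset, on_disk_size, key), next_pos).
    Assumption: the key length prefix fits in one byte (0..127).
    """
    offset, on_disk_size = _FIXED.unpack_from(buf, pos)
    pos += _FIXED.size

    first = buf[pos]
    if first > 127:
        # Multi-byte or negative length prefixes are outside this sketch.
        raise NotImplementedError("only single-byte key lengths are handled")
    pos += 1

    key = bytes(buf[pos:pos + first])
    return (offset, on_disk_size, key), pos + first
```

Calling the function repeatedly with the returned `next_pos` walks a sequence of entries. The number of entries has to come from elsewhere, as the surrounding text notes.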
@ -172,7 +172,7 @@ For the data index and the meta index the number of entries is stored in the tra
For a multi-level block index we also store the following fields in the root index block in the load-on-open section of the HFile, in addition to the data structure described above: For a multi-level block index we also store the following fields in the root index block in the load-on-open section of the HFile, in addition to the data structure described above:
. Middle leaf index block offset . Middle leaf index block offset
. Middle leaf block on-disk size (meaning the leaf index block containing the reference to the ``middle'' data block of the file) . Middle leaf block on-disk size (meaning the leaf index block containing the reference to the ''middle'' data block of the file)
. The index of the mid-key (defined below) in the middle leaf-level block. . The index of the mid-key (defined below) in the middle leaf-level block.
@ -200,9 +200,9 @@ Every non-root index block is structured as follows.
. Entries. . Entries.
Each entry contains: Each entry contains:
+ +
. Offset of the block referenced by this entry in the file (long) .. Offset of the block referenced by this entry in the file (long)
. On>disk size of the referenced block (int) .. On-disk size of the referenced block (int)
. Key. .. Key.
The length can be calculated from entryOffsets. The length can be calculated from entryOffsets.
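As a rough sketch of how the key length follows from `entryOffsets`: if the offsets are taken to be relative to the start of the entries area, with one extra trailing offset marking the end of the last entry (both assumptions here, as is the big-endian encoding), each key spans whatever remains of its entry after the 12 fixed bytes. `parse_non_root_entries` is a hypothetical helper, not an HBase API.

```python
import struct

# 8-byte block offset (long) + 4-byte on-disk size (int) precede each key.
ENTRY_FIXED = 12


def parse_non_root_entries(entries, entry_offsets):
    """Split the entries area of a non-root index block.

    `entry_offsets` is assumed to hold len(entries)+1 values: the start of
    each entry plus the end of the last one, relative to `entries`.
    """
    parsed = []
    for start, end in zip(entry_offsets, entry_offsets[1:]):
        block_offset, on_disk_size = struct.unpack_from(">qi", entries, start)
        # The key has no explicit length: it fills the rest of the entry.
        key = bytes(entries[start + ENTRY_FIXED:end])
        parsed.append((block_offset, on_disk_size, key))
    return parsed
```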
@ -214,7 +214,7 @@ In contrast with version 1, in a version 2 HFile Bloom filter metadata is stored
+ +
. Bloom filter version = 3 (int). There used to be a DynamicByteBloomFilter class that had the Bloom filter version number 2 . Bloom filter version = 3 (int). There used to be a DynamicByteBloomFilter class that had the Bloom filter version number 2
. The total byte size of all compound Bloom filter chunks (long) . The total byte size of all compound Bloom filter chunks (long)
. Number of hash functions (int . Number of hash functions (int)
. Type of hash functions (int) . Type of hash functions (int)
. The total key count inserted into the Bloom filter (long) . The total key count inserted into the Bloom filter (long)
. The maximum total number of keys in the Bloom filter (long) . The maximum total number of keys in the Bloom filter (long)
@ -246,7 +246,7 @@ This is because we need to know the comparator at the time of parsing the load-o
==== Fixed file trailer format differences between versions 1 and 2 ==== Fixed file trailer format differences between versions 1 and 2
The following table shows common and different fields between fixed file trailers in versions 1 and 2. The following table shows common and different fields between fixed file trailers in versions 1 and 2.
Note that the size of the trailer is different depending on the version, so it is ``fixed'' only within one version. Note that the size of the trailer is different depending on the version, so it is ''fixed'' only within one version.
However, the version is always stored as the last four-byte integer in the file. However, the version is always stored as the last four-byte integer in the file.
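Because the version sits in the last four bytes regardless of trailer size, a reader can detect it before knowing how large the trailer is. A minimal sketch, assuming a big-endian integer and treating the four bytes as a single plain version number (`read_trailer_version` is a hypothetical helper):

```python
import io
import struct


def read_trailer_version(f):
    """Return the integer stored in the last four bytes of a seekable file."""
    f.seek(-4, io.SEEK_END)
    (version,) = struct.unpack(">i", f.read(4))
    return version
```

With the version in hand, a reader would then pick the matching fixed trailer size and seek back by that amount to parse the remaining fields.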
.Differences between HFile Versions 1 and 2 .Differences between HFile Versions 1 and 2