12f66e3060
When we read from HDFS, we overread to pick up the next block's header. Doing this saves a seek as we move through the hfile: we avoid an explicit seek just to read the block header every time we need to read the body. We used to read the next header into the current block's buffer. That buffer was then what got persisted to the blockcache, so we were over-persisting: our block plus the next block's header (33 bytes). This patch undoes the over-persisting.

Removes support for version 1 blocks (0.2 was added in hbase-0.92.0). Not needed any more.

There is an open question on whether checksums should be persisted when caching. The code seems to say no, but if the cache is SSD backed, or backed by anything that does not do error correction, we'll want checksums.

Adds loads of documentation.

M hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
  (write) Add writing from a ByteBuff.

M hbase-common/src/main/java/org/apache/hadoop/hbase/nio/ByteBuff.java
  (toString) Add one so ByteBuff looks like ByteBuffer when you click on it in an IDE.

M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
  Remove support for version 1 blocks.
  Cleaned up handling of metadata added when we serialize a block to caches. Metadata is smaller now.
  When we serialize (used when caching), do not persist the next block's header if present.
  Removed a bunch of methods, a few of which had overlapping functionality and others that exposed too much of our internals. Also removed a bunch of constructors and unified the remaining ones so they share a common init method.
  Shut down access to defines that should only be used internally here.
  Renamed everything to do with 'EXTRA' and 'extraSerialization' to instead talk about metadata saved to caches; it was unclear previously what EXTRA was about.
  Renamed static final declarations as all uppercase.
  (readBlockDataInternal): Redid. Couldn't make sense of it previously. Undid heavy-duty parse of header by constructing HFileBlock. Other cleanups. It's a third of the length it used to be. More to do in here.
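
The overread-then-trim idea in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the HFileBlock code: the class `OverreadSketch`, its methods, and the in-memory `byte[]` standing in for an HDFS file are all hypothetical. Only the 33-byte header size comes from the commit message.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical sketch; none of these names exist in HBase. */
class OverreadSketch {
    /** Block header size quoted in the commit message. */
    static final int HEADER_SIZE = 33;

    /** Outcome of one read: this block's bytes, plus the peeked next header. */
    static final class ReadResult {
        final ByteBuffer block;   // header + body of this block only
        final byte[] nextHeader;  // overread header of the following block, or null

        ReadResult(ByteBuffer block, byte[] nextHeader) {
            this.block = block;
            this.nextHeader = nextHeader;
        }
    }

    /**
     * Reads one block and, in the same read, tries to pick up the next
     * block's header so that no extra seek is needed later.
     * Assumes the whole block is present at the given offset.
     */
    static ReadResult readBlock(byte[] file, int offset, int onDiskSizeWithHeader) {
        int available = file.length - offset;
        int toRead = Math.min(onDiskSizeWithHeader + HEADER_SIZE, available);
        byte[] buf = Arrays.copyOfRange(file, offset, offset + toRead);

        // Limit the block's view to its own bytes; the overread tail is excluded.
        ByteBuffer block = ByteBuffer.wrap(buf, 0, onDiskSizeWithHeader).slice();

        // Keep the next header separately, and only if all of it was read.
        byte[] nextHeader = null;
        if (toRead == onDiskSizeWithHeader + HEADER_SIZE) {
            nextHeader = Arrays.copyOfRange(buf, onDiskSizeWithHeader, toRead);
        }
        return new ReadResult(block, nextHeader);
    }

    /** Bytes that would be handed to a cache: the block only, no next header. */
    static byte[] serializeForCache(ReadResult result) {
        ByteBuffer view = result.block.duplicate();
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        // Two back-to-back blocks: 33 + 10 bytes, then 33 + 5 bytes.
        byte[] file = new byte[(HEADER_SIZE + 10) + (HEADER_SIZE + 5)];
        ReadResult first = readBlock(file, 0, HEADER_SIZE + 10);
        System.out.println("cached bytes: " + serializeForCache(first).length);
        System.out.println("next header peeked: " + (first.nextHeader != null));
    }
}
```

Keeping the peeked header in a separate field rather than at the tail of the block's buffer is one way to make sure the cache never sees it. The real patch may organise this differently.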

bin
conf
dev-support
hbase-annotations
hbase-archetypes
hbase-assembly
hbase-checkstyle
hbase-client
hbase-common
hbase-examples
hbase-external-blockcache
hbase-hadoop-compat
hbase-hadoop2-compat
hbase-it
hbase-native-client
hbase-prefix-tree
hbase-procedure
hbase-protocol
hbase-resource-bundle
hbase-rest
hbase-rsgroup
hbase-server
hbase-shaded
hbase-shell
hbase-spark
hbase-testing-util
hbase-thrift
src/main
.arcconfig
.gitattributes
.gitignore
CHANGES.txt
LICENSE.txt
NOTICE.txt
README.txt
pom.xml

README.txt
Apache HBase [1] is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. [2] Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Apache Hadoop [3].

To get started using HBase, the full documentation for this release can be found under the doc/ directory that accompanies this README. Using a browser, open the docs/index.html to view the project home page (or browse to [1]). The hbase 'book' at http://hbase.apache.org/book.html has a 'quick start' section and is where you should begin your exploration of the hbase project.

The latest HBase can be downloaded from an Apache Mirror [4].

The source code can be found at [5].

The HBase issue tracker is at [6].

Apache HBase is made available under the Apache License, version 2.0 [7].

The HBase mailing lists and archives are listed here [8].

The HBase distribution includes cryptographic software. See the export control notice here [9].

1. http://hbase.apache.org
2. http://research.google.com/archive/bigtable.html
3. http://hadoop.apache.org
4. http://www.apache.org/dyn/closer.cgi/hbase/
5. https://hbase.apache.org/source-repository.html
6. https://hbase.apache.org/issue-tracking.html
7. http://hbase.apache.org/license.html
8. http://hbase.apache.org/mail-lists.html
9. https://hbase.apache.org/export_control.html