Note that bulk loading skirts replication

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1545777 13f79535-47bb-0310-9956-ffa450edef68
Michael Stack 2013-11-26 18:47:54 +00:00
parent 997d1bb727
commit a4b414b133
1 changed file with 10 additions and 1 deletion


@@ -2189,6 +2189,11 @@ All the settings that apply to normal compactions (file size limits, etc.) apply
simply using the HBase API.
</para>
</section>
<section xml:id="arch.bulk.load.limitations"><title>Bulk Load Limitations</title>
<para>As bulk loading bypasses the write path, the WAL doesn't get written to as part of the process.
Replication works by reading the WAL files, so it won't see the bulk-loaded data; the same goes for edits that use Put.setWriteToWAL(false).
One way to handle this is to ship the raw files or the HFiles to the other cluster and do the other processing there.</para>
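<para>As a minimal sketch of that workaround (the cluster addresses, paths, and table name below are examples only), the generated HFiles could be copied to the peer cluster with <code>distcp</code> and then loaded there with <code>completebulkload</code>:</para>
<programlisting>
# Copy the prepared HFiles from the source cluster to the peer cluster
# (hostnames and paths here are hypothetical).
$ hadoop distcp hdfs://source-cluster:8020/user/todd/bulkload-output \
    hdfs://peer-cluster:8020/user/todd/bulkload-output

# On the peer cluster, load the copied HFiles into the target table.
$ hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
    /user/todd/bulkload-output mytable
</programlisting>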
</section>
<section xml:id="arch.bulk.load.arch"><title>Bulk Load Architecture</title>
<para>
The HBase bulk load process consists of two main steps.
@@ -2270,6 +2275,10 @@ All the settings that apply to normal compactions (file size limits, etc.) apply
<section xml:id="arch.bulk.load.also"><title>See Also</title>
<para>For more information about the referenced utilities, see <xref linkend="importtsv"/> and <xref linkend="completebulkload"/>.
</para>
<para>
See <link xlink:href="http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/">How-to: Use HBase Bulk Loading, and Why</link>
for a recent blog post on the current state of bulk loading.
</para>
</section>
<section xml:id="arch.bulk.load.adv"><title>Advanced Usage</title>
<para>