Note that bulk loading skirts replication

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1545777 13f79535-47bb-0310-9956-ffa450edef68
Michael Stack 2013-11-26 18:47:54 +00:00
parent 997d1bb727
commit a4b414b133
1 changed file with 10 additions and 1 deletion


@@ -2189,6 +2189,11 @@ All the settings that apply to normal compactions (file size limits, etc.) apply
simply using the HBase API.
</para>
</section>
<section xml:id="arch.bulk.load.limitations"><title>Bulk Load Limitations</title>
<para>As bulk loading bypasses the write path, the WAL doesn't get written to as part of the process.
Replication works by reading the WAL files, so it won't see the bulk-loaded data; the same goes for edits that use Put.setWriteToWAL(false).
One way to handle this is to ship the raw files or the HFiles to the other cluster and do the other processing there.</para>
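<para>As a minimal sketch of that workaround (the cluster addresses, paths, and table name below are examples only), the generated HFiles could be copied to the peer cluster with <code>distcp</code> and then loaded there with <code>completebulkload</code>:</para>
<programlisting>
# Copy the prepared HFiles from the source cluster to the peer cluster
# (hostnames and paths here are hypothetical).
$ hadoop distcp hdfs://source-cluster:8020/user/todd/bulkload-output \
    hdfs://peer-cluster:8020/user/todd/bulkload-output

# On the peer cluster, load the copied HFiles into the target table.
$ hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
    /user/todd/bulkload-output mytable
</programlisting>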
</section>
<section xml:id="arch.bulk.load.arch"><title>Bulk Load Architecture</title>
<para>
The HBase bulk load process consists of two main steps.
@@ -2270,6 +2275,10 @@ All the settings that apply to normal compactions (file size limits, etc.) apply
<section xml:id="arch.bulk.load.also"><title>See Also</title>
<para>For more information about the referenced utilities, see <xref linkend="importtsv"/> and <xref linkend="completebulkload"/>.
</para>
<para>
See <link xlink:href="http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/">How-to: Use HBase Bulk Loading, and Why</link>
for a recent blog post on the current state of bulk loading.
</para>
</section>
<section xml:id="arch.bulk.load.adv"><title>Advanced Usage</title>
<para>