HBASE-3500 Documentation update for replication

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1067598 13f79535-47bb-0310-9956-ffa450edef68
Michael Stack 2011-02-06 05:46:39 +00:00
parent 1cd8c4ae72
commit 5f64c2d27b
3 changed files with 29 additions and 10 deletions


@ -45,6 +45,7 @@ Release 0.91.0 - Unreleased
HBASE-3502 Can't open region because can't open .regioninfo because
AlreadyBeingCreatedException
HBASE-3501 Remove the deletion limit in LogCleaner
HBASE-3500 Documentation update for replication
IMPROVEMENTS


@ -92,6 +92,7 @@ to another.
<name>hbase.replication</name>
<value>true</value>
&lt;/property&gt;</pre>
deploy the files, and then restart HBase if it was running.
</li>
<li>Run the following command in the master's shell while it's running
<pre>add_peer</pre>
@ -100,6 +101,18 @@ to another.
to use a different <b>zookeeper.znode.parent</b> since they can't
write in the same folder.
</li>
<li>
Once you have a peer, you need to enable replication on your column families.
One way to do it is to alter the table and to set the scope like this:
<pre>
disable 'your_table'
alter 'your_table', {NAME => 'family_name', REPLICATION_SCOPE => '1'}
enable 'your_table'
</pre>
Currently, a scope of 0 (the default) means that the column family won't be
replicated and a scope of 1 means that it will be. In the future, different
scopes can be used for routing policies.
</li>
</ol>
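The scope rule described in the last step can be sketched as a small, self-contained Java model. This is only an illustration under stated assumptions, not HBase's implementation: the class `ScopeFilterSketch`, its constants, and its methods are hypothetical names introduced here, and the only behaviour taken from the text above is that a scope of 0 (the default) is not replicated while a scope of 1 is.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: none of these names are HBase classes or APIs.
public class ScopeFilterSketch {
    // Hypothetical constant names; the values 0 and 1 come from the text above.
    public static final int SCOPE_LOCAL = 0;   // default: family is not replicated
    public static final int SCOPE_GLOBAL = 1;  // family is replicated to peers

    // Returns true when a column family with this scope would be replicated.
    public static boolean shouldReplicate(int scope) {
        return scope == SCOPE_GLOBAL;
    }

    // Keeps only the families whose scope marks them for replication,
    // preserving the iteration order of the input map.
    public static List<String> replicatedFamilies(Map<String, Integer> familyScopes) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : familyScopes.entrySet()) {
            if (shouldReplicate(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> scopes = new LinkedHashMap<>();
        scopes.put("family_name", SCOPE_GLOBAL);  // altered as in the shell example
        scopes.put("other_family", SCOPE_LOCAL);  // left at the default scope
        System.out.println(replicatedFamilies(scopes)); // prints [family_name]
    }
}
```

In this model only `family_name` is selected, which mirrors what the `alter` command above is meant to achieve for a single column family.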
You can confirm that your setup works by looking at any region server's log


@ -43,7 +43,10 @@
other well known solutions like MySQL master/slave replication where
there's only one bin log to keep track of. One master cluster can
replicate to any number of slave clusters, and each region server will
participate to replicate their own stream of edits.
participate to replicate their own stream of edits. For more information
on the different properties of master/slave replication and other types
of replication, please consult <a href="http://highscalability.com/blog/2009/8/24/how-google-serves-data-from-multiple-datacenters.html">
How Google Serves Data From Multiple Datacenters</a>.
</p>
<p>
The replication is done asynchronously, meaning that the clusters can
@ -73,6 +76,17 @@
</p>
<img src="images/replication_overview.png"/>
</section>
<section name="Enabling replication">
<p>
The guide on enabling and using cluster replication is contained
in the API documentation shipped with your HBase distribution.
</p>
<p>
The most up-to-date documentation is
<a href="apidocs/org/apache/hadoop/hbase/replication/package-summary.html#requirements">
available at this address</a>.
</p>
</section>
<section name="Life of a log edit">
<p>
The following sections describe the life of a single edit going from a
@ -350,15 +364,6 @@
</section>
</section>
<section name="FAQ">
<section name="Why do all clusters need to be in the same timezone?">
<p>
Suppose an edit to cell X happens in an EST cluster, then 2 minutes
later a new edit happens to the same cell in a PST cluster, and
both clusters are in master-master replication. The second edit is
considered younger, so the first will always hide it while in fact the
second is older.
</p>
</section>
<section name="GLOBAL means replicate? Any provision to replicate only to cluster X and not to cluster Y? or is that for later?">
<p>
Yes, this is for much later.