Edits and additions to the pseudo-distributed section after trying it and finding the existing text lacking

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1521252 13f79535-47bb-0310-9956-ffa450edef68
2013-09-09 19:23:18 +00:00 · 2013-09-09 19:23:18 +00:00 · 65c68a146e
parent 87b4bfefa0
commit 65c68a146e
2 changed files with 59 additions and 39 deletions
--- a/src/main/docbkx/book.xml
+++ b/src/main/docbkx/book.xml
@@ -39,7 +39,7 @@
           </inlinemediaobject>
       </link>
    </subtitle>
-    <copyright><year>2012</year><holder>Apache Software Foundation.
+    <copyright><year>2013</year><holder>Apache Software Foundation.
        All Rights Reserved.  Apache Hadoop, Hadoop, MapReduce, HDFS, Zookeeper, HBase, and the HBase project logo are trademarks of the Apache Software Foundation.
        </holder>
    </copyright>
--- a/src/main/docbkx/configuration.xml
+++ b/src/main/docbkx/configuration.xml
@@ -351,12 +351,8 @@ to ensure well-formedness of your document after an edit session.
      <title>HBase run modes: Standalone and Distributed</title>
      <para>HBase has two run modes: <xref linkend="standalone" /> and <xref linkend="distributed" />. Out of the box, HBase runs in
-      standalone mode. To set up a distributed deploy, you will need to
+          standalone mode.  Whatever your mode, you will need to configure HBase by editing files in the HBase <filename>conf</filename>
-      configure HBase by editing files in the HBase <filename>conf</filename>
+      directory.  At a minimum, you must edit <code>conf/hbase-env.sh</code> to tell HBase which
      directory.</para>
      <para>Whatever your mode, you will need to edit
      <code>conf/hbase-env.sh</code> to tell HBase which
      <command>java</command> to use. In this file you set HBase environment
      variables such as the heapsize and other options for the
      <application>JVM</application>, the preferred location for log files,
@@ -386,11 +382,12 @@ to ensure well-formedness of your document after an edit session.
            comes from Hadoop.</para>
          </footnote>.</para>
-        <para>Distributed modes require an instance of the <emphasis>Hadoop
+          <para>Pseudo-distributed mode can run against the local filesystem or
-        Distributed File System</emphasis> (HDFS). See the Hadoop <link
+              it can run against an instance of the <emphasis>Hadoop
                  Distributed File System</emphasis> (HDFS). Fully-distributed mode can
              ONLY run on HDFS. See the Hadoop <link
        xlink:href="http://hadoop.apache.org/common/docs/r1.1.1/api/overview-summary.html#overview_description">
-        requirements and instructions</link> for how to set up a HDFS. Before
+        requirements and instructions</link> for how to set up HDFS.</para>
        proceeding, ensure you have an appropriate, working HDFS.</para>
        <para>Below we describe the different distributed setups. Starting,
        verification and exploration of your install, whether a
@@ -399,45 +396,65 @@ to ensure well-formedness of your document after an edit session.
        section that follows, <xref linkend="confirm" />. The same verification script applies to both
        deploy types.</para>
        <section xml:id="pseudo">
          <title>Pseudo-distributed</title>
-          <para>A pseudo-distributed mode is simply a distributed mode run on
+          <para>A pseudo-distributed mode is simply a fully-distributed mode run on
          a single host. Use this configuration for testing and prototyping on
          HBase. Do not use this configuration for production or for
          evaluating HBase performance.</para>
-	      <para>First, setup your HDFS in <link xlink:href="http://hadoop.apache.org/docs/r1.0.3/single_node_setup.html">pseudo-distributed mode</link>.
+      <para>First, if you want to run on HDFS rather than on the local filesystem,
-   	      </para>
+          set up your HDFS.  You can also set up HDFS in
-	      <para>Next, configure HBase.  Below is an example <filename>conf/hbase-site.xml</filename>.
+          <link xlink:href="http://hadoop.apache.org/docs/r1.0.3/single_node_setup.html">pseudo-distributed mode</link>.
-          This is the file into
+          Ensure you have a working HDFS before proceeding.
          which you add local customizations and overrides for
          <xref linkend="hbase_default_configurations" /> and <xref linkend="hdfs_client_conf" />.
              Note that the <varname>hbase.rootdir</varname> property points to the
              local HDFS instance.
   	      </para>
-          <para>Now skip to <xref linkend="confirm" /> for how to start and verify your
+          <para>Next, configure HBase.  Edit <filename>conf/hbase-site.xml</filename>.
-          pseudo-distributed install. <footnote>
+              This is the file into which you add local customizations and overrides.
-              <para>See <xref linkend="pseudo.extras">Pseudo-distributed
+          At a minimum, you must tell HBase to run in (pseudo-)distributed mode rather than
-              mode extras</xref> for notes on how to start extra Masters and
+          in default standalone mode.  To do this, set the <varname>hbase.cluster.distributed</varname>
-              RegionServers when running pseudo-distributed.</para>
+          property to <varname>true</varname> (its default is <varname>false</varname>).  The absolute bare-minimum
-            </footnote></para>
+          <filename>hbase-site.xml</filename> is therefore as follows:
 <programlisting>
 &lt;configuration&gt;
  &lt;property&gt;
    &lt;name&gt;hbase.cluster.distributed&lt;/name&gt;
    &lt;value&gt;true&lt;/value&gt;
  &lt;/property&gt;
 &lt;/configuration&gt;
 </programlisting>
 With this configuration, HBase starts an HBase Master process, a ZooKeeper server,
 and a RegionServer process, all running against the
 local filesystem. Data is written to a directory named
 <filename>hbase-YOUR_USER_NAME</filename> under wherever your operating system stores temporary files.</para>
 <para>Such a setup, using the local filesystem and
 writing to the operating system's temporary directory, is ephemeral. The Hadoop
 local filesystem, which is what HBase uses when it writes to the local filesystem, does not
 support <command>sync</command>, so unless the system is shut down properly, data will be lost.  Writing to
 the operating system's temporary directory can also cause data loss when the machine
 is restarted, as this directory is usually cleared on reboot.  For a more permanent
 setup, see the next example, which makes use of an instance of HDFS; HBase data will
 be written to the Hadoop distributed filesystem rather than to the local filesystem's
 tmp directory.</para>
 <para>In this <filename>conf/hbase-site.xml</filename> example, the
 <varname>hbase.rootdir</varname> property points to the local HDFS instance
 homed on the node <varname>h-24-30.example.com</varname>.
          <note>
              <title>Let HBase create <filename>${hbase.rootdir}</filename></title>
            <para>Let HBase create the <varname>hbase.rootdir</varname>
            directory. If you don't, you'll get a warning saying HBase needs a
            migration run because the directory is missing files expected by
            HBase (it'll create them if you let it).</para>
          </note>
  		  <section xml:id="pseudo.config">
  		  	<title>Pseudo-distributed Configuration File</title>
 			<para>Below is a sample pseudo-distributed file for the node <varname>h-24-30.example.com</varname>.
 <filename>hbase-site.xml</filename>
 <programlisting>
 &lt;configuration&gt;
  ...
  &lt;property&gt;
    &lt;name&gt;hbase.rootdir&lt;/name&gt;
    &lt;value&gt;hdfs://h-24-30.example.com:8020/hbase&lt;/value&gt;
@@ -446,16 +463,15 @@ to ensure well-formedness of your document after an edit session.
    &lt;name&gt;hbase.cluster.distributed&lt;/name&gt;
    &lt;value&gt;true&lt;/value&gt;
  &lt;/property&gt;
  &lt;property&gt;
    &lt;name&gt;hbase.zookeeper.quorum&lt;/name&gt;
    &lt;value&gt;h-24-30.example.com&lt;/value&gt;
  &lt;/property&gt;
  ...
 &lt;/configuration&gt;
 </programlisting>
 </para>
-
+          <para>Now skip to <xref linkend="confirm" /> for how to start and verify your
-  		  </section>
+          pseudo-distributed install. <footnote>
              <para>See <xref linkend="pseudo.extras">Pseudo-distributed
              mode extras</xref> for notes on how to start extra Masters and
              RegionServers when running pseudo-distributed.</para>
            </footnote></para>
 		  <section xml:id="pseudo.extras">
 		    <title>Pseudo-distributed Extras</title>
@@ -495,6 +511,10 @@ to ensure well-formedness of your document after an edit session.
        </section>
        <section xml:id="fully_dist">
          <title>Fully-distributed</title>