Add in Andrew Purtell's BigTop pointer
git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1400526 13f79535-47bb-0310-9956-ffa450edef68
parent a5bd102cd8
commit 77707f9b0a
@@ -30,10 +30,10 @@
 <para>This chapter is the Not-So-Quick start guide to HBase configuration. It goes
 over system requirements, Hadoop setup, the different HBase run modes, and the
 various configurations in HBase. Please read this chapter carefully. At a mimimum
 ensure that all <xref linkend="basic.prerequisites" /> have
 been satisfied. Failure to do so will cause you (and us) grief debugging strange errors
 and/or data loss.</para>

 <para>
 HBase uses the same configuration system as Hadoop.
 To configure a deploy, edit a file of environment variables
@@ -57,7 +57,7 @@ to ensure well-formedness of your document after an edit session.
 content of the <filename>conf</filename> directory to
 all nodes of the cluster. HBase will not do this for you.
 Use <command>rsync</command>.</para>

 <section xml:id="basic.prerequisites">
 <title>Basic Prerequisites</title>
 <para>This section lists required services and some required system configuration.
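An editorial aside, not part of the patch: the "Use rsync" advice in the hunk above can be sketched as a tiny dry-run script. The host names and target path are hypothetical placeholders.

```shell
#!/bin/sh
# Dry-run sketch: print the rsync command that would push the local
# HBase conf/ directory to each node. Replace hosts and path with your own.
sync_conf() {
  for host in "$@"; do
    echo rsync -az conf/ "$host:/usr/local/hbase/conf/"
  done
}
sync_conf rs1.example.org rs2.example.org
```

Drop the `echo` to actually copy; `-a` preserves permissions and timestamps, `-z` compresses over the wire.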
@@ -69,7 +69,7 @@ to ensure well-formedness of your document after an edit session.
 xlink:href="http://www.java.com/download/">Oracle</link>.</para>
 </section>
 <section xml:id="os">
 <title>Operating System</title>
 <section xml:id="ssh">
 <title>ssh</title>

@@ -151,9 +151,9 @@ to ensure well-formedness of your document after an edit session.
 2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-6935524980745310745_1391901
 </programlisting> Do yourself a favor and change the upper bound on the
 number of file descriptors. Set it to north of 10k. The math runs roughly as follows: per ColumnFamily
 there is at least one StoreFile and possibly up to 5 or 6 if the region is under load. Multiply the
 average number of StoreFiles per ColumnFamily times the number of regions per RegionServer. For example, assuming
 that a schema had 3 ColumnFamilies per region with an average of 3 StoreFiles per ColumnFamily,
 and there are 100 regions per RegionServer, the JVM will open 3 * 3 * 100 = 900 file descriptors
 (not counting open jar files, config files, etc.)
 </para>
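An aside on the descriptor arithmetic in the hunk above, using the example's own numbers (3 StoreFiles x 3 ColumnFamilies x 100 regions), just to make the estimate explicit:

```shell
#!/bin/sh
# File-descriptor estimate from the text: StoreFiles per ColumnFamily
# times ColumnFamilies per region times regions per RegionServer.
STOREFILES_PER_CF=3
CFS_PER_REGION=3
REGIONS_PER_RS=100
FDS=$((STOREFILES_PER_CF * CFS_PER_REGION * REGIONS_PER_RS))
echo "$FDS"   # 900, before jars, config files, and sockets
```

Compare the result against `ulimit -n`; the text suggests a limit north of 10k.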
@@ -216,13 +216,13 @@ to ensure well-formedness of your document after an edit session.
 xlink:href="http://cygwin.com/">Cygwin</link> to have a *nix-like
 environment for the shell scripts. The full details are explained in
 the <link xlink:href="http://hbase.apache.org/cygwin.html">Windows
 Installation</link> guide. Also
 <link xlink:href="http://search-hadoop.com/?q=hbase+windows&fc_project=HBase&fc_type=mail+_hash_+dev">search our user mailing list</link> to pick
 up latest fixes figured by Windows users.</para>
 </section>

 </section> <!-- OS -->

 <section xml:id="hadoop">
 <title><link
 xlink:href="http://hadoop.apache.org">Hadoop</link><indexterm>
@@ -289,7 +289,7 @@ to ensure well-formedness of your document after an edit session.
 <link xlink:href="http://www.cloudera.com/">Cloudera</link> or
 <link xlink:href="http://www.mapr.com/">MapR</link> distributions.
 Cloudera' <link xlink:href="http://archive.cloudera.com/docs/">CDH3</link>
 is Apache Hadoop 0.20.x plus patches including all of the
 <link xlink:href="http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-append/">branch-0.20-append</link>
 additions needed to add a durable sync. Use the released, most recent version of CDH3. In CDH, append
 support is enabled by default so you do not need to make the above mentioned edits to
@@ -311,6 +311,16 @@ to ensure well-formedness of your document after an edit session.
 replace the jar in HBase everywhere on your cluster. Hadoop version
 mismatch issues have various manifestations but often all looks like
 its hung up.</para>
+<note xml:id="bigtop"><title>Packaging and Apache BigTop</title>
+<para><link xlink:href="http://bigtop.apache.org">Apache Bigtop</link>
+is an umbrella for packaging and tests of the Apache Hadoop
+ecosystem, including Apache HBase. Bigtop performs testing at various
+levels (packaging, platform, runtime, upgrade, etc...), developed by a
+community, with a focus on the system as a whole, rather than individual
+projects. We recommend installing Apache HBase packages as provided by a
+Bigtop release rather than rolling your own piecemeal integration of
+various component releases.</para>
+</note>

 <section xml:id="hadoop.security">
 <title>HBase on Secure Hadoop</title>
@@ -320,7 +330,7 @@ to ensure well-formedness of your document after an edit session.
 with the secure version. If you want to read more about how to setup
 Secure HBase, see <xref linkend="hbase.secure.configuration" />.</para>
 </section>

 <section xml:id="dfs.datanode.max.xcievers">
 <title><varname>dfs.datanode.max.xcievers</varname><indexterm>
 <primary>xcievers</primary>
@@ -354,7 +364,7 @@ to ensure well-formedness of your document after an edit session.
 <para>See also <xref linkend="casestudies.xceivers"/>
 </para>
 </section>

 </section> <!-- hadoop -->
 </section>

@@ -418,7 +428,7 @@ to ensure well-formedness of your document after an edit session.
 HBase. Do not use this configuration for production nor for
 evaluating HBase performance.</para>

 <para>First, setup your HDFS in <link xlink:href="http://hadoop.apache.org/docs/r1.0.3/single_node_setup.html">pseudo-distributed mode</link>.
 </para>
 <para>Next, configure HBase. Below is an example <filename>conf/hbase-site.xml</filename>.
 This is the file into
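Not part of the patch, but as a hedged sketch of what a pseudo-distributed `conf/hbase-site.xml` like the one referenced above typically contains. The values are assumptions (a NameNode listening on localhost:9000); the property names are the standard HBase ones.

```xml
<configuration>
  <!-- Hypothetical value: point HBase at the pseudo-distributed HDFS -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <!-- Run the daemons as separate processes rather than one all-in-one JVM -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
```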
@@ -501,10 +511,10 @@ to ensure well-formedness of your document after an edit session.
 </programlisting>
 </para>
 </section>

 </section>

 </section>

 <section xml:id="fully_dist">
 <title>Fully-distributed</title>
@@ -600,7 +610,7 @@ to ensure well-formedness of your document after an edit session.
 <section xml:id="confirm">
 <title>Running and Confirming Your Installation</title>



 <para>Make sure HDFS is running first. Start and stop the Hadoop HDFS
 daemons by running <filename>bin/start-hdfs.sh</filename> over in the
@@ -610,31 +620,31 @@ to ensure well-formedness of your document after an edit session.
 not normally use the mapreduce daemons. These do not need to be
 started.</para>



 <para><emphasis>If</emphasis> you are managing your own ZooKeeper,
 start it and confirm its running else, HBase will start up ZooKeeper
 for you as part of its start process.</para>



 <para>Start HBase with the following command:</para>



 <programlisting>bin/start-hbase.sh</programlisting>

 Run the above from the

 <varname>HBASE_HOME</varname>

 directory.

 <para>You should now have a running HBase instance. HBase logs can be
 found in the <filename>logs</filename> subdirectory. Check them out
 especially if HBase had trouble starting.</para>



 <para>HBase also puts up a UI listing vital attributes. By default its
 deployed on the Master host at port 60010 (HBase RegionServers listen
@@ -644,13 +654,13 @@ to ensure well-formedness of your document after an edit session.
 Master's homepage you'd point your browser at
 <filename>http://master.example.org:60010</filename>.</para>



 <para>Once HBase has started, see the <xref linkend="shell_exercises" /> for how to
 create tables, add data, scan your insertions, and finally disable and
 drop your tables.</para>



 <para>To stop HBase after exiting the HBase shell enter
 <programlisting>$ ./bin/stop-hbase.sh
@@ -660,15 +670,15 @@ stopping hbase...............</programlisting> Shutdown can take a moment to
 until HBase has shut down completely before stopping the Hadoop
 daemons.</para>


 </section>
 </section> <!-- run modes -->



 <section xml:id="config.files">
 <title>Configuration Files</title>

 <section xml:id="hbase.site">
 <title><filename>hbase-site.xml</filename> and <filename>hbase-default.xml</filename></title>
 <para>Just as in Hadoop where you add site-specific HDFS configuration
@@ -744,11 +754,11 @@ stopping hbase...............</programlisting> Shutdown can take a moment to
 Minimally, a client of HBase needs several libraries in its <varname>CLASSPATH</varname> when connecting to a cluster, including:
 <programlisting>
 commons-configuration (commons-configuration-1.6.jar)
 commons-lang (commons-lang-2.5.jar)
 commons-logging (commons-logging-1.1.1.jar)
 hadoop-core (hadoop-core-1.0.0.jar)
 hbase (hbase-0.92.0.jar)
 log4j (log4j-1.2.16.jar)
 slf4j-api (slf4j-api-1.5.8.jar)
 slf4j-log4j (slf4j-log4j12-1.5.8.jar)
 zookeeper (zookeeper-3.4.2.jar)</programlisting>
@@ -769,7 +779,7 @@ zookeeper (zookeeper-3.4.2.jar)</programlisting>
 </configuration>
 ]]></programlisting>
 </para>

 <section xml:id="java.client.config">
 <title>Java client configuration</title>
 <para>The configuration used by a Java client is kept
@@ -778,15 +788,15 @@ zookeeper (zookeeper-3.4.2.jar)</programlisting>
 on invocation, will read in the content of the first <filename>hbase-site.xml</filename> found on
 the client's <varname>CLASSPATH</varname>, if one is present
 (Invocation will also factor in any <filename>hbase-default.xml</filename> found;
 an hbase-default.xml ships inside the <filename>hbase.X.X.X.jar</filename>).
 It is also possible to specify configuration directly without having to read from a
 <filename>hbase-site.xml</filename>. For example, to set the ZooKeeper
 ensemble for the cluster programmatically do as follows:
 <programlisting>Configuration config = HBaseConfiguration.create();
 config.set("hbase.zookeeper.quorum", "localhost"); // Here we are running zookeeper locally</programlisting>
 If multiple ZooKeeper instances make up your ZooKeeper ensemble,
 they may be specified in a comma-separated list (just as in the <filename>hbase-site.xml</filename> file).
 This populated <classname>Configuration</classname> instance can then be passed to an
 <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html">HTable</link>,
 and so on.
 </para>
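A hedged aside: the comma-separated ensemble mentioned above would look like this as the equivalent `hbase-site.xml` entry. The host names are hypothetical; the property name is the one the text itself uses.

```xml
<!-- Declarative form of the programmatic hbase.zookeeper.quorum setting -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.org,zk2.example.org,zk3.example.org</value>
</property>
```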
@@ -794,7 +804,7 @@ config.set("hbase.zookeeper.quorum", "localhost"); // Here we are running zooke
 </section>

 </section> <!-- config files -->

 <section xml:id="example_config">
 <title>Example Configurations</title>

@@ -886,7 +896,7 @@ config.set("hbase.zookeeper.quorum", "localhost"); // Here we are running zooke
 1G.</para>

 <programlisting>

 $ git diff hbase-env.sh
 diff --git a/conf/hbase-env.sh b/conf/hbase-env.sh
 index e70ebc6..96f8c27 100644
@@ -894,11 +904,11 @@ index e70ebc6..96f8c27 100644
 +++ b/conf/hbase-env.sh
 @@ -31,7 +31,7 @@ export JAVA_HOME=/usr/lib//jvm/java-6-sun/
 # export HBASE_CLASSPATH=

 # The maximum amount of heap to use, in MB. Default is 1000.
 -# export HBASE_HEAPSIZE=1000
 +export HBASE_HEAPSIZE=4096

 # Extra Java runtime options.
 # Below are what we set by default. May only work with SUN JVM.

@@ -910,8 +920,8 @@ index e70ebc6..96f8c27 100644
 </section>
 </section>
 </section> <!-- example config -->


 <section xml:id="important_configurations">
 <title>The Important Configurations</title>
 <para>Below we list what the <emphasis>important</emphasis>
@@ -935,7 +945,7 @@ index e70ebc6..96f8c27 100644
 configuration under control otherwise, a long garbage collection that lasts
 beyond the ZooKeeper session timeout will take out
 your RegionServer (You might be fine with this -- you probably want recovery to start
 on the server if a RegionServer has been in GC for a long period of time).</para>

 <para>To change this configuration, edit <filename>hbase-site.xml</filename>,
 copy the changed file around the cluster and restart.</para>
@@ -1011,7 +1021,7 @@ index e70ebc6..96f8c27 100644
 cluster (You can always later manually split the big Regions should one prove
 hot and you want to spread the request load over the cluster). A lower number of regions is
 preferred, generally in the range of 20 to low-hundreds
 per RegionServer. Adjust the regionsize as appropriate to achieve this number.
 </para>
 <para>For the 0.90.x codebase, the upper-bound of regionsize is about 4Gb, with a default of 256Mb.
 For 0.92.x codebase, due to the HFile v2 change much larger regionsizes can be supported (e.g., 20Gb).
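An illustrative back-of-the-envelope for the region-count guidance above. The data volume is an assumption for the sake of the example, not a figure from the text.

```shell
#!/bin/sh
# Pick a regionsize to hit a target region count per RegionServer.
DATA_PER_SERVER_GB=500   # assumed data served per RegionServer
TARGET_REGIONS=100       # within the suggested 20-to-low-hundreds range
REGION_GB=$((DATA_PER_SERVER_GB / TARGET_REGIONS))
echo "$REGION_GB"   # GB per region to configure as hbase.hregion.max.filesize
```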
@@ -1019,10 +1029,10 @@ index e70ebc6..96f8c27 100644
 <para>You may need to experiment with this setting based on your hardware configuration and application needs.
 </para>
 <para>Adjust <code>hbase.hregion.max.filesize</code> in your <filename>hbase-site.xml</filename>.
 RegionSize can also be set on a per-table basis via
 <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HTableDescriptor.html">HTableDescriptor</link>.
 </para>

 </section>
 <section xml:id="disable.splitting">
 <title>Managed Splitting</title>
@@ -1075,22 +1085,22 @@ of all regions.
 </para>
 </section>
 <section xml:id="managed.compactions"><title>Managed Compactions</title>
 <para>A common administrative technique is to manage major compactions manually, rather than letting
 HBase do it. By default, <varname>HConstants.MAJOR_COMPACTION_PERIOD</varname> is one day and major compactions
 may kick in when you least desire it - especially on a busy system. To turn off automatic major compactions set
 the value to <varname>0</varname>.
 </para>
 <para>It is important to stress that major compactions are absolutely necessary for StoreFile cleanup, the only variant is when
 they occur. They can be administered through the HBase shell, or via
 <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#majorCompact%28java.lang.String%29">HBaseAdmin</link>.
 </para>
 <para>For more information about compactions and the compaction file selection process, see <xref linkend="compaction"/></para>
 </section>

 <section xml:id="spec.ex"><title>Speculative Execution</title>
 <para>Speculative Execution of MapReduce tasks is on by default, and for HBase clusters it is generally advised to turn off
 Speculative Execution at a system-level unless you need it for a specific case, where it can be configured per-job.
 Set the properties <varname>mapred.map.tasks.speculative.execution</varname> and
 <varname>mapred.reduce.tasks.speculative.execution</varname> to false.
 </para>
 </section>
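For reference (an editorial aside, not part of the patch): the two properties named in the Speculative Execution paragraph above, set to `false` in `mapred-site.xml`, would read:

```xml
<!-- Disable MapReduce speculative execution cluster-wide -->
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <value>false</value>
</property>
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>false</value>
</property>
```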
@@ -1118,9 +1128,9 @@ of all regions.
 <link xlink:href="http://search-hadoop.com/m/pduLg2fydtE/Inconsistent+scan+performance+with+caching+set+&subj=Re+Inconsistent+scan+performance+with+caching+set+to+1">Inconsistent scan performance with caching set to 1</link>
 and the issue cited therein where setting notcpdelay improved scan speeds.</para>
 </section>

 </section>

 </section> <!-- important config -->

 </chapter>