HBASE-1932 Encourage use of 'lzo' compression... add the wiki page to getting started

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1029952 13f79535-47bb-0310-9956-ffa450edef68
Michael Stack 2010-11-02 04:57:25 +00:00
parent 1351769c16
commit c48a1061a7
2 changed files with 71 additions and 12 deletions


@ -640,6 +640,8 @@ Release 0.21.0 - Unreleased
HBASE-3179 Enable ReplicationLogsCleaner only if replication is, HBASE-3179 Enable ReplicationLogsCleaner only if replication is,
and fix its test and fix its test
HBASE-3185 User-triggered compactions are triggering splits! HBASE-3185 User-triggered compactions are triggering splits!
HBASE-1932 Encourage use of 'lzo' compression... add the wiki page to
getting started
IMPROVEMENTS IMPROVEMENTS


@ -230,6 +230,7 @@ stopping hbase...............</programlisting></para>
different HBase run modes: standalone, what is described above in <link different HBase run modes: standalone, what is described above in <link
linkend="quickstart">Quick Start,</link> pseudo-distributed where all linkend="quickstart">Quick Start,</link> pseudo-distributed where all
daemons run on a single server, and distributed.</para> daemons run on a single server, and distributed.</para>
<para>Be sure to read the <link linkend="important_configurations">Important Configurations</link>.</para>
</section> </section>
</chapter> </chapter>
@ -242,7 +243,6 @@ stopping hbase...............</programlisting></para>
<title><filename>hbase-site.xml</filename> and <filename>hbase-default.xml</filename></title> <title><filename>hbase-site.xml</filename> and <filename>hbase-default.xml</filename></title>
<para>What are these? <para>What are these?
</para> </para>
<para> <para>
Not all configuration options make it out to Not all configuration options make it out to
<filename>hbase-default.xml</filename>. Configuration <filename>hbase-default.xml</filename>. Configuration
@ -250,37 +250,94 @@ stopping hbase...............</programlisting></para>
in code; the only way to turn up the configurations is in code; the only way to turn up the configurations is
via a reading of the source code. via a reading of the source code.
</para> </para>
<!--The file hbase-default.xml is generated as part of <!--The file hbase-default.xml is generated as part of
the build of the hbase site. See the hbase pom.xml. the build of the hbase site. See the hbase pom.xml.
The generated file is a docbook section with a glossary The generated file is a docbook section with a glossary
in it--> in it-->
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
href="../../target/site/hbase-default.xml" /> href="../../target/site/hbase-default.xml" />
</section> </section>
<section> <section>
<title><filename>hbase-env.sh</filename></title> <title><filename>hbase-env.sh</filename></title>
<para></para> <para></para>
</section> </section>
<section> <section>
<title><filename>log4j.properties</filename></title> <title><filename>log4j.properties</filename></title>
<para></para> <para></para>
</section> </section>
<section>
<title>Noteworthy Configuration</title> <section xml:id="important_configurations">
<para>Below we review a couple of the key configurations. <title>The Important Configurations</title>
We'll list those you must to change to suit your context <para>Below we list the important Configurations. We've divided this section into
and others that you should review and consider moving on required configuration and worth-a-look recommended configs.
from defaults after guaging your deploys load and query profiles.
</para> </para>
<section>
<section xml:id="required_configuration"><title>Required Configurations</title>
<para>Here are some settings you must configure to suit
your deployment.
</para>
<section xml:id="ulimit">
<title><varname>ulimit</varname></title>
<para>HBase is a database and keeps a lot of files open at the same time.
The default ulimit -n of 1024 on *nix systems is insufficient.
Any significant amount of loading will lead you to
<link xlink:href="http://wiki.apache.org/hadoop/Hbase/FAQ#A6">FAQ: Why do I see "java.io.IOException...(Too many open files)" in my logs?</link>.
You will also notice errors like:
<programlisting>2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-6935524980745310745_1391901
</programlisting>
Do yourself a favor and change the upper bound on the number of file descriptors.
Set it to north of 10k. See the above referenced FAQ for how.</para>
<para>To be clear, upping the file descriptors for the user who is
running the HBase process is an operating system configuration, not an
HBase configuration.
</para>
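<para>As a rough sketch (not part of HBase), the shell snippet below reports whether
the current open-file limit clears the "north of 10k" bar described above. The helper
name <varname>check_nofile</varname> and the 10240 threshold are assumptions made for
illustration; run it as the user that starts the HBase process.</para>

```shell
# Hypothetical helper: compare a file-descriptor limit against a minimum.
# Prints "ok" or "too low"; a limit of "unlimited" always passes.
check_nofile() {
    limit="$1"
    minimum="$2"
    if [ "$limit" = "unlimited" ]; then
        echo "ok"
    elif [ "$limit" -ge "$minimum" ]; then
        echo "ok"
    else
        echo "too low"
    fi
}

# 10240 is an assumed stand-in for "north of 10k".
current="$(ulimit -n)"
echo "open-file limit: $current ($(check_nofile "$current" 10240))"
```

<para>If this prints "too low", raise the limit at the operating system level as the
FAQ referenced above describes, then log in again and re-run the check.</para>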
</section>
<section xml:id="dfs.datanode.max.xcievers">
<title><varname>dfs.datanode.max.xcievers</varname></title>
<para>
Hadoop HDFS has an upper bound on the number of files that it will serve at any one time,
called <varname>xcievers</varname> (yes, this is misspelled). Again, before
doing any loading, make sure you have configured Hadoop's <filename>conf/hdfs-site.xml</filename>,
setting the <varname>xcievers</varname> value to at least the following:
<programlisting>
&lt;property&gt;
&lt;name&gt;dfs.datanode.max.xcievers&lt;/name&gt;
&lt;value&gt;2047&lt;/value&gt;
&lt;/property&gt;
</programlisting>
</para>
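<para>One way to confirm the setting is in place before loading is a small shell check
such as the sketch below. The helper name <varname>get_property</varname> is hypothetical,
and the check assumes each name and value sits on its own line, as in the snippet above;
it is a line-based match, not a real XML parser. <varname>HADOOP_CONF_DIR</varname> is
assumed to point at Hadoop's configuration directory.</para>

```shell
# Hypothetical helper: print the <value> on the line after a given <name>
# in a Hadoop-style *-site.xml file. Line-based; not a real XML parser.
get_property() {
    file="$1"
    name="$2"
    grep -A 1 "<name>$name</name>" "$file" | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Assumed location of the HDFS configuration file.
conf_file="${HADOOP_CONF_DIR:-conf}/hdfs-site.xml"
if [ -f "$conf_file" ]; then
    value="$(get_property "$conf_file" dfs.datanode.max.xcievers)"
    if [ -z "$value" ]; then
        echo "dfs.datanode.max.xcievers is not set in $conf_file"
    elif [ "$value" -ge 2047 ]; then
        echo "dfs.datanode.max.xcievers=$value (ok)"
    else
        echo "dfs.datanode.max.xcievers=$value (raise to at least 2047)"
    fi
fi
```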
</section>
</section>
<section xml:id="recommended_configurations"><title>Recommended Configurations</title>
<section xml:id="lzo">
<title>LZO compression</title> <title>LZO compression</title>
<para>You should consider enabling LZO compression. It's <para>You should consider enabling LZO compression. It's
near-frictionless and in almost all cases boosts performance. near-frictionless and in almost all cases boosts performance.
To enable LZO, TODO... </para>
<para>Unfortunately, HBase cannot ship with LZO because of
licensing issues; HBase is Apache-licensed, LZO is GPL.
LZO must therefore be installed after HBase is installed.
See the <link xlink:href="http://wiki.apache.org/hadoop/UsingLzoCompression">Using LZO Compression</link>
wiki page for how to make LZO work with HBase.
</para>
<para>A common problem users run into when using LZO is that while initial
setup of the cluster runs smoothly, a month goes by and some sysadmin
adds a machine to the cluster but forgets to do the LZO
fixup on the new machine. In versions since HBase 0.90.0, we should
fail in a way that makes it plain what the problem is, but maybe not.
Remember you read this paragraph<footnote><para>See
<link linkend="hbase.regionserver.codec">hbase.regionserver.codec</link>
for a feature to help protect against failed LZO install</para></footnote>.
</para> </para>
</section> </section>
</section>
</section> </section>
</chapter> </chapter>
@ -1201,7 +1258,7 @@ stopping hbase...............</programlisting></para>
xlink:href="http://en.wikipedia.org/wiki/Write-ahead_logging"> Write-Ahead xlink:href="http://en.wikipedia.org/wiki/Write-ahead_logging"> Write-Ahead
Log</link></subtitle> Log</link></subtitle>
<para>Each RegionServer adds updates to its <link linkend="???">WAL</link> <para>Each RegionServer adds updates to its Write-ahead Log (WAL)
first, and then to memory.</para> first, and then to memory.</para>
<section> <section>