hbase-4943. book updates (more FAQ, add to appendix for other resources)

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1210010 13f79535-47bb-0310-9956-ffa450edef68
This commit is contained in:
Doug Meil 2011-12-03 21:28:58 +00:00
parent 73b59715e9
commit ca79114d4c
3 changed files with 156 additions and 122 deletions

View File

@ -1938,119 +1938,11 @@ scan.setFilter(filter);
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ops_mgt.xml" />
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="developer.xml" />
<appendix xml:id="compression">
<title>Compression In HBase<indexterm><primary>Compression</primary></indexterm></title>
<section xml:id="compression.test">
<title>CompressionTest Tool</title>
<para>
HBase includes a tool to test that compression is set up properly.
To run it, type <code>./bin/hbase org.apache.hadoop.hbase.util.CompressionTest</code>.
This will emit usage information on how to run the tool.
</para>
</section>
<section xml:id="hbase.regionserver.codecs">
<title>
<varname>
hbase.regionserver.codecs
</varname>
</title>
<para>
To have a RegionServer test a set of codecs and fail to start if any
codec is missing or mis-installed, add the configuration
<varname>
hbase.regionserver.codecs
</varname>
to your <filename>hbase-site.xml</filename> with a value of the
codecs to test on startup. For example, if the
<varname>
hbase.regionserver.codecs
</varname> value is <code>lzo,gz</code> and if lzo is not present
or improperly installed, the misconfigured RegionServer will fail
to start.
</para>
<para>
Administrators might make use of this facility to guard against
the case where a new server is added to the cluster but the cluster
requires installation of a particular codec.
</para>
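As a sketch, the check described above might be enabled with an entry like the following in <filename>hbase-site.xml</filename>; the codec list shown is illustrative, and should name whichever codecs your cluster actually requires:

```xml
<property>
  <name>hbase.regionserver.codecs</name>
  <!-- Comma-separated list of codecs the RegionServer must be able to
       load at startup; if any fails to load, the RegionServer aborts. -->
  <value>lzo,gz</value>
</property>
```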
</section>
<section xml:id="lzo.compression">
<title>
LZO
</title>
<para>Unfortunately, HBase cannot ship with LZO because of
licensing issues; HBase is Apache-licensed, while LZO is GPL-licensed.
Therefore, LZO must be installed separately, after HBase is installed.
See the <link xlink:href="http://wiki.apache.org/hadoop/UsingLzoCompression">Using LZO Compression</link>
wiki page for how to make LZO work with HBase.
</para>
<para>A common problem users run into when using LZO is that while initial
setup of the cluster runs smoothly, a month later some sysadmin goes to
add a machine to the cluster, having forgotten to do the LZO
setup on the new machine. In versions of HBase since 0.90.0, the RegionServer should
fail in a way that makes it plain what the problem is, but this is not guaranteed.</para>
<para>See <xref linkend="hbase.regionserver.codecs" />
for a feature to help protect against failed LZO install.</para>
</section>
<section xml:id="gzip.compression">
<title>
GZIP
</title>
<para>
GZIP will generally compress better than LZO, though it is slower.
For some setups, better compression may be preferred.
Java will use Java's own GZIP unless the native Hadoop libraries are
available on the CLASSPATH, in which case it will use the native
compressors instead. (If the native libraries are NOT present,
you will see lots of <emphasis>Got brand-new compressor</emphasis>
reports in your logs; see <xref linkend="brand.new.compressor" />.)
</para>
</section>
<section xml:id="snappy.compression">
<title>
SNAPPY
</title>
<para>
If snappy is installed, HBase can make use of it (courtesy of
<link xlink:href="http://code.google.com/p/hadoop-snappy/">hadoop-snappy</link>
<footnote><para>See <link xlink:href="http://search-hadoop.com/m/Ds8d51c263B1/%2522Hadoop-Snappy+in+synch+with+Hadoop+trunk%2522&amp;subj=Hadoop+Snappy+in+synch+with+Hadoop+trunk">Alejandro's note</link> up on the list on the difference between Snappy in Hadoop
and Snappy in HBase</para></footnote>).
<orderedlist>
<listitem>
<para>
Build and install <link xlink:href="http://code.google.com/p/snappy/">snappy</link> on all nodes
of your cluster.
</para>
</listitem>
<listitem>
<para>
Use CompressionTest to verify snappy support is enabled and the libs can be loaded ON ALL NODES of your cluster:
<programlisting>$ hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://host/path/to/hbase snappy</programlisting>
</para>
</listitem>
<listitem>
<para>
Create a column family with snappy compression and verify it in the hbase shell:
<programlisting>$ hbase> create 't1', { NAME => 'cf1', COMPRESSION => 'SNAPPY' }
hbase> describe 't1'</programlisting>
In the output of the "describe" command, ensure that it lists <code>COMPRESSION => 'SNAPPY'</code>.
</para>
</listitem>
</orderedlist>
</para>
</section>
</appendix>
<appendix xml:id="faq">
<title>FAQ</title>
<qandaset defaultlabel='faq'>
<qandadiv><title>General</title>
@ -2223,21 +2115,131 @@ hbase> describe 't1'</programlisting>
</answer>
</qandaentry>
</qandadiv>
<qandadiv><title>HBase in Action</title>
<qandaentry>
<question><para>Where can I find interesting videos and presentations on HBase?</para></question>
<answer>
<para>
See <xref linkend="other.info" />.
</para>
</answer>
</qandaentry>
</qandadiv>
</qandaset>
</appendix>
<appendix>
<title xml:id="ycsb"><link xlink:href="https://github.com/brianfrankcooper/YCSB/">YCSB: The Yahoo! Cloud Serving Benchmark</link> and HBase</title>
<para>TODO: Describe how YCSB is poor for putting up a decent cluster load.</para>
@ -2246,7 +2248,6 @@ When I build, why do I always get <code>Unable to find resource 'VM_global_libra
</appendix>
<appendix xml:id="hfilev2">
<title>HFile format version 2</title>
@ -2710,9 +2711,34 @@ Comparator class used for Bloom filter keys, a UTF>8 encoded string stored usi
</informaltable>
<para/></section></section></appendix>
<appendix xml:id="other.info">
<title>Other Information about HBase</title>
<section xml:id="other.info.videos"><title>HBase Videos</title>
<para><link xlink:href="http://www.cloudera.com/videos/intorduction-hbase-todd-lipcon">Introduction to HBase</link> by Todd Lipcon.
</para>
<para><link xlink:href="http://www.cloudera.com/videos/hadoop-world-2011-presentation-video-building-realtime-big-data-services-at-facebook-with-hadoop-and-hbase">Building Real Time Services at Facebook with HBase</link> by Jonathan Gray.
</para>
</section>
<section xml:id="other.info.sites"><title>Sites</title>
<para><link xlink:href="http://www.cloudera.com/blog/category/hbase/">Cloudera's HBase Blog</link> has a lot of links to useful HBase information.
<itemizedlist>
<listitem><para><link xlink:href="http://www.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/">CAP Confusion</link> is a relevant entry for background information on
distributed storage systems.
</para></listitem>
</itemizedlist>
</para>
<para><link xlink:href="http://wiki.apache.org/hadoop/HBase/HBasePresentations">HBase Wiki</link> has a page with a number of presentations.
</para>
</section>
<section xml:id="other.info.books"><title>Books</title>
<para><link xlink:href="http://shop.oreilly.com/product/0636920014348.do">HBase: The Definitive Guide</link> by Lars George.
</para>
</section>
</appendix>
<appendix xml:id="asf" ><title>HBase and the Apache Software Foundation</title>
<para>HBase is a project in the Apache Software Foundation and as such there are responsibilities to the ASF to ensure
a healthy project.</para>
<section xml:id="asf.devprocess"><title>ASF Development Process</title>
<para>See the <link xlink:href="http://www.apache.org/dev/#committers">Apache Development Process page</link>
for all sorts of information on how the ASF is structured (e.g., PMC, committers, contributors), to tips on contributing
@ -2724,7 +2750,6 @@ Comparator class used for Bloom filter keys, a UTF>8 encoded string stored usi
lead and the committers. See <link xlink:href="http://www.apache.org/foundation/board/reporting">ASF board reporting</link> for more information.
</para>
</section>
</appendix>
<index xml:id="book_index">

View File

@ -160,6 +160,11 @@ Access restriction: The method getLong(Object, long) from the type Unsafe is not
[INFO] -----------------------------------------------------------------------</programlisting>
</para>
</section>
<section xml:id="build.gotchas"><title>Build Gotchas</title>
<para>If you see <code>Unable to find resource 'VM_global_library.vm'</code>, ignore it.
It's not an error; it is <link xlink:href="http://jira.codehaus.org/browse/MSITE-286">officially ugly</link>, though.
</para>
</para>
</section>
</section> <!-- build -->
<section xml:id="maven.build.commands">

View File

@ -211,6 +211,10 @@ export HBASE_OPTS="-XX:NewSize=64m -XX:MaxNewSize=64m &lt;cms options from above
<link xlink:href="http://search-hadoop.com">search-hadoop.com</link> indexes all the mailing lists and is great for historical searches.
</para>
</section>
<section xml:id="trouble.resources.irc">
<title>IRC</title>
<para>#hbase on irc.freenode.net</para>
</section>
<section xml:id="trouble.resources.jira">
<title>JIRA</title>
<para>