HBASE-7264 Improve Snappy installation documentation

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1416675 13f79535-47bb-0310-9956-ffa450edef68
Michael Stack 2012-12-03 21:30:14 +00:00
parent 1826577ffc
commit 8c4893078c
2 changed files with 15 additions and 15 deletions

@@ -761,14 +761,14 @@ System.out.println("md5 digest as string length: " + sbDigest.length); // ret
... (note: the lead byte is listed to the right as a comment.) Given that the first split is a '0' and the last split is an 'f',
everything is great, right? Not so fast.
</para>
<para>The problem is that all the data is going to pile up in the first 2 regions and the last region, thus creating a "lumpy" (and
possibly "hot") region problem. To understand why, refer to an <link xlink:href="http://www.asciitable.com">ASCII Table</link>.
'0' is byte 48, and 'f' is byte 102, but there is a huge gap in byte values (bytes 58 to 96) that will <emphasis>never appear in this
keyspace</emphasis> because the only values are [0-9] and [a-f]. Thus, the middle regions will
never be used. To make pre-splitting work with this example keyspace, a custom definition of splits (i.e., not relying on the
built-in split method) is required.
</para>
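<para>As a sketch of what such a custom split definition can look like, using the HBase shell (the table and
column family names here are hypothetical, not from this commit): fifteen explicit split points spread the
[0-9] and [a-f] keyspace across sixteen regions, one per lead hex digit, so no region covers the dead byte ranges.
<programlisting>hbase> create 'mytable', 'cf', {SPLITS => ['1','2','3','4','5','6','7','8','9','a','b','c','d','e','f']}</programlisting>
</para>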
<para>Lesson #1: Pre-splitting tables is generally a best practice, but you need to pre-split them in such a way that all the
regions are accessible in the keyspace. While this example demonstrated the problem with a hex-key keyspace, the same problem can happen
with <emphasis>any</emphasis> keyspace. Know your data.
</para>
@@ -971,19 +971,19 @@ public static byte[][] getHexSplits(String startKey, String endKey, int numRegio
</para>
<para>Preference: Rows (generally speaking). To be clear, this guideline applies in the context of extremely wide cases, not in the
standard use-case where one needs to store a few dozen or a few hundred columns. But there is also a middle path between these two
options, and that is "Rows as Columns."
</para>
</section>
<section xml:id="schema.smackdown.rowsascols"><title>Rows as Columns</title>
<para>The middle path between Rows and Columns is packing data that would otherwise be separate rows into columns, for certain rows.
OpenTSDB is the best example of this case, where a single row represents a defined time-range and discrete events are treated as
columns. This approach is often more complex and may require re-writing your data, but has the
advantage of being I/O efficient. For an overview of this approach, see
<link xlink:href="http://www.cloudera.com/content/cloudera/en/resources/library/hbasecon/video-hbasecon-2012-lessons-learned-from-opentsdb.html">Lessons Learned from OpenTSDB</link>
from HBaseCon2012.
</para>
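<para>A minimal sketch of the pattern in the HBase shell (the table, column family, metric name, and timestamps
below are hypothetical and much simplified relative to OpenTSDB's actual schema): the row key combines a metric
with the start of its hour bucket, and each event in that hour becomes a column whose qualifier is the offset in seconds.
<programlisting>hbase> put 'tsdata', 'sys.cpu.user:1368550800', 't:0', '20'
hbase> put 'tsdata', 'sys.cpu.user:1368550800', 't:300', '25'
hbase> put 'tsdata', 'sys.cpu.user:1368550800', 't:600', '22'</programlisting>
</para>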
</section>
</section>
<section xml:id="schema.ops"><title>Operational and Performance Configuration Options</title>
<para>See the Performance section <xref linkend="perf.schema"/> for more information on operational and performance
@@ -3060,20 +3060,20 @@ hbase> describe 't1'</programlisting>
copied to the right place. Native libraries need to be installed under ${HBASE_HOME}/lib/native/Linux-amd64-64 or
${HBASE_HOME}/lib/native/Linux-i386-32, depending on your server architecture.
</para>
<para>
You will find the snappy library file under the .libs directory from your Snappy build (for example,
/home/hbase/snappy-1.0.5/.libs/). The file is called libsnappy.so.1.x.x, where 1.x.x is the version of the snappy
code you are building. You can either copy this file into your hbase directory under the name libsnappy.so, or simply
create a symbolic link to it.
</para>
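<para>A sketch of that copy-or-link step, assuming the example build path above and an amd64 server
(substitute your actual snappy version for the 1.x.x placeholder):
<programlisting>$ cd ${HBASE_HOME}/lib/native/Linux-amd64-64
$ ln -s /home/hbase/snappy-1.0.5/.libs/libsnappy.so.1.x.x libsnappy.so</programlisting>
</para>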
<para>
The second file you need is the hadoop native library. You will find this file in your hadoop installation directory
under lib/native/Linux-amd64-64/ or lib/native/Linux-i386-32/. The file you are looking for is libhadoop.so.1.x.x.
Again, you can simply copy this file or link to it under the name libhadoop.so.
</para>
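<para>And the equivalent sketch for the hadoop native library (here ${HADOOP_HOME} stands in for your hadoop
installation directory, and again 1.x.x is a placeholder for the actual version):
<programlisting>$ cd ${HBASE_HOME}/lib/native/Linux-amd64-64
$ ln -s ${HADOOP_HOME}/lib/native/Linux-amd64-64/libhadoop.so.1.x.x libhadoop.so</programlisting>
</para>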
<para>
At the end of the installation, you should have both libsnappy.so and libhadoop.so links or files present in
lib/native/Linux-amd64-64 or in lib/native/Linux-i386-32

@@ -157,7 +157,7 @@ mvn clean package -DskipTests
<section xml:id="build.snappy">
<title>Building in snappy compression support</title>
<para>Pass <code>-Dsnappy</code> to trigger the snappy maven profile for building
snappy native libs into hbase. See also <xref linkend="snappy.compression" /></para>
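<para>For example, building on the command shown earlier:
<programlisting>mvn clean package -DskipTests -Dsnappy</programlisting>
</para>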
</section>
<section xml:id="build.tgz">