From c48a1061a77a3486ffcb21a6a8a2581b35736202 Mon Sep 17 00:00:00 2001 From: Michael Stack Date: Tue, 2 Nov 2010 04:57:25 +0000 Subject: [PATCH] HBASE-1932 Encourage use of 'lzo' compression... add the wiki page to getting started git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1029952 13f79535-47bb-0310-9956-ffa450edef68 --- CHANGES.txt | 2 ++ src/docbkx/book.xml | 81 ++++++++++++++++++++++++++++++++++++++------- 2 files changed, 71 insertions(+), 12 deletions(-) diff --git a/CHANGES.txt b/CHANGES.txt index a1ef0107e02..0019d6d7af3 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -640,6 +640,8 @@ Release 0.21.0 - Unreleased HBASE-3179 Enable ReplicationLogsCleaner only if replication is, and fix its test HBASE-3185 User-triggered compactions are triggering splits! + HBASE-1932 Encourage use of 'lzo' compression... add the wiki page to + getting started IMPROVEMENTS diff --git a/src/docbkx/book.xml b/src/docbkx/book.xml index 961a279333a..b1bc71e9a6d 100644 --- a/src/docbkx/book.xml +++ b/src/docbkx/book.xml @@ -230,6 +230,7 @@ stopping hbase............... different HBase run modes: standalone, what is described above in Quick Start, pseudo-distributed where all daemons run on a single server, and distributed. + Be sure to read the Important Configurations. @@ -242,7 +243,6 @@ stopping hbase............... <filename>hbase-site.xml</filename> and <filename>hbase-default.xml</filename> What are these? - Not all configuration options make it out to hbase-default.xml. Configuration @@ -250,37 +250,94 @@ stopping hbase............... in code; the only way to turn up the configurations is via a reading of the source code. - - +
<filename>hbase-env.sh</filename>
+
<filename>log4j.properties</filename>
-
- Noteworthy Configuration - Below we review a couple of the key configurations. - We'll list those you must to change to suit your context - and others that you should review and consider moving on - from defaults after guaging your deploys load and query profiles. + +
+ The Important Configurations + Below we list the important Configurations. We've divided this section into + required configuration and worth-a-look recommended configs. -
+ + +
Required Configurations + Here are some settings you must configure to suit + your deployment. +
+ <varname>ulimit</varname> + HBase is a database and keeps a lot of files open at the same time. + The default ulimit -n of 1024 on *nix systems is insufficient. + Any significant amount of loading will lead you to + FAQ: Why do I see "java.io.IOException...(Too many open files)" in my logs? + You will also notice errors like: + 2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException + 2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-6935524980745310745_1391901 + + Do yourself a favor and raise the upper bound on the number of file descriptors. + Set it to north of 10k. See the FAQ referenced above for how. + To be clear, raising the file descriptor limit for the user who is + running the HBase process is an operating system configuration, not an + HBase configuration. + 
+
+ <varname>dfs.datanode.max.xcievers</varname> + + Hadoop HDFS has an upper bound on the number of files that it will serve at any one time, + called xcievers (yes, this is misspelled). Again, before + doing any loading, make sure you have configured Hadoop's conf/hdfs-site.xml, + setting the xcievers value to at least the following: + + <property> + <name>dfs.datanode.max.xcievers</name> + <value>2047</value> + </property> + + +
+
+ +
Recommended Configurations
LZO compression You should consider enabling LZO compression. Its near-frictionless and in most all cases boosts performance. - To enable LZO, TODO... + + Unfortunately, HBase cannot ship with LZO because of + licensing issues; HBase is Apache-licensed, LZO is GPL. + Therefore the LZO install must be done after the HBase install. + See the Using LZO Compression + wiki page for how to make LZO work with HBase. + + A common problem users run into when using LZO is that while the initial + setup of the cluster runs smoothly, a month goes by and some sysadmin goes to + add a machine to the cluster, only they'll have forgotten to do the LZO + fixup on the new machine. In versions since HBase 0.90.0, we should + fail in a way that makes it plain what the problem is, but maybe not. + Remember you read this paragraph. See + hbase.regionserver.codec + for a feature to help protect against a failed LZO install.
+
+ +
@@ -1201,7 +1258,7 @@ stopping hbase............... xlink:href="http://en.wikipedia.org/wiki/Write-ahead_logging"> Write-Ahead Log - Each RegionServer adds updates to its WAL + Each RegionServer adds updates to its Write-ahead Log (WAL) first, and then to memory.
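Reviewer note, not part of the patch: the two required settings this patch documents (an open-file limit north of 10k, and dfs.datanode.max.xcievers of at least 2047 in hdfs-site.xml) are easy to check mechanically before any loading. Below is a minimal sketch of such a pre-flight check. The function names and the thresholds-as-constants are hypothetical and chosen for illustration; 10240 is one reading of "north of 10k". Nothing here is HBase or Hadoop API, and it assumes a *nix host, since Python's resource module is Unix-only.

```python
import resource
import xml.etree.ElementTree as ET

# Assumed thresholds, taken from the text added by the patch.
MIN_OPEN_FILES = 10240  # one reading of "north of 10k"
MIN_XCIEVERS = 2047     # value shown in the hdfs-site.xml snippet


def open_files_ok(soft_limit, minimum=MIN_OPEN_FILES):
    """Return True if the soft open-file limit meets the minimum."""
    # RLIM_INFINITY means "no limit", which satisfies any minimum.
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= minimum


def read_property(site_xml_text, name):
    """Return the <value> of the named <property> in a Hadoop-style
    site file, or None if the property is absent."""
    root = ET.fromstring(site_xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def xcievers_ok(site_xml_text, minimum=MIN_XCIEVERS):
    """Return True if dfs.datanode.max.xcievers is set to at least minimum.
    Raises ValueError if the configured value is not an integer."""
    value = read_property(site_xml_text, "dfs.datanode.max.xcievers")
    return value is not None and int(value) >= minimum


if __name__ == "__main__":
    # Checks the limit of the user running this script, which should be
    # the same user that runs the HBase process.
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("open-file limit ok:", open_files_ok(soft))
```

To check the HDFS side, read the contents of the cluster's hdfs-site.xml and pass the text to xcievers_ok. The file's location depends on the install, so the sketch deliberately does not hard-code a path.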