From 2313e04223c3adf7620140fefdfb07efbc99191f Mon Sep 17 00:00:00 2001
From: Michael Stack
Date: Tue, 28 Jan 2014 23:09:51 +0000
Subject: [PATCH] Address outstanding disqus comments on the refguide, fix copyright and some wording

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1562307 13f79535-47bb-0310-9956-ffa450edef68
---
 src/main/docbkx/book.xml            |  7 ++++---
 src/main/docbkx/getting_started.xml | 17 +++++++++++------
 src/main/docbkx/performance.xml     |  4 ++--
 src/main/docbkx/preface.xml         |  3 ++-
 4 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/src/main/docbkx/book.xml b/src/main/docbkx/book.xml
index 33be7185096..24e22e3ace9 100644
--- a/src/main/docbkx/book.xml
+++ b/src/main/docbkx/book.xml
@@ -39,14 +39,14 @@ - 2013Apache Software Foundation. + 2014Apache Software Foundation. All Rights Reserved. Apache Hadoop, Hadoop, MapReduce, HDFS, Zookeeper, HBase, and the HBase project logo are trademarks of the Apache Software Foundation. This is the official reference guide of Apache HBase™, - a distributed, versioned, column-oriented database built on top of + a distributed, versioned, big data store built on top of Apache Hadoop™ and Apache ZooKeeper™.
@@ -1846,6 +1846,7 @@ rs.close(); Third replica is written on the same rack as the second, but on a different node chosen randomly Subsequent replicas are written on random nodes on the cluster +See Replica Placement: The First Baby Steps on this page: HDFS Architecture Thus, HBase eventually achieves locality for a region after a flush or a compaction.
@@ -1854,7 +1855,7 @@ rs.close(); in the region, or the table is compacted and StoreFiles are re-written, they will become "local" to the RegionServer. - For more information, see HDFS Design on Replica Placement + For more information, see Replica Placement: The First Baby Steps on this page: HDFS Architecture and also Lars George's blog on HBase and HDFS locality.
diff --git a/src/main/docbkx/getting_started.xml b/src/main/docbkx/getting_started.xml
index e3e471d38ee..cd47284c6e3 100644
--- a/src/main/docbkx/getting_started.xml
+++ b/src/main/docbkx/getting_started.xml
@@ -41,22 +41,27 @@ This guide describes setup of a standalone HBase instance. It will run against the local filesystem. In later sections we will take you through - how to run HBase on HDFS, a distributed filesystem. This section - leads you through creating a table, inserting - rows via the HBase shell, and then cleaning - up and shutting down your standalone, local filesystem HBase instance. The below exercise + how to run HBase on Apache Hadoop's HDFS, a distributed filesystem. This section + shows you how to create a table in HBase, inserting + rows into your new HBase table via the HBase shell, and then cleaning + up and shutting down your standalone, local filesystem-based HBase instance. The below exercise should take no more than ten minutes (not including download time). Local Filesystem and Durability Using HBase with a LocalFileSystem does not currently guarantee durability. + The HDFS local filesystem implementation will lose edits if files are not properly + closed -- which is very likely to happen when experimenting with a new download. You need to run HBase on HDFS to ensure all writes are preserved. Running against the local filesystem though will get you off the ground quickly and get you familiar with how the general system works so lets run with it for now. See and its associated issues for more details. Loopback IP - The below advice is for hbase-0.94.0 (and older) versions; we believe this fixed in hbase-0.96.0 and beyond (let us know if we have it wrong) -- there should be no need of modification to - /etc/hosts. + + The below advice is for hbase-0.94.x and older versions only. We believe this fixed in hbase-0.96.0 and beyond +(let us know if we have it wrong).
There should be no need of the below modification to /etc/hosts in +later versions of HBase. + HBase expects the loopback IP address to be 127.0.0.1. Ubuntu and some other distributions, for example, will default to 127.0.1.1 and this will cause problems for you See Why does HBase care about /etc/hosts? for detail..
diff --git a/src/main/docbkx/performance.xml b/src/main/docbkx/performance.xml
index 58656a71644..d4e09b49e4e 100644
--- a/src/main/docbkx/performance.xml
+++ b/src/main/docbkx/performance.xml
@@ -250,7 +250,7 @@ NONE for no bloom filters. If ROW, the hash of the row will be added to the bloom on each insert. If ROWCOL, the hash of the row + - column family + column family qualifier will be added to the bloom on + column family name + column family qualifier will be added to the bloom on each key insert. See HColumnDescriptor and for more information or this answer up in quora,
@@ -529,7 +529,7 @@ htable.close();
Bloom Filters Enabling Bloom Filters can save your having to go to disk and - can help improve read latencys. + can help improve read latencies. Bloom filters were developed over in HBase-1200 Add bloomfilters.
diff --git a/src/main/docbkx/preface.xml b/src/main/docbkx/preface.xml
index 5308037d172..7d05abed2ee 100644
--- a/src/main/docbkx/preface.xml
+++ b/src/main/docbkx/preface.xml
@@ -48,7 +48,7 @@ xlink:href="https://issues.apache.org/jira/browse/HBASE">JIRA. - Heads-up + Heads-up if this is your first foray into the world of distributed computing... If this is your first foray into the wonderful world of Distributed Computing, then you are in for
@@ -65,6 +65,7 @@ computing has been bound to a single box. Here is one good starting point: Fallacies of Distributed Computing. + That said, you are welcome. Its a fun place to be. Yours, the HBase Community.