diff --git a/pom.xml b/pom.xml index cc70e81e130..d4ea7470a81 100644 --- a/pom.xml +++ b/pom.xml @@ -42,7 +42,7 @@ 0.95-SNAPSHOT HBase - HBase is the &lt;a href="http://hadoop.apache.org"&rt;Hadoop</a&rt; database. Use it when you need + Apache HBase™ is the &lt;a href="http://hadoop.apache.org"&rt;Hadoop</a&rt; database. Use it when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. @@ -396,7 +396,7 @@ profile on the command line as follows: $ ~/bin/mvn/bin/mvn -Papache-release release:perform - + I've also been kiling the release:prepare step mid-way to check the release.properties it generates at the top-level. Sometimes it refers to HEAD rather than to the svn branch. @@ -408,7 +408,7 @@ --> apache-release -Dmaven.test.skip.exec @@ -429,8 +429,8 @@ maven-surefire-plugin ${surefire.version} - org.apache.maven.surefire @@ -469,7 +469,7 @@ false ${surefire.secondPartThreadCount} classes - ${surefire.secondPartGroups} @@ -558,11 +558,11 @@ maven-eclipse-plugin 2.8 - - org.eclipse.m2e @@ -1336,7 +1336,7 @@ - hadoop-2.0 @@ -1364,7 +1364,7 @@ hadoop-annotations ${hadoop-two.version} - org.apache.hadoop diff --git a/src/docbkx/book.xml b/src/docbkx/book.xml index e87bb5cebf8..b2f5cc65e14 100644 --- a/src/docbkx/book.xml +++ b/src/docbkx/book.xml @@ -29,7 +29,7 @@ <link xlink:href="http://www.hbase.org"> - Apache HBase Reference Guide + The Apache HBase™ Reference Guide </link> @@ -39,10 +39,13 @@ - 2012Apache Software Foundation + 2012Apache Software Foundation. + All Rights Reserved. Apache Hadoop, Hadoop, MapReduce, HDFS, Zookeeper, HBase, and the HBase project logo are trademarks of the Apache Software Foundation. + + This is the official reference guide of - Apache HBase, + Apache HBase (TM), a distributed, versioned, column-oriented database built on top of Apache Hadoop and Apache ZooKeeper. 
@@ -175,7 +178,7 @@ from time stamp t9, the value of anchor:my.look.ca from time stamp t8. - For more information about the internals of how HBase stores data, see . + For more information about the internals of how Apache HBase stores data, see . @@ -197,7 +200,7 @@
Column Family<indexterm><primary>Column Family</primary></indexterm> - Columns in HBase are grouped into column families. + Columns in Apache HBase are grouped into column families. All column members of a column family have the same prefix. For example, the columns courses:history and courses:math are both members of the @@ -560,7 +563,7 @@ htable.put(put); HBase and Schema Design A good general introduction on the strength and weaknesses modelling on - the various non-rdbms datastores is Ian Varleys' Master thesis, + the various non-rdbms datastores is Ian Varley's Master thesis, No Relation: The Mixed Blessings of Non-Relational Databases. Recommended. Also, read for how HBase stores data internally. diff --git a/src/docbkx/case_studies.xml b/src/docbkx/case_studies.xml index a10f53c0016..2e3bba0432f 100644 --- a/src/docbkx/case_studies.xml +++ b/src/docbkx/case_studies.xml @@ -26,11 +26,11 @@ * limitations under the License. */ --> - Case Studies + Apache HBase (TM) Case Studies
Overview This chapter will describe a variety of performance and troubleshooting case studies that can - provide a useful blueprint on diagnosing cluster issues. + provide a useful blueprint on diagnosing Apache HBase (TM) cluster issues. For more information on Performance and Troubleshooting, see and .
@@ -41,7 +41,7 @@
List Data The following is an exchange from the user dist-list regarding a fairly common question: - how to handle per-user list data in HBase. + how to handle per-user list data in Apache HBase. *** QUESTION *** diff --git a/src/docbkx/community.xml b/src/docbkx/community.xml index c7e7eea32be..2c09908aed9 100644 --- a/src/docbkx/community.xml +++ b/src/docbkx/community.xml @@ -32,7 +32,7 @@
Feature Branches Feature Branches are easy to make. You do not have to be a committer to make one. Just request the name of your branch be added to JIRA up on the - developer's mailing list and a committer will add it for you. Thereafter you can file issues against your feature branch in HBase JIRA. Your code you + developer's mailing list and a committer will add it for you. Thereafter you can file issues against your feature branch in Apache HBase (TM) JIRA. Your code you keep elsewhere -- it should be public so it can be observed -- and you can update dev mailing list on progress. When the feature is ready for commit, 3 +1s from committers will get your feature mergedSee HBase, mail # dev - Thoughts about large feature dev branches @@ -45,14 +45,14 @@ suggested policy rather than a hard requirement. We want to try it first to see if it works before we cast it in stone. -HBase is made of +Apache HBase is made of components. Components have one or more s. See the 'Description' field on the components JIRA page for who the current owners are by component. -Patches that fit within the scope of a single HBase component require, +Patches that fit within the scope of a single Apache HBase component require, at least, a +1 by one of the component's owners before commit. If owners are absent -- busy or otherwise -- two +1s by non-owners will suffice. @@ -74,8 +74,7 @@ until the justification for the -1 is addressed.
Component Owner -Component owners are listed in the description field on this JIRA HBase -components +Component owners are listed in the description field on this Apache HBase JIRA components page. The owners are listed in the 'Description' field rather than in the 'Component Lead' field because the latter only allows us list one individual whereas it is encouraged that components have multiple owners. @@ -83,7 +82,7 @@ whereas it is encouraged that components have multiple owners. Owners are volunteers who are (usually, but not necessarily) expert in their component domain and may have an agenda on how they think their -HBase component should evolve. +Apache HBase component should evolve. Duties include: diff --git a/src/docbkx/configuration.xml b/src/docbkx/configuration.xml index e898e1d5489..bf29d4245f6 100644 --- a/src/docbkx/configuration.xml +++ b/src/docbkx/configuration.xml @@ -26,16 +26,16 @@ * limitations under the License. */ --> - Configuration - This chapter is the Not-So-Quick start guide to HBase configuration. It goes - over system requirements, Hadoop setup, the different HBase run modes, and the + Apache HBase (TM) Configuration + This chapter is the Not-So-Quick start guide to Apache HBase (TM) configuration. It goes + over system requirements, Hadoop setup, the different Apache HBase run modes, and the various configurations in HBase. Please read this chapter carefully. At a mimimum ensure that all have been satisfied. Failure to do so will cause you (and us) grief debugging strange errors and/or data loss. - HBase uses the same configuration system as Hadoop. + Apache HBase uses the same configuration system as Apache Hadoop. To configure a deploy, edit a file of environment variables in conf/hbase-env.sh -- this configuration is used mostly by the launcher shell scripts getting the cluster @@ -142,7 +142,7 @@ to ensure well-formedness of your document after an edit session. - HBase is a database. It uses a lot of files all at the same time. 
+ Apache HBase is a database. It uses a lot of files all at the same time. The default ulimit -n -- i.e. user file limit -- of 1024 on most *nix systems is insufficient (On mac os x its 256). Any significant amount of loading will lead you to . @@ -163,7 +163,7 @@ to ensure well-formedness of your document after an edit session. See Jack Levin's major hdfs issues note up on the user list. The requirement that a database requires upping of system limits - is not peculiar to HBase. See for example the section + is not peculiar to Apache HBase. See for example the section Setting Shell Limits for the Oracle User in Short Guide to install Oracle 10 on Linux.. @@ -208,7 +208,7 @@ to ensure well-formedness of your document after an edit session.
Windows - HBase has been little tested running on Windows. Running a + Apache HBase has been little tested running on Windows. Running a production install of HBase on top of Windows is not recommended. @@ -270,18 +270,18 @@ to ensure well-formedness of your document after an edit session. branch had a working sync but no official release was ever made from this branch. You had to build it yourself. Michael Noll wrote a detailed blog, Building - an Hadoop 0.20.x version for HBase 0.90.2, on how to build an + an Hadoop 0.20.x version for Apache HBase 0.90.2, on how to build an Hadoop from branch-0.20-append. Recommended. Praveen Kumar has written a complimentary article, Building Hadoop and HBase for HBase Maven application development. Cloudera have dfs.support.append set to true by default.. Please use the most up-to-date Hadoop possible. - HBase 0.96.0 requires Hadoop 1.0.0 at a minimum - As of HBase 0.96.x, Hadoop 1.0.x at least is required. We will no + Apache HBase 0.96.0 requires Apache Hadoop 1.0.0 at a minimum + As of Apache HBase 0.96.x, Apache Hadoop 1.0.x at least is required. We will no longer run properly on older Hadoops such as 0.20.205 or branch-0.20-append. - Do not move to 0.96.x if you cannot upgrade your HadoopSee HBase, mail # dev - DISCUSS: Have hbase require at least hadoop 1.0.0 in hbase 0.96.0?. - HBase 0.96.0 runs on Hadoop 2.0. + Do not move to Apache HBase 0.96.x if you cannot upgrade your HadoopSee HBase, mail # dev - DISCUSS: Have hbase require at least hadoop 1.0.0 in hbase 0.96.0?. + Apache HBase 0.96.0 runs on Apache Hadoop 2.0. @@ -323,8 +323,8 @@ to ensure well-formedness of your document after an edit session.
- HBase on Secure Hadoop - HBase will run on any Hadoop 0.20.x that incorporates Hadoop + Apache HBase on Secure Hadoop + Apache HBase will run on any Hadoop 0.20.x that incorporates Hadoop security features -- e.g. Y! 0.20S or CDH3B3 -- as long as you do as suggested above and replace the Hadoop jar that ships with HBase with the secure version. If you want to read more about how to setup diff --git a/src/docbkx/developer.xml b/src/docbkx/developer.xml index 9a2d836d37a..4c39a7a6371 100644 --- a/src/docbkx/developer.xml +++ b/src/docbkx/developer.xml @@ -26,13 +26,13 @@ * limitations under the License. */ --> - Building and Developing HBase - This chapter will be of interest only to those building and developing HBase (i.e., as opposed to + Building and Developing Apache HBase (TM) + This chapter will be of interest only to those building and developing Apache HBase (TM) (i.e., as opposed to just downloading the latest distribution).
- HBase Repositories - There are two different repositories for HBase: Subversion (SVN) and Git. The former is the system of record for committers, but the latter is easier to work with to build and contribute. SVN updates get automatically propagated to the Git repo. + Apache HBase Repositories + There are two different repositories for Apache HBase: Subversion (SVN) and Git. The former is the system of record for committers, but the latter is easier to work with to build and contribute. SVN updates get automatically propagated to the Git repo.
SVN @@ -140,7 +140,7 @@ Access restriction: The method getLong(Object, long) from the type Unsafe is not
- Building HBase + Building Apache HBase
Basic Compile Thanks to maven, building HBase is pretty easy. You can read about the various maven commands in , but the simplest command to compile HBase from its java source code is: @@ -176,7 +176,7 @@ mvn clean package -DskipTests
- Adding an HBase release to Apache's Maven Repository + Adding an Apache HBase release to Apache's Maven Repository Follow the instructions at Publishing Maven Artifacts after reading the below miscellaney. @@ -313,7 +313,7 @@ What is the new development version for "HBase"? (org.apache.hbase:hbase) 0.92.3 Updating hbase.apache.org
Contributing to hbase.apache.org - The HBase apache web site (including this reference guide) is maintained as part of the main HBase source tree, under /src/docbkx and /src/site. The former is this reference guide; the latter, in most cases, are legacy pages that are in the process of being merged into the docbkx tree. + The Apache HBase web site (including this reference guide) is maintained as part of the main Apache HBase source tree, under /src/docbkx and /src/site. The former is this reference guide; the latter, in most cases, are legacy pages that are in the process of being merged into the docbkx tree. To contribute to the reference guide, edit these files and submit them as a patch (see ). Your Jira should contain a summary of the changes in each section (see HBASE-6081 for an example). To generate the site locally while you're working on it, run: mvn site @@ -337,8 +337,8 @@ What is the new development version for "HBase"? (org.apache.hbase:hbase) 0.92.3 HBase have a character not usually seen in other projects.
-HBase Modules -As of 0.96, HBase is split into multiple modules which creates "interesting" rules for +Apache HBase Modules +As of 0.96, Apache HBase is split into multiple modules which creates "interesting" rules for how and where tests are written. If you are writting code for hbase-server, see for how to write your tests; these tests can spin up a minicluster and will need to be categorized. For any other module, for example @@ -366,7 +366,7 @@ given the dependency tree).
Unit Tests -HBase unit tests are subdivided into four categories: small, medium, large, and +Apache HBase unit tests are subdivided into four categories: small, medium, large, and integration with corresponding JUnit categories: SmallTests, MediumTests, LargeTests, IntegrationTests. @@ -419,14 +419,14 @@ the developer machine as well.
-HBase uses a patched maven surefire plugin and maven profiles to implement +Apache HBase uses a patched maven surefire plugin and maven profiles to implement its unit test characterizations.
Running tests -Below we describe how to run the HBase junit categories. +Below we describe how to run the Apache HBase junit categories.
Default: small and medium category tests @@ -614,7 +614,7 @@ mvn compile <section xml:id="maven.build.hadoop"> <title>Building against various hadoop versions. - As of 0.96, HBase supports building against hadoop versions: 1.0.3, 2.0.0-alpha and 3.0.0-SNAPSHOT. + As of 0.96, Apache HBase supports building against Apache Hadoop versions: 1.0.3, 2.0.0-alpha and 3.0.0-SNAPSHOT. By default, we will build with Hadoop-1.0.3. To change the version to run with Hadoop-2.0.0-alpha, you would run: mvn -Dhadoop.profile=2.0 ... @@ -625,7 +625,7 @@ mvn compile Similarly, for 3.0, you would just replace the profile value. Note that Hadoop-3.0.0-SNAPSHOT does not currently have a deployed maven artificat - you will need to build and install your own in your local maven repository if you want to run against this profile. - In earilier verions of HBase, you can build against older versions of hadoop, notably, Hadoop 0.22.x and 0.23.x. + In earlier versions of Apache HBase, you can build against older versions of Apache Hadoop, notably, Hadoop 0.22.x and 0.23.x. If you are running, for example HBase-0.94 and wanted to build against Hadoop 0.23.x, you would run with: mvn -Dhadoop.profile=22 ...
@@ -633,9 +633,9 @@ mvn compile
Getting Involved - HBase gets better only when people contribute! + Apache HBase gets better only when people contribute! - As HBase is an Apache Software Foundation project, see for more information about how the ASF functions. + As Apache HBase is an Apache Software Foundation project, see for more information about how the ASF functions.
Mailing Lists @@ -748,7 +748,7 @@ mvn compile
Running In-Situ - If you are developing HBase, frequently it is useful to test your changes against a more-real cluster than what you find in unit tests. In this case, HBase can be run directly from the source in local-mode. + If you are developing Apache HBase, frequently it is useful to test your changes against a more-real cluster than what you find in unit tests. In this case, HBase can be run directly from the source in local-mode. All you need to do is run: ${HBASE_HOME}/bin/start-hbase.sh @@ -768,7 +768,7 @@ mvn compile If you are new to submitting patches to open source or new to submitting patches to Apache, I'd suggest you start by reading the On Contributing Patches page from Apache Commons Project. Its a nice overview that - applies equally to the HBase Project. + applies equally to the Apache HBase Project.
Create Patch See the aforementioned Apache Commons link for how to make patches against a checked out subversion @@ -781,8 +781,8 @@ mvn compile
Patch File Naming - The patch file should have the HBase Jira ticket in the name. For example, if a patch was submitted for Foo.java, then - a patch file called Foo_HBASE_XXXX.patch would be acceptable where XXXX is the HBase Jira number. + The patch file should have the Apache HBase Jira ticket in the name. For example, if a patch was submitted for Foo.java, then + a patch file called Foo_HBASE_XXXX.patch would be acceptable where XXXX is the Apache HBase Jira number. If you generating from a branch, then including the target branch in the filename is advised, e.g., HBASE-XXXX-0.90.patch. @@ -805,7 +805,7 @@ mvn compile Once attached to the ticket, click "Submit Patch" and the status of the ticket will change. Committers will review submitted patches for inclusion into the codebase. Please understand that not every patch may get committed, and that feedback will likely be provided on the patch. Fear not, though, - because the HBase community is helpful! + because the Apache HBase community is helpful!
@@ -932,7 +932,7 @@ Bar bar = foo.getBar(); <--- imagine there's an extra space(s) after the
Committing Patches - Committers do this. See How To Commit in the HBase wiki. + Committers do this. See How To Commit in the Apache HBase wiki. Commiters will also resolve the Jira, typically after the patch passes a build. diff --git a/src/docbkx/external_apis.xml b/src/docbkx/external_apis.xml index 136c7cc14b1..51961fbff0a 100644 --- a/src/docbkx/external_apis.xml +++ b/src/docbkx/external_apis.xml @@ -26,13 +26,13 @@ * limitations under the License. */ --> - External APIs - This chapter will cover access to HBase either through non-Java languages, or through custom protocols. + Apache HBase (TM) External APIs + This chapter will cover access to Apache HBase (TM) either through non-Java languages, or through custom protocols.
Non-Java Languages Talking to the JVM Currently the documentation on this topic in the - HBase Wiki. + Apache HBase Wiki. See also the Thrift API Javadoc.
@@ -40,18 +40,18 @@
REST Currently most of the documentation on REST exists in the - HBase Wiki on REST. + Apache HBase Wiki on REST.
Thrift Currently most of the documentation on Thrift exists in the - HBase Wiki on Thrift. + Apache HBase Wiki on Thrift.
Filter Language
Use Case - Note: this feature was introduced in HBase 0.92 + Note: this feature was introduced in Apache HBase 0.92 This allows the user to perform server-side filtering when accessing HBase over Thrift. The user specifies a filter via a string. The string is parsed on the server to construct the filter
@@ -414,10 +414,9 @@
- C/C++ HBase Client + C/C++ Apache HBase Client FB's Chip Turner wrote a pure C/C++ client. Check it out.
- diff --git a/src/docbkx/ops_mgt.xml b/src/docbkx/ops_mgt.xml index bad1516b662..d9b10e469fd 100644 --- a/src/docbkx/ops_mgt.xml +++ b/src/docbkx/ops_mgt.xml @@ -26,8 +26,8 @@ * limitations under the License. */ --> - HBase Operational Management - This chapter will cover operational tools and practices required of a running HBase cluster. + Apache HBase (TM) Operational Management + This chapter will cover operational tools and practices required of a running Apache HBase cluster. The subject of operations is related to the topics of , , and but is a distinct topic in itself. @@ -325,8 +325,8 @@ row10 c1 c2 A downside to the above stop of a RegionServer is that regions could be offline for a good period of time. Regions are closed in order. If many regions on the server, the first region to close may not be back online until all regions close and after the master - notices the RegionServer's znode gone. In HBase 0.90.2, we added facility for having - a node gradually shed its load and then shutdown itself down. HBase 0.90.2 added the + notices the RegionServer's znode gone. In Apache HBase 0.90.2, we added a facility for having + a node gradually shed its load and then shut itself down. Apache HBase 0.90.2 added the graceful_stop.sh script. Here is its usage: $ ./bin/graceful_stop.sh Usage: graceful_stop.sh [--config <conf-dir>] [--restart] [--reload] [--thrift] [--rest] <hostname> diff --git a/src/docbkx/performance.xml b/src/docbkx/performance.xml index 831f648b9d7..ba13ee04fbe 100644 --- a/src/docbkx/performance.xml +++ b/src/docbkx/performance.xml @@ -26,7 +26,7 @@ * limitations under the License. */ --> - Performance Tuning + Apache HBase (TM) Performance Tuning
Operating System @@ -105,7 +105,7 @@ Java
- The Garbage Collector and HBase + The Garbage Collector and Apache HBase
Long GC pauses @@ -122,8 +122,8 @@ threshold, the more GCing is done, the more CPU used). To address the second fragmentation issue, Todd added an experimental facility, MSLAB, that - must be explicitly enabled in HBase 0.90.x (Its defaulted to be on in - 0.92.x HBase). See hbase.hregion.memstore.mslab.enabled + must be explicitly enabled in Apache HBase 0.90.x (It is on by default in + Apache HBase 0.92.x). See hbase.hregion.memstore.mslab.enabled to true in your Configuration. See the cited slides for background and detailThe latest jvms do better regards fragmentation so make sure you are running a recent release. @@ -646,7 +646,7 @@ htable.close();
Current Issues With Low-Latency Reads The original use-case for HDFS was batch processing. As such, there low-latency reads were historically not a priority. - With the increased adoption of HBase this is changing, and several improvements are already in development. + With the increased adoption of Apache HBase this is changing, and several improvements are already in development. See the Umbrella Jira Ticket for HDFS Improvements for HBase. diff --git a/src/docbkx/security.xml b/src/docbkx/security.xml index dbbe1482e86..066143a122c 100644 --- a/src/docbkx/security.xml +++ b/src/docbkx/security.xml @@ -26,12 +26,12 @@ * limitations under the License. */ --> -Secure HBase +Secure Apache HBase (TM)
- Secure Client Access to HBase - Newer releases of HBase (>= 0.92) support optional SASL authentication of clientsSee + Secure Client Access to Apache HBase + Newer releases of Apache HBase (TM) (>= 0.92) support optional SASL authentication of clientsSee also Matteo Bertozzi's article on Understanding User Authentication and Authorization in Apache HBase.. - This describes how to set up HBase and HBase clients for connection to secure HBase resources. + This describes how to set up Apache HBase and clients for connection to secure HBase resources.
Prerequisites @@ -233,7 +233,7 @@
Access Control - Newer releases of HBase (>= 0.92) support optional access control + Newer releases of Apache HBase (>= 0.92) support optional access control list (ACL-) based protection of resources on a column family and/or table basis. diff --git a/src/docbkx/shell.xml b/src/docbkx/shell.xml index 4fbab08d223..f341e7e6213 100644 --- a/src/docbkx/shell.xml +++ b/src/docbkx/shell.xml @@ -26,10 +26,10 @@ * limitations under the License. */ --> - The HBase Shell + The Apache HBase Shell - The HBase Shell is (J)Ruby's + The Apache HBase (TM) Shell is (J)Ruby's IRB with some HBase particular commands added. Anything you can do in IRB, you should be able to do in the HBase Shell. To run the HBase shell, @@ -47,7 +47,7 @@ for example basic shell operation.
Scripting - For examples scripting HBase, look in the + For examples scripting Apache HBase, look in the HBase bin directory. Look at the files that end in *.rb. To run one of these files, do as follows: diff --git a/src/docbkx/troubleshooting.xml b/src/docbkx/troubleshooting.xml index 8174618e991..48d7210ef8e 100644 --- a/src/docbkx/troubleshooting.xml +++ b/src/docbkx/troubleshooting.xml @@ -26,7 +26,7 @@ * limitations under the License. */ --> - Troubleshooting and Debugging HBase + Troubleshooting and Debugging Apache HBase (TM)
General Guidelines @@ -37,7 +37,7 @@ should return some hits for those exceptions you’re seeing. - An error rarely comes alone in HBase, usually when something gets screwed up what will + An error rarely comes alone in Apache HBase (TM), usually when something gets screwed up what will follow may be hundreds of exceptions and stack traces coming from all over the place. The best way to approach this type of problem is to walk the log up to where it all began, for example one trick with RegionServers is that they will print some @@ -207,9 +207,9 @@ export HBASE_OPTS="-XX:NewSize=64m -XX:MaxNewSize=64m <cms options from above
Mailing Lists - Ask a question on the HBase mailing lists. - The 'dev' mailing list is aimed at the community of developers actually building HBase and for features currently under development, and 'user' - is generally used for questions on released versions of HBase. Before going to the mailing list, make sure your + Ask a question on the Apache HBase mailing lists. + The 'dev' mailing list is aimed at the community of developers actually building Apache HBase and for features currently under development, and 'user' + is generally used for questions on released versions of Apache HBase. Before going to the mailing list, make sure your question has not already been answered by searching the mailing list archives first. Use . Take some time crafting your questionSee Getting Answers; a quality question that includes all context and @@ -498,13 +498,13 @@ hadoop 17789 155 35.2 9067824 8604364 ? S<l Mar04 9855:48 /usr/java/j
OpenTSDB - OpenTSDB is an excellent alternative to Ganglia as it uses HBase to store all the time series and doesn’t have to downsample. Monitoring your own HBase cluster that hosts OpenTSDB is a good exercise. + OpenTSDB is an excellent alternative to Ganglia as it uses Apache HBase to store all the time series and doesn’t have to downsample. Monitoring your own HBase cluster that hosts OpenTSDB is a good exercise. Here’s an example of a cluster that’s suffering from hundreds of compactions launched almost all around the same time, which severely affects the IO performance: (TODO: insert graph plotting compactionQueueSize) - It’s a good practice to build dashboards with all the important graphs per machine and per cluster so that debugging issues can be done with a single quick look. For example, at StumbleUpon there’s one dashboard per cluster with the most important metrics from both the OS and HBase. You can then go down at the machine level and get even more detailed metrics. + It’s a good practice to build dashboards with all the important graphs per machine and per cluster so that debugging issues can be done with a single quick look. For example, at StumbleUpon there’s one dashboard per cluster with the most important metrics from both the OS and Apache HBase. You can then go down at the machine level and get even more detailed metrics.
@@ -551,14 +551,14 @@ Harsh J investigated the issue as part of the mailing list thread
Long Client Pauses With Compression - This is a fairly frequent question on the HBase dist-list. The scenario is that a client is typically inserting a lot of data into a + This is a fairly frequent question on the Apache HBase dist-list. The scenario is that a client is typically inserting a lot of data into a relatively un-optimized HBase cluster. Compression can exacerbate the pauses, although it is not the source of the problem. See on the pattern for pre-creating regions and confirm that the table isn't starting with a single region. See for cluster configuration, particularly hbase.hstore.blockingStoreFiles, hbase.hregion.memstore.block.multiplier, MAX_FILESIZE (region size), and MEMSTORE_FLUSHSIZE. A slightly longer explanation of why pauses can happen is as follows: Puts are sometimes blocked on the MemStores which are blocked by the flusher thread which is blocked because there are too many files to compact because the compactor is given too many small files to compact and has to compact the same data repeatedly. This situation can occur even with minor compactions. - Compounding this situation, HBase doesn't compress data in memory. Thus, the 64MB that lives in the MemStore could become a 6MB file after compression - which results in a smaller StoreFile. The upside is that + Compounding this situation, Apache HBase doesn't compress data in memory. Thus, the 64MB that lives in the MemStore could become a 6MB file after compression - which results in a smaller StoreFile. The upside is that more data is packed into the same region, but performance is achieved by being able to write larger files - which is why HBase waits until the flushize before writing a new StoreFile. And smaller StoreFiles become targets for compaction. Without compression the files are much bigger and don't need as much compaction, however this is at the expense of I/O. @@ -623,7 +623,7 @@ invocation of the admin API. There can be several causes that produce this symptom. 
-First, check that you have a valid Kerberos ticket. One is required in order to set up communication with a secure HBase cluster. Examine the ticket currently in the credential cache, if any, by running the klist command line utility. If no ticket is listed, you must obtain a ticket by running the kinit command with either a keytab specified, or by interactively entering a password for the desired principal. +First, check that you have a valid Kerberos ticket. One is required in order to set up communication with a secure Apache HBase cluster. Examine the ticket currently in the credential cache, if any, by running the klist command line utility. If no ticket is listed, you must obtain a ticket by running the kinit command with either a keytab specified, or by interactively entering a password for the desired principal. Then, consult the Java Security Guide troubleshooting section. The most common problem addressed there is resolved by setting javax.security.auth.useSubjectCredsOnly system property value to false. @@ -924,7 +924,7 @@ ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: ZooKeeper session expi
Regions listed by domain name, then IP - Fix your DNS. In versions of HBase before 0.92.x, reverse DNS needs to give same answer + Fix your DNS. In versions of Apache HBase before 0.92.x, reverse DNS needs to give same answer as forward lookup. See HBASE 3431 RegionServer is not using the name given it by the master; double entry in master listing of servers for gorey details. @@ -1040,8 +1040,8 @@ ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: ZooKeeper session expi HBase and Hadoop version issues
<code>NoClassDefFoundError</code> when trying to run 0.90.x on hadoop-0.20.205.x (or hadoop-1.0.x) - HBase 0.90.x does not ship with hadoop-0.20.205.x, etc. To make it run, you need to replace the hadoop - jars that HBase shipped with in its lib directory with those of the Hadoop you want to + Apache HBase 0.90.x does not ship with hadoop-0.20.205.x, etc. To make it run, you need to replace the hadoop + jars that Apache HBase shipped with in its lib directory with those of the Hadoop you want to run HBase on. If even after replacing Hadoop jars you get the below exception: sv4r6s38: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/configuration/Configuration diff --git a/src/docbkx/zookeeper.xml b/src/docbkx/zookeeper.xml index d082efa0333..9305837288a 100644 --- a/src/docbkx/zookeeper.xml +++ b/src/docbkx/zookeeper.xml @@ -31,9 +31,9 @@ ZooKeeper - A distributed HBase depends on a running ZooKeeper cluster. + A distributed Apache HBase (TM) installation depends on a running ZooKeeper cluster. All participating nodes and clients need to be able to access the - running ZooKeeper ensemble. HBase by default manages a ZooKeeper + running ZooKeeper ensemble. Apache HBase by default manages a ZooKeeper "cluster" for you. It will start and stop the ZooKeeper ensemble as part of the HBase start/stop process. You can also manage the ZooKeeper ensemble independent of HBase and just point HBase at @@ -191,7 +191,7 @@ ${HBASE_HOME}/bin/hbase-daemons.sh {start,stop} zookeeper
SASL Authentication with ZooKeeper
- Newer releases of HBase (>= 0.92) will
+ Newer releases of Apache HBase (>= 0.92) will
support connecting to a ZooKeeper Quorum that supports SASL authentication (which is available in Zookeeper versions 3.4.0 or later).
diff --git a/src/site/site.vm b/src/site/site.vm
index 0a478e4b4f9..0e2519528b6 100644
--- a/src/site/site.vm
+++ b/src/site/site.vm
@@ -535,7 +535,10 @@
-All of the above guarantees must be possible within HBase. For users who would like to trade
+All of the above guarantees must be possible within Apache HBase. For users who would like to trade
off some guarantees for performance, HBase may offer several tuning options. For example:

  • Visibility may be tuned on a per-read basis to allow stale reads or time travel.
@@ -206,7 +206,7 @@

- For more information, see the client architecture or data model sections in the HBase Reference Guide.
+ For more information, see the client architecture or data model sections in the Apache HBase Reference Guide.

@@ -217,7 +217,7 @@ (See Scan#setBatch(int)).

-[2] In the context of HBase, "durably on disk" implies an hflush() call on the transaction
+[2] In the context of Apache HBase, "durably on disk" implies an hflush() call on the transaction
log. This does not actually imply an fsync() to magnetic media, but rather just that the data has been written to the OS cache on all replicas of the log. In the case of a full datacenter power loss, it is possible that the edits are not truly durable.

diff --git a/src/site/xdoc/bulk-loads.xml b/src/site/xdoc/bulk-loads.xml
index f6797fb5527..f4d3d75f951 100644
--- a/src/site/xdoc/bulk-loads.xml
+++ b/src/site/xdoc/bulk-loads.xml
@@ -18,7 +18,7 @@ xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
- Bulk Loads in HBase
+ Bulk Loads in Apache HBase (TM)
diff --git a/src/site/xdoc/cygwin.xml b/src/site/xdoc/cygwin.xml
index 1be07178af7..40e0f72cf85 100644
--- a/src/site/xdoc/cygwin.xml
+++ b/src/site/xdoc/cygwin.xml
@@ -17,21 +17,21 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
- Installing HBase on Windows using Cygwin
+ Installing Apache HBase (TM) on Windows using Cygwin
-HBase is a distributed, column-oriented store, modeled after Google's BigTable. HBase is built on top of Hadoop for its MapReduce and distributed file system implementation. All these projects are open-source and part of the Apache Software Foundation.
+Apache HBase (TM) is a distributed, column-oriented store, modeled after Google's BigTable. Apache HBase is built on top of Hadoop for its MapReduce and distributed file system implementation. All these projects are open-source and part of the Apache Software Foundation.

As being distributed, large scale platforms, the Hadoop and HBase projects mainly focus on *nix environments for production installations. However, being developed in Java, both projects are fully portable across platforms and, hence, also to the Windows operating system. For ease of development the projects rely on Cygwin to have a *nix-like environment on Windows to run the shell scripts.

-This document explains the intricacies of running HBase on Windows using Cygwin as an all-in-one single-node installation for testing and development. The HBase Overview and QuickStart guides on the other hand go a long way in explaining how to set up HBase in more complex deployment scenarios.
+This document explains the intricacies of running Apache HBase on Windows using Cygwin as an all-in-one single-node installation for testing and development. The HBase Overview and QuickStart guides on the other hand go a long way in explaining how to set up HBase in more complex deployment scenarios.

-For running HBase on Windows, 3 technologies are required: Java, Cygwin and SSH. The following paragraphs detail the installation of each of the aforementioned technologies.
+For running Apache HBase on Windows, 3 technologies are required: Java, Cygwin and SSH. The following paragraphs detail the installation of each of the aforementioned technologies.

HBase depends on the Java Platform, Standard Edition, 6 Release. So the target system has to be provided with at least the Java Runtime Environment (JRE); however if the system will also be used for development, the Java Development Kit (JDK) is preferred. You can download the latest versions for both from Sun's download page. Installation is a simple GUI wizard that guides you through the process.

@@ -91,7 +91,7 @@
-Download the latest release of HBase from the website. As the HBase distributable is just a zipped archive, installation is as simple as unpacking the archive so it ends up in its final installation directory. Notice that HBase has to be installed in Cygwin and a good directory suggestion is to use /usr/local/ (or [Root directory]\usr\local in Windows slang). You should end up with a /usr/local/hbase-<version> installation in Cygwin.
+Download the latest release of Apache HBase from the website. As the Apache HBase distributable is just a zipped archive, installation is as simple as unpacking the archive so it ends up in its final installation directory. Notice that HBase has to be installed in Cygwin and a good directory suggestion is to use /usr/local/ (or [Root directory]\usr\local in Windows slang). You should end up with a /usr/local/hbase-<version> installation in Cygwin.

This finishes installation. We go on with the configuration.
@@ -192,7 +192,7 @@ If all previous configurations are working properly, we just need some tinkering
Testing

-This should conclude the installation and configuration of HBase on Windows using Cygwin. So it's time to test it.
+This should conclude the installation and configuration of Apache HBase on Windows using Cygwin. So it's time to test it.

  1. Start a Cygwin terminal, if you haven't already.
  2. Change directory to HBase installation using CD /usr/local/hbase-<version>, preferably using auto-completion.
diff --git a/src/site/xdoc/index.xml b/src/site/xdoc/index.xml
index f6f15666eea..1e4c7945e5b 100644
--- a/src/site/xdoc/index.xml
+++ b/src/site/xdoc/index.xml
@@ -17,20 +17,20 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
- HBase Home
+ Apache HBase™ Home
-
+
-
-HBase is the Hadoop database. Think of it as a distributed, scalable, big data store.
+
+Apache HBase™ is the Hadoop database, a distributed, scalable, big data store.

-When Would I Use HBase?
+When Would I Use Apache HBase?

- Use HBase when you need random, realtime read/write access to your Big Data.
+ Use Apache HBase when you need random, realtime read/write access to your Big Data.
This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware.
-HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al.
- Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.
+Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al.
+ Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.

    Features

    @@ -43,7 +43,7 @@ HBase is an open-source, distributed, versioned, column-oriented store modeled a

  4. Automatic failover support between RegionServers.
  5. -
  6. Convenient base classes for backing Hadoop MapReduce jobs with HBase tables. +
  7. Convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables.
  8. Easy to use Java API for client access.
  9. @@ -68,12 +68,12 @@ HBase is an open-source, distributed, versioned, column-oriented store modeled a

    October 29th, 2012 HBase User Group Meetup at Wize Commerce in San Mateo.

    October 25th, 2012 Strata/Hadoop World HBase Meetup. in NYC

    September 11th, 2012 Contributor's Pow-Wow at HortonWorks HQ.

-August 8th, 2012 HBase 0.94.1 is available for download
+August 8th, 2012 Apache HBase 0.94.1 is available for download

    June 15th, 2012 Birds-of-a-feather in San Jose, day after Hadoop Summit

    May 23rd, 2012 HackConAthon in Palo Alto

    Old News

-
+
diff --git a/src/site/xdoc/metrics.xml b/src/site/xdoc/metrics.xml
index 92abd0b3447..47902b8df49 100644
--- a/src/site/xdoc/metrics.xml
+++ b/src/site/xdoc/metrics.xml
@@ -18,14 +18,14 @@ xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
- HBase Metrics
+ Apache HBase (TM) Metrics

- HBase emits Hadoop metrics.
+ Apache HBase (TM) emits Hadoop metrics.

    @@ -138,7 +138,7 @@ export HBASE_REGIONSERVER_OPTS="$HBASE_JMX_OPTS -Dcom.sun.management.jmxremote.p

- For more information on understanding HBase metrics, see the metrics section in the HBase Reference Guide.
+ For more information on understanding HBase metrics, see the metrics section in the Apache HBase Reference Guide.

diff --git a/src/site/xdoc/old_news.xml b/src/site/xdoc/old_news.xml
index 83c81e3aed6..bb0d35660d0 100644
--- a/src/site/xdoc/old_news.xml
+++ b/src/site/xdoc/old_news.xml
@@ -22,7 +22,7 @@ xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
- Old News
+ Old Apache HBase (TM) News
@@ -31,29 +31,29 @@

    March 27th, 2012 Meetup @ StumbleUpon in San Francisco

    January 19th, 2012 Meetup @ EBay

-January 23rd, 2012 HBase 0.92.0 released. Download it!
-December 23rd, 2011 HBase 0.90.5 released. Download it!
+January 23rd, 2012 Apache HBase 0.92.0 released. Download it!
+December 23rd, 2011 Apache HBase 0.90.5 released. Download it!

    November 29th, 2011 Developer Pow-Wow in SF at Salesforce HQ

    November 7th, 2011 HBase Meetup in NYC (6PM) at the AppNexus office

    August 22nd, 2011 HBase Hackathon (11AM) and Meetup (6PM) at FB in PA

    June 30th, 2011 HBase Contributor Day, the day after the Hadoop Summit hosted by Y!

    June 8th, 2011 HBase Hackathon in Berlin to coincide with Berlin Buzzwords

-May 19th, 2011 HBase 0.90.3 released. Download it!
-April 12th, 2011 HBase 0.90.2 released. Download it!
+May 19th, 2011 Apache HBase 0.90.3 released. Download it!
+April 12th, 2011 Apache HBase 0.90.2 released. Download it!

    March 21st, HBase 0.92 Hackathon at StumbleUpon, SF

    February 22nd, HUG12: February HBase User Group at StumbleUpon SF

    December 13th, HBase Hackathon: Coprocessor Edition

-November 19th, Hadoop HUG in London is all about HBase
+November 19th, Hadoop HUG in London is all about Apache HBase

    November 15-19th, Devoxx features HBase Training and multiple HBase presentations

    October 12th, HBase-related presentations by core contributors and users at Hadoop World 2010

    October 11th, HUG-NYC: HBase User Group NYC Edition (Night before Hadoop World)

-June 30th, HBase Contributor Workshop (Day after Hadoop Summit)
-May 10th, 2010: HBase graduates from Hadoop sub-project to Apache Top Level Project
+June 30th, Apache HBase Contributor Workshop (Day after Hadoop Summit)
+May 10th, 2010: Apache HBase graduates from Hadoop sub-project to Apache Top Level Project

    Signup for HBase User Group Meeting, HUG10 hosted by Trend Micro, April 19th, 2010

    HBase User Group Meeting, HUG9 hosted by Mozilla, March 10th, 2010

    Sign up for the HBase User Group Meeting, HUG8, January 27th, 2010 at StumbleUpon in SF

-September 8th, 2010: HBase 0.20.0 is faster, stronger, slimmer, and sweeter tasting than any previous HBase release. Get it off the Releases page.
+September 8th, 2010: Apache HBase 0.20.0 is faster, stronger, slimmer, and sweeter tasting than any previous Apache HBase release. Get it off the Releases page.

    ApacheCon in Oakland: November 2-6th, 2009: The Apache Foundation will be celebrating its 10th anniversary in beautiful Oakland by the Bay. Lots of good talks and meetups including an HBase presentation by a couple of the lads.

    HBase at Hadoop World in NYC: October 2nd, 2009: A few of us will be talking on Practical HBase out east at Hadoop World: NYC.

diff --git a/src/site/xdoc/pseudo-distributed.xml b/src/site/xdoc/pseudo-distributed.xml
index 596a4c5ae5b..a1b38babb99 100644
--- a/src/site/xdoc/pseudo-distributed.xml
+++ b/src/site/xdoc/pseudo-distributed.xml
@@ -22,7 +22,7 @@ xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
-Running HBase in pseudo-distributed mode
+Running Apache HBase (TM) in pseudo-distributed mode
diff --git a/src/site/xdoc/replication.xml b/src/site/xdoc/replication.xml
index c465f2fb8e1..727ca267636 100644
--- a/src/site/xdoc/replication.xml
+++ b/src/site/xdoc/replication.xml
@@ -22,13 +22,13 @@ xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
- HBase Replication
+ Apache HBase (TM) Replication

- HBase replication is a way to copy data between HBase deployments. It
+ The replication feature of Apache HBase (TM) provides a way to copy data between HBase deployments. It
can serve as a disaster recovery solution and can contribute to provide higher availability at the HBase layer. It can also serve more practically; for example, as a way to easily copy edits from a web-facing cluster to a "MapReduce"
@@ -36,7 +36,7 @@ automatically.

- The basic architecture pattern used for HBase replication is (HBase cluster) master-push;
+ The basic architecture pattern used for Apache HBase replication is (HBase cluster) master-push;
it is much easier to keep track of what’s currently being replicated since each region server has its own write-ahead-log (aka WAL or HLog), just like other well known solutions like MySQL master/slave replication where
@@ -74,15 +74,15 @@ of replication on the slave clusters by relying on randomization.

- As of version 0.92 HBase supports master/master and cyclic replication as
- well as replication to multiple slaves.
+ As of version 0.92, Apache HBase supports master/master and cyclic
+ replication as well as replication to multiple slaves.

The guide on enabling and using cluster replication is contained
- in the API documentation shipped with your HBase distribution.
+ in the API documentation shipped with your Apache HBase distribution.

The most up-to-date documentation is
@@ -98,7 +98,7 @@

- The client uses a HBase API that sends a Put, Delete or ICV to a region
+ The client uses an API that sends a Put, Delete or ICV to a region
server. The key values are transformed into a WALEdit by the region server and is inspected by the replication code that, for each family that is scoped for replication, adds the scope to the edit. The edit
diff --git a/src/site/xdoc/resources.xml b/src/site/xdoc/resources.xml
index 49a06afbcbf..214fc79aa37 100644
--- a/src/site/xdoc/resources.xml
+++ b/src/site/xdoc/resources.xml
@@ -16,11 +16,11 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
- Other HBase Resources
+ Other Apache HBase (TM) Resources
-

    +

    HBase: The Definitive Guide Random Access to Your Planet-Size Data by Lars George. Publisher: O'Reilly Media, Released: August 2011, Pages: 556.

diff --git a/src/site/xdoc/sponsors.xml b/src/site/xdoc/sponsors.xml
index e39730bbbd2..197ee1d3abd 100644
--- a/src/site/xdoc/sponsors.xml
+++ b/src/site/xdoc/sponsors.xml
@@ -16,12 +16,15 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
- Installing HBase on Windows using Cygwin
+ Apache HBase™ Sponsors
-The below companies have been gracious enough to provide their commercial tool offerings free of charge to the Apache HBase project.
+First off, thanks to all who sponsor
+ our parent, the Apache Software Foundation.
+
+The below companies have been gracious enough to provide their commercial tool offerings free of charge to the Apache HBase™ project.

    +
    +

    To contribute to the Apache Software Foundation, a good idea in our opinion, see the ASF Sponsorship page. +

    +