HBASE-738 overview.html in need of updating
git-svn-id: https://svn.apache.org/repos/asf/hadoop/hbase/trunk@676090 13f79535-47bb-0310-9956-ffa450edef68
parent 36f0d36de9
commit e6e11eae01
CHANGES.txt
@@ -283,6 +283,7 @@ Trunk (unreleased changes)
            (Jean-Daniel Cryans via Stack)
    HBASE-730 On startup, rinse STARTCODE and SERVER from .META.
            (Jean-Daniel Cryans via Stack)
+   HBASE-738 overview.html in need of updating (Izaak Rubin via Stack)
 
  NEW FEATURES
    HBASE-47 Option to set TTL for columns in hbase
overview.html
@@ -28,7 +28,8 @@
 <ul>
   <li>Java 1.5.x, preferably from <a href="http://www.java.com/en/download/">Sun</a>.
   </li>
-  <li>Hadoop 0.17.x. This version of HBase will only run on this version of Hadoop.</a>.
+  <li><a href="http://hadoop.apache.org/core/releases.html">Hadoop 0.17.x</a>. This version of HBase will
+  only run on this version of Hadoop.
   </li>
   <li>
     ssh must be installed and sshd must be running to use Hadoop's
@@ -48,63 +49,75 @@ for the first time. If upgrading your
 HBase instance, see <a href="#upgrading">Upgrading</a>.
 </p>
 <p>
-<ul>
-<li><code>${HBASE_HOME}</code>: Set HBASE_HOME to the location of the HBase root: e.g. <code>/user/local/hbase</code>.
-</li>
-</ul>
-</p>
-<p>Edit <code>${HBASE_HOME}/conf/hbase-env.sh</code>. In this file you can
+Define <code>${HBASE_HOME}</code> to be the location of the root of your HBase installation, e.g.
+<code>/user/local/hbase</code>. Edit <code>${HBASE_HOME}/conf/hbase-env.sh</code>. In this file you can
 set the heapsize for HBase, etc. At a minimum, set <code>JAVA_HOME</code> to point at the root of
 your Java installation.
+</p>
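The minimum hbase-env.sh edit described in this hunk comes down to a single line. A minimal sketch, assuming a Sun JDK installed under /usr/lib/jvm/java-1.5.0 (the actual path is machine-specific):

    # in ${HBASE_HOME}/conf/hbase-env.sh
    # The java implementation to use. Required.
    export JAVA_HOME=/usr/lib/jvm/java-1.5.0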
 <p>
 If you are running a standalone operation, there should be nothing further to configure; proceed to
 <a href=#runandconfirm>Running and Confirming Your Installation</a>. If you are running a distributed
 operation, continue reading.
 </p>
 
-<h2><a name="distributed" >Distributed Operation</a></h2>
+<h2><a name="distributed">Distributed Operation</a></h2>
 <p>Distributed mode requires an instance of the Hadoop Distributed File System (DFS).
 See the Hadoop <a href="http://lucene.apache.org/hadoop/api/overview-summary.html#overview_description">
 requirements and instructions</a> for how to set up a DFS.</p>
-<p>Once you have confirmed your DFS setup, configuring HBase requires modification of the following two files:
-<code>${HBASE_HOME}/conf/hbase-site.xml</code> and <code>${HBASE_HOME}/conf/regionservers</code>.
-The former needs to be pointed at the running Hadoop DFS instance. The latter file lists
-all the members of the HBase cluster.
-</p>
-<p>Use <code>hbase-site.xml</code> to override the properties defined in
+<h3><a name="pseudo-distrib">Pseudo-Distributed Operation</a></h3>
+<p>A pseudo-distributed operation is simply a distributed operation run on a single host.
+Once you have confirmed your DFS setup, configuring HBase for use on one host requires modification of
+<code>${HBASE_HOME}/conf/hbase-site.xml</code>, which needs to be pointed at the running Hadoop DFS instance.
+Use <code>hbase-site.xml</code> to override the properties defined in
 <code>${HBASE_HOME}/conf/hbase-default.xml</code> (<code>hbase-default.xml</code> itself
-should never be modified). At a minimum the <code>hbase.master</code> and the
-<code>hbase.rootdir</code> properties should be redefined
-in <code>hbase-site.xml</code> to configure the <code>host:port</code> pair on which the
-HMaster runs (<a href="http://wiki.apache.org/lucene-hadoop/Hbase/HbaseArchitecture">read about the
-HBase master, regionservers, etc</a>) and to point HBase at the Hadoop filesystem to use. For
-example, adding the below to your hbase-site.xml says the master is up on port 60000 on the host
-example.org and that HBase should use the <code>/hbase</code> directory in the HDFS whose namenode
-is at port 9000, again on example.org:
+should never be modified). At a minimum the <code>hbase.rootdir</code> property should be redefined
+in <code>hbase-site.xml</code> to point HBase at the Hadoop filesystem to use. For example, adding the property
+below to your <code>hbase-site.xml</code> says that HBase should use the <code>/hbase</code> directory in the
+HDFS whose namenode is at port 9000 on your local machine:
 </p>
 <pre>
 <configuration>
+  ...
+  <property>
+    <name>hbase.rootdir</name>
+    <value>hdfs://localhost:9000/hbase</value>
+    <description>The directory shared by region servers.
+    </description>
+  </property>
+  ...
+</configuration>
+</pre>
+
+<h3><a name="fully-distrib">Fully-Distributed Operation</a></h3>
+For running a fully-distributed operation on more than one host, the following configurations
+must be made <i>in addition</i> to those described in the
+<a href="#pseudo-distrib">pseudo-distributed operation</a> section above. In
+<code>hbase-site.xml</code>, you must also configure <code>hbase.master</code> to the
+<code>host:port</code> pair on which the HMaster runs
+(<a href="http://wiki.apache.org/lucene-hadoop/Hbase/HbaseArchitecture">read about the HBase master,
+regionservers, etc</a>). For example, adding the below to your <code>hbase-site.xml</code> says the
+master is up on port 60000 on the host example.org:
+</p>
+<pre>
+<configuration>
+  ...
   <property>
     <name>hbase.master</name>
     <value>example.org:60000</value>
     <description>The host and port that the HBase master runs at.
     </description>
   </property>
-  <property>
-    <name>hbase.rootdir</name>
-    <value>hdfs://example.org:9000/hbase</value>
-    <description>The directory shared by region servers.
-    </description>
-  </property>
+  ...
 
 </configuration>
 </pre>
 <p>
-The <code>regionserver</code> file lists all the hosts running HRegionServers, one
-host per line (This file in HBase is like the hadoop slaves file at
-<code>${HADOOP_HOME}/conf/slaves</code>).
+Keep in mind that for a fully-distributed operation, you may not want your <code>hbase.rootdir</code>
+to point to localhost (maybe, as in the configuration above, you will want to use
+<code>example.org</code>). In addition to <code>hbase-site.xml</code>, a fully-distributed
+operation requires that you also modify <code>${HBASE_HOME}/conf/regionservers</code>.
+<code>regionserver</code> lists all the hosts running HRegionServers, one host per line (This file
+in HBase is like the hadoop slaves file at <code>${HADOOP_HOME}/conf/slaves</code>).
 </p>
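For reference, the regionservers file the new text describes is just a newline-separated host list. A sketch with hypothetical hostnames:

    # ${HBASE_HOME}/conf/regionservers -- one HRegionServer host per line
    rs1.example.org
    rs2.example.org
    rs3.example.org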
 <p>Of note, if you have made <i>HDFS client configuration</i> on your hadoop cluster, hbase will not
 see this configuration unless you do one of the following:
@@ -114,8 +127,8 @@ see this configuration unless you do one of the following:
 <li>If only a small set of HDFS client configurations, add them to <code>hbase-site.xml</code></li>
 </ul>
 An example of such an HDFS client configuration is <code>dfs.replication</code>. If for example,
-you want to run with a replication factor of 5, hbase will make files will create files with
-the default of 3 unless you do the above to make the configuration available to hbase.
+you want to run with a replication factor of 5, hbase will create files with the default of 3 unless
+you do the above to make the configuration available to hbase.
 </p>
 
 <h2><a name="runandconfirm">Running and Confirming Your Installation</a></h2>
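As a sketch of the hbase-site.xml route for the dfs.replication example above (the description text here is illustrative, not from the commit):

    <property>
      <name>dfs.replication</name>
      <value>5</value>
      <description>Replication factor the HDFS client requests for
      files that HBase writes.</description>
    </property>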
@@ -125,7 +138,7 @@ the local filesystem.</p>
 <p>If you are running a distributed cluster you will need to start the Hadoop DFS daemons
 before starting HBase and stop the daemons after HBase has shut down. Start and
 stop the Hadoop DFS daemons by running <code>${HADOOP_HOME}/bin/start-dfs.sh</code>.
-Ensure it started properly by testing the put and get of files into the Hadoop filesystem.
+You can ensure it started properly by testing the put and get of files into the Hadoop filesystem.
 HBase does not normally use the mapreduce daemons. These do not need to be started.</p>
 
 <p>Start HBase with the following command:
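A sketch of the put-and-get smoke test the paragraph above suggests, using the FsShell commands that shipped with Hadoop 0.17 (the file names are arbitrary):

    ${HADOOP_HOME}/bin/hadoop dfs -put /etc/hosts /smoke-test
    ${HADOOP_HOME}/bin/hadoop dfs -cat /smoke-test
    ${HADOOP_HOME}/bin/hadoop dfs -rm /smoke-test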
@@ -169,14 +182,12 @@ the HBase version. It does not change your install unless you explicitly ask it
 
 <div style="background-color: #cccccc; padding: 2px">
 <code><pre>
+import java.io.IOException;
 import org.apache.hadoop.hbase.client.HTable;
-import org.apache.hadoop.hbase.HBaseConfiguration;
-import org.apache.hadoop.hbase.HStoreKey;
-import org.apache.hadoop.hbase.HScannerInterface;
+import org.apache.hadoop.hbase.client.Scanner;
 import org.apache.hadoop.hbase.io.BatchUpdate;
 import org.apache.hadoop.hbase.io.Cell;
-import org.apache.hadoop.io.Text;
-import java.io.IOException;
+import org.apache.hadoop.hbase.io.RowResult;
 
 public class MyClient {
 
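One caveat on the hunk above: the rewritten import list drops org.apache.hadoop.hbase.HBaseConfiguration, yet the body of MyClient below still hands a config object to HTable, which presumably is still an HBaseConfiguration instance. If so, the import block would need to read (a sketch, not part of the commit):

    import java.io.IOException;

    import org.apache.hadoop.hbase.HBaseConfiguration; // still needed for the config passed to HTable
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Scanner;
    import org.apache.hadoop.hbase.io.BatchUpdate;
    import org.apache.hadoop.hbase.io.Cell;
    import org.apache.hadoop.hbase.io.RowResult;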
@@ -187,12 +198,12 @@ public class MyClient {
 
     // This instantiates an HTable object that connects you to the "myTable"
     // table.
-    HTable table = new HTable(config, new Text("myTable"));
+    HTable table = new HTable(config, "myTable");
 
     // To do any sort of update on a row, you use an instance of the BatchUpdate
     // class. A BatchUpdate takes a row and optionally a timestamp which your
     // updates will affect.
-    BatchUpdate batchUpdate = new BatchUpdate(new Text("myRow"));
+    BatchUpdate batchUpdate = new BatchUpdate("myRow");
 
     // The BatchUpdate#put method takes a Text that describes what cell you want
     // to put a value into, and a byte array that is the value you want to
@@ -200,11 +211,11 @@ public class MyClient {
     // from the string for HBase to understand how to store it. (The same goes
     // for primitives like ints and longs and user-defined classes - you must
     // find a way to reduce it to bytes.)
-    batchUpdate.put(new Text("myColumnFamily:columnQualifier1"),
+    batchUpdate.put("myColumnFamily:columnQualifier1",
         "columnQualifier1 value!".getBytes());
 
     // Deletes are batch operations in HBase as well.
-    batchUpdate.delete(new Text("myColumnFamily:cellIWantDeleted"));
+    batchUpdate.delete("myColumnFamily:cellIWantDeleted");
 
     // Once you've done all the puts you want, you need to commit the results.
     // The HTable#commit method takes the BatchUpdate instance you've been
@@ -216,44 +227,38 @@ public class MyClient {
     // the timestamp the value was stored with. If you happen to know that the
     // value contained is a string and want an actual string, then you must
     // convert it yourself.
-    Cell cell = table.get(new Text("myRow"),
-        new Text("myColumnFamily:columnQualifier1"));
-    String valueStr = new String(valueBytes.getValue());
+    Cell cell = table.get("myRow", "myColumnFamily:columnQualifier1");
+    String valueStr = new String(cell.getValue());
 
     // Sometimes, you won't know the row you're looking for. In this case, you
     // use a Scanner. This will give you cursor-like interface to the contents
     // of the table.
-    HStoreKey row = new HStoreKey();
-    SortedMap<Text, byte[]> columns = new TreeMap<Text, byte[]>();
-    HScannerInterface scanner =
+    Scanner scanner =
       // we want to get back only "myColumnFamily:columnQualifier1" when we iterate
-      table.obtainScanner(new Text[]{new Text("myColumnFamily:columnQualifier1")},
-      // we want to start scanning from an empty Text, meaning the beginning of
-      // the table
-      new Text(""));
+      table.getScanner(new String[]{"myColumnFamily:columnQualifier1"});
 
 
     // Scanners in HBase 0.2 return RowResult instances. A RowResult is like the
     // row key and the columns all wrapped up in a single interface.
     // RowResult#getRow gives you the row key. RowResult also implements
-    // Map<Text, Cell>, so you can get to your column results easily.
+    // Map, so you can get to your column results easily.
 
     // Now, for the actual iteration. One way is to use a while loop like so:
     RowResult rowResult = scanner.next();
 
     while(rowResult != null) {
       // print out the row we found and the columns we were looking for
-      System.out.println("Found row: " + rowResult.getRow() + " with value: " +
-        new String(rowResult.get("myColumnFamily:columnQualifier1")));
+      System.out.println("Found row: " + new String(rowResult.getRow()) + " with value: " +
+        rowResult.get("myColumnFamily:columnQualifier1".getBytes()));
 
       rowResult = scanner.next();
     }
 
     // The other approach is to use a foreach loop. Scanners are iterable!
-    for (RowResult rowResult : scanner) {
+    for (RowResult result : scanner) {
       // print out the row we found and the columns we were looking for
-      System.out.println("Found row: " + rowResult.getRow() + " with value: " +
-        new String(rowResult.get("myColumnFamily:columnQualifier1")));
+      System.out.println("Found row: " + new String(result.getRow()) + " with value: " +
+        result.get("myColumnFamily:columnQualifier1".getBytes()));
     }
 
     // Make sure you close your scanners when you are done!
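For completeness, a sketch of the scanner cleanup that the final comment above refers to, assuming the 0.2 client API and the table variable used throughout this example:

    Scanner scanner = table.getScanner(new String[]{"myColumnFamily:columnQualifier1"});
    try {
      for (RowResult result : scanner) {
        // ... work with each result ...
      }
    } finally {
      scanner.close(); // release the server-side scanner resources
    }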