HBASE-11640 Add syntax highlighting support to HBase Ref Guide programlistings (Misty Stanley-Jones)

This commit is contained in:
stack 2014-08-01 09:39:56 -07:00
parent 78d47cc0e4
commit 24b5fa7f0c
19 changed files with 391 additions and 367 deletions

View File

@ -107,7 +107,7 @@
<para>For each issue you work on, create a new branch. One convention that works
well for naming the branches is to name a given branch the same as the JIRA it
relates to:</para>
<screen>$ git checkout -b HBASE-123456</screen>
<screen language="bourne">$ git checkout -b HBASE-123456</screen>
</step>
<step>
<para>Make your suggested changes on your branch, committing your changes to your
@ -123,8 +123,8 @@
sure you have built HBase at least once, in order to fetch all the Maven
dependencies you need.</para>
</note>
<screen>$ mvn clean install -DskipTests # Builds HBase</screen>
<screen>$ mvn clean site -DskipTests # Builds the website and documentation</screen>
<screen language="bourne">$ mvn clean install -DskipTests # Builds HBase</screen>
<screen language="bourne">$ mvn clean site -DskipTests # Builds the website and documentation</screen>
<para>If any errors occur, address them.</para>
</step>
<step>
@ -132,7 +132,7 @@
the area of the code you are working in has had a lot of changes lately, make
sure you rebase your branch against the remote master and take care of any
conflicts before submitting your patch.</para>
<screen>
<screen language="bourne">
$ git checkout HBASE-123456
$ git rebase origin/master
</screen>
@ -141,7 +141,7 @@ $ git rebase origin/master
<para>Generate your patch against the remote master. Run the following command from
the top level of your git repository (usually called
<literal>hbase</literal>):</para>
<screen>$ git diff --no-prefix origin/master > HBASE-123456.patch</screen>
<screen language="bourne">$ git diff --no-prefix origin/master > HBASE-123456.patch</screen>
<para>The name of the patch should contain the JIRA ID. Look over the patch file to
be sure that you did not change any additional files by accident and that there
are no other surprises. When you are satisfied, attach the patch to the JIRA and
@ -227,7 +227,7 @@ $ git rebase origin/master
recommended that you use a &lt;figure&gt; Docbook element for an image. This allows
screen readers to navigate to the image and also provides alternative text for the
image. The following is an example of a &lt;figure&gt; element.</para>
<programlisting><![CDATA[<figure>
<programlisting language="xml"><![CDATA[<figure>
<title>HFile Version 1</title>
<mediaobject>
<imageobject>
@ -295,7 +295,7 @@ $ git rebase origin/master
render as block-level elements (they take the whole width of the page), it
is better to mark them up as siblings to the paragraphs around them, like
this:</para>
<programlisting><![CDATA[<para>This is the paragraph.</para>
<programlisting language="xml"><![CDATA[<para>This is the paragraph.</para>
<note>
<para>This is an admonition which occurs after the paragraph.</para>
</note>]]></programlisting>
@ -312,7 +312,7 @@ $ git rebase origin/master
consist of things other than plain text, they need to be wrapped in some
element. If they are plain text, they need to be enclosed in &lt;para&gt;
tags. This is tedious but necessary for validity.</para>
<programlisting><![CDATA[<itemizedlist>
<programlisting language="xml"><![CDATA[<itemizedlist>
<listitem>
<para>This is a paragraph.</para>
</listitem>
@ -367,7 +367,7 @@ $ git rebase origin/master
the content. Also, to avoid having an extra blank line at the
beginning of the programlisting output, do not put the CDATA
element on its own line. For example:</para>
<programlisting><![CDATA[ <programlisting>
<programlisting language="bourne"><![CDATA[ <programlisting>
case $1 in
--cleanZk|--cleanHdfs|--cleanAll)
matches="yes" ;;
@ -396,6 +396,29 @@ esac
especially if you use GUI mode in the editor.</para>
</answer>
</qandaentry>
<qandaentry>
<question>
<para>Syntax Highlighting</para>
</question>
<answer>
<para>The HBase Reference Guide uses the <link
xlink:href="http://sourceforge.net/projects/xslthl/files/xslthl/2.1.0/"
>XSLT Syntax Highlighting</link> Maven module for syntax highlighting.
To enable syntax highlighting for a given &lt;programlisting&gt; or
&lt;screen&gt; (or possibly other elements), add the attribute
<literal>language=<replaceable>LANGUAGE_OF_CHOICE</replaceable></literal>
to the element, as in the following example:</para>
<programlisting language="xml"><![CDATA[
<programlisting language="xml">
<foo>bar</foo>
<bar>foo</bar>
</programlisting>]]></programlisting>
<para>Several syntax types are supported. The most interesting ones for the
HBase Reference Guide are <literal>java</literal>, <literal>xml</literal>,
<literal>sql</literal>, and <literal>bourne</literal> (for BASH shell
output or Linux command-line examples).</para>
</answer>
</qandaentry>
</qandaset>
</section>
</appendix>

View File

@ -300,25 +300,25 @@
<para> A namespace can be created, removed or altered. Namespace membership is determined
during table creation by specifying a fully-qualified table name of the form:</para>
<programlisting><![CDATA[<table namespace>:<table qualifier>]]></programlisting>
<programlisting language="xml"><![CDATA[<table namespace>:<table qualifier>]]></programlisting>
<example>
<title>Examples</title>
<programlisting>
<programlisting language="bourne">
#Create a namespace
create_namespace 'my_ns'
</programlisting>
<programlisting>
<programlisting language="bourne">
#create my_table in my_ns namespace
create 'my_ns:my_table', 'fam'
</programlisting>
<programlisting>
<programlisting language="bourne">
#drop namespace
drop_namespace 'my_ns'
</programlisting>
<programlisting>
<programlisting language="bourne">
#alter namespace
alter_namespace 'my_ns', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}
</programlisting>
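<para>For reference, the same namespace operations can also be performed from Java. The
    following is a minimal, hedged sketch using the HBaseAdmin API of this era; the
    namespace and table names are illustrative assumptions, and imports are omitted as in
    the surrounding examples.</para>
<programlisting language="java">
// Sketch (assumption): managing a namespace through HBaseAdmin
Configuration conf = HBaseConfiguration.create();
HBaseAdmin admin = new HBaseAdmin(conf);

// create a namespace
admin.createNamespace(NamespaceDescriptor.create("my_ns").build());

// create a table inside that namespace
HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("my_ns:my_table"));
desc.addFamily(new HColumnDescriptor("fam"));
admin.createTable(desc);

// drop the namespace (it must be empty first)
admin.disableTable(TableName.valueOf("my_ns:my_table"));
admin.deleteTable(TableName.valueOf("my_ns:my_table"));
admin.deleteNamespace("my_ns");
admin.close();
</programlisting>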
@ -340,7 +340,7 @@ alter_namespace 'my_ns', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}
<example>
<title>Examples</title>
<programlisting>
<programlisting language="bourne">
#namespace=foo and table qualifier=bar
create 'foo:bar', 'fam'
@ -429,7 +429,7 @@ create 'bar', 'fam'
populated with rows with keys "row1", "row2", "row3", and then another set of rows with
the keys "abc1", "abc2", and "abc3". The following example shows how startRow and stopRow
can be applied to a Scan instance to return the rows beginning with "row".</para>
<programlisting>
<programlisting language="java">
public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR = "attr".getBytes();
...
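// (Hedged sketch only; the guide's own example continues beyond this excerpt. It assumes
//  an open HTable named htable and shows one way to limit a Scan to keys starting with "row".)
Scan scan = new Scan();
scan.addColumn(CF, ATTR);
scan.setStartRow(Bytes.toBytes("row")); // inclusive
scan.setStopRow(Bytes.toBytes("rox"));  // exclusive; sorts just past every key beginning with "row"
ResultScanner rs = htable.getScanner(scan);
for (Result r : rs) {
  // rows "row1", "row2", "row3" are returned; "abc1", "abc2", "abc3" are not
}
rs.close();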
@ -562,7 +562,7 @@ try {
xml:id="default_get_example">
<title>Default Get Example</title>
<para>The following Get will only retrieve the current version of the row</para>
<programlisting>
<programlisting language="java">
public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR = "attr".getBytes();
...
@ -575,7 +575,7 @@ byte[] b = r.getValue(CF, ATTR); // returns current version of value
xml:id="versioned_get_example">
<title>Versioned Get Example</title>
<para>The following Get will return the last 3 versions of the row.</para>
<programlisting>
<programlisting language="java">
public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR = "attr".getBytes();
...
@ -603,7 +603,7 @@ List&lt;KeyValue&gt; kv = r.getColumn(CF, ATTR); // returns all versions of thi
<title>Implicit Version Example</title>
<para>The following Put will be implicitly versioned by HBase with the current
time.</para>
<programlisting>
<programlisting language="java">
public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR = "attr".getBytes();
...
@ -616,7 +616,7 @@ htable.put(put);
xml:id="explicit_version_example">
<title>Explicit Version Example</title>
<para>The following Put has the version timestamp explicitly set.</para>
<programlisting>
<programlisting language="java">
public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR = "attr".getBytes();
...
@ -815,7 +815,7 @@ htable.put(put);
Be sure to use the correct version of the HBase JAR for your system. The backticks
(<literal>`</literal> symbols) cause the shell to execute the sub-commands, setting the
CLASSPATH as part of the command. This example assumes you use a BASH-compatible shell. </para>
<screen>$ <userinput>HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.90.0.jar rowcounter usertable</userinput></screen>
<screen language="bourne">$ <userinput>HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.90.0.jar rowcounter usertable</userinput></screen>
<para>When the command runs, internally, the HBase JAR finds the dependencies it needs, such as
ZooKeeper and Guava, on the passed <envar>HADOOP_CLASSPATH</envar>
and adds the JARs to the MapReduce job configuration. See the source at
@ -826,7 +826,7 @@ htable.put(put);
<screen>java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper</screen>
<para>If this occurs, try modifying the command as follows, so that it uses the HBase JARs
from the <filename>target/</filename> directory within the build environment.</para>
<screen>$ <userinput>HADOOP_CLASSPATH=${HBASE_HOME}/target/hbase-0.90.0-SNAPSHOT.jar:`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/target/hbase-0.90.0-SNAPSHOT.jar rowcounter usertable</userinput></screen>
<screen language="bourne">$ <userinput>HADOOP_CLASSPATH=${HBASE_HOME}/target/hbase-0.90.0-SNAPSHOT.jar:`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/target/hbase-0.90.0-SNAPSHOT.jar rowcounter usertable</userinput></screen>
</note>
<caution>
<title>Notice to MapReduce users of HBase 0.96.1 and above</title>
@ -876,14 +876,14 @@ Exception in thread "main" java.lang.IllegalAccessError: class
<code>HADOOP_CLASSPATH</code> environment variable at job submission time. When
launching jobs that package their dependencies, all three of the following job launching
commands satisfy this requirement:</para>
<screen>
<screen language="bourne">
$ <userinput>HADOOP_CLASSPATH=/path/to/hbase-protocol.jar:/path/to/hbase/conf hadoop jar MyJob.jar MyJobMainClass</userinput>
$ <userinput>HADOOP_CLASSPATH=$(hbase mapredcp):/path/to/hbase/conf hadoop jar MyJob.jar MyJobMainClass</userinput>
$ <userinput>HADOOP_CLASSPATH=$(hbase classpath) hadoop jar MyJob.jar MyJobMainClass</userinput>
</screen>
<para>For jars that do not package their dependencies, the following command structure is
necessary:</para>
<screen>
<screen language="bourne">
$ <userinput>HADOOP_CLASSPATH=$(hbase mapredcp):/etc/hbase/conf hadoop jar MyApp.jar MyJobMainClass -libjars $(hbase mapredcp | tr ':' ',')</userinput> ...
</screen>
<para>See also <link
@ -898,7 +898,7 @@ $ <userinput>HADOOP_CLASSPATH=$(hbase mapredcp):/etc/hbase/conf hadoop jar MyApp
<para>The HBase JAR also serves as a Driver for some bundled mapreduce jobs. To learn about
the bundled MapReduce jobs, run the following command.</para>
<screen>$ <userinput>${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.90.0-SNAPSHOT.jar</userinput>
<screen language="bourne">$ <userinput>${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.90.0-SNAPSHOT.jar</userinput>
<computeroutput>An example program must be given as the first argument.
Valid program names are:
copytable: Export a table from local cluster to peer cluster
@ -910,7 +910,7 @@ Valid program names are:
</screen>
<para>Each of the valid program names is a bundled MapReduce job. To run one of the jobs,
model your command after the following example.</para>
<screen>$ <userinput>${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.90.0-SNAPSHOT.jar rowcounter myTable</userinput></screen>
<screen language="bourne">$ <userinput>${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.90.0-SNAPSHOT.jar rowcounter myTable</userinput></screen>
</section>
<section>
@ -972,7 +972,7 @@ Valid program names are:
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html">RowCounter</link>
MapReduce job uses <code>TableInputFormat</code> and does a count of all rows in the specified
table. To run it, use the following command: </para>
<screen>$ <userinput>./bin/hadoop jar hbase-X.X.X.jar</userinput></screen>
<screen language="bourne">$ <userinput>./bin/hadoop jar hbase-X.X.X.jar</userinput></screen>
<para>This will
invoke the HBase MapReduce Driver class. Select <literal>rowcounter</literal> from the choice of jobs
offered. This prints rowcounter usage advice to standard output. Specify the tablename,
@ -1011,7 +1011,7 @@ Valid program names are:
<para>The following is an example of using HBase as a MapReduce source in a read-only manner.
Specifically, there is a Mapper instance but no Reducer, and nothing is being emitted from
the Mapper. The job would be defined as follows...</para>
<programlisting>
<programlisting language="java">
Configuration config = HBaseConfiguration.create();
Job job = new Job(config, "ExampleRead");
job.setJarByClass(MyReadJob.class); // class that contains mapper
@ -1038,7 +1038,7 @@ if (!b) {
</programlisting>
<para>...and the mapper instance would extend <link
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableMapper.html">TableMapper</link>...</para>
<programlisting>
<programlisting language="java">
public static class MyMapper extends TableMapper&lt;Text, Text&gt; {
public void map(ImmutableBytesWritable row, Result value, Context context) throws InterruptedException, IOException {
@ -1052,7 +1052,7 @@ public static class MyMapper extends TableMapper&lt;Text, Text&gt; {
<title>HBase MapReduce Read/Write Example</title>
<para>The following is an example of using HBase both as a source and as a sink with
MapReduce. This example will simply copy data from one table to another.</para>
<programlisting>
<programlisting language="java">
Configuration config = HBaseConfiguration.create();
Job job = new Job(config,"ExampleReadWrite");
job.setJarByClass(MyReadWriteJob.class); // class that contains mapper
@ -1091,7 +1091,7 @@ if (!b) {
<para>The following is the example mapper, which will create a <classname>Put</classname>
matching the input <classname>Result</classname> and emit it. Note: this is what the
CopyTable utility does. </para>
<programlisting>
<programlisting language="java">
public static class MyMapper extends TableMapper&lt;ImmutableBytesWritable, Put&gt; {
public void map(ImmutableBytesWritable row, Result value, Context context) throws IOException, InterruptedException {
@ -1125,7 +1125,7 @@ public static class MyMapper extends TableMapper&lt;ImmutableBytesWritable, Put&
<para>The following example uses HBase as a MapReduce source and sink with a summarization
step. This example will count the number of distinct instances of a value in a table and
write those summarized counts in another table.
<programlisting>
<programlisting language="java">
Configuration config = HBaseConfiguration.create();
Job job = new Job(config,"ExampleSummary");
job.setJarByClass(MySummaryJob.class); // class that contains mapper and reducer
@ -1156,7 +1156,7 @@ if (!b) {
In this example mapper a column with a String-value is chosen as the value to summarize
upon. This value is used as the key to emit from the mapper, and an
<classname>IntWritable</classname> represents an instance counter.
<programlisting>
<programlisting language="java">
public static class MyMapper extends TableMapper&lt;Text, IntWritable&gt; {
public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR1 = "attr1".getBytes();
@ -1174,7 +1174,7 @@ public static class MyMapper extends TableMapper&lt;Text, IntWritable&gt; {
</programlisting>
In the reducer, the "ones" are counted (just like any other MR example that does this),
and a <classname>Put</classname> is then emitted.
<programlisting>
<programlisting language="java">
public static class MyTableReducer extends TableReducer&lt;Text, IntWritable, ImmutableBytesWritable&gt; {
public static final byte[] CF = "cf".getBytes();
public static final byte[] COUNT = "count".getBytes();
@ -1199,7 +1199,7 @@ public static class MyTableReducer extends TableReducer&lt;Text, IntWritable, Im
<para>This is very similar to the summary example above, with the exception that this example uses
HBase as a MapReduce source but HDFS as the sink. The differences are in the job setup and
in the reducer. The mapper remains the same. </para>
<programlisting>
<programlisting language="java">
Configuration config = HBaseConfiguration.create();
Job job = new Job(config,"ExampleSummaryToFile");
job.setJarByClass(MySummaryFileJob.class); // class that contains mapper and reducer
@ -1228,7 +1228,7 @@ if (!b) {
<para>As stated above, the previous Mapper can run unchanged with this example. As for the
Reducer, it is a "generic" Reducer instead of extending TableReducer and emitting
Puts.</para>
<programlisting>
<programlisting language="java">
public static class MyReducer extends Reducer&lt;Text, IntWritable, Text, IntWritable&gt; {
public void reduce(Text key, Iterable&lt;IntWritable&gt; values, Context context) throws IOException, InterruptedException {
@ -1268,7 +1268,7 @@ if (!b) {
reducers. Neither is right nor wrong; it depends on your use case. Recognize that the more
reducers that are assigned to the job, the more simultaneous connections to the RDBMS will
be created - this will scale, but only to a point. </para>
<programlisting>
<programlisting language="java">
public static class MyRdbmsReducer extends Reducer&lt;Text, IntWritable, Text, IntWritable&gt; {
private Connection c = null;
@ -1299,7 +1299,7 @@ if (!b) {
<para>Although the framework currently allows one HBase table as input to a MapReduce job,
other HBase tables can be accessed as lookup tables, etc., in a MapReduce job by creating
an HTable instance in the setup method of the Mapper.
<programlisting>public class MyMapper extends TableMapper&lt;Text, LongWritable&gt; {
<programlisting language="java">public class MyMapper extends TableMapper&lt;Text, LongWritable&gt; {
private HTable myOtherTable;
public void setup(Context context) {
@ -1519,11 +1519,11 @@ if (!b) {
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HBaseConfiguration">HBaseConfiguration</link>
instance. This will ensure sharing of ZooKeeper and socket instances with the RegionServers,
which is usually what you want. For example, this is preferred:</para>
<programlisting>HBaseConfiguration conf = HBaseConfiguration.create();
<programlisting language="java">HBaseConfiguration conf = HBaseConfiguration.create();
HTable table1 = new HTable(conf, "myTable");
HTable table2 = new HTable(conf, "myTable");</programlisting>
<para>as opposed to this:</para>
<programlisting>HBaseConfiguration conf1 = HBaseConfiguration.create();
<programlisting language="java">HBaseConfiguration conf1 = HBaseConfiguration.create();
HTable table1 = new HTable(conf1, "myTable");
HBaseConfiguration conf2 = HBaseConfiguration.create();
HTable table2 = new HTable(conf2, "myTable");</programlisting>
@ -1537,7 +1537,7 @@ HTable table2 = new HTable(conf2, "myTable");</programlisting>
the following example:</para>
<example>
<title>Pre-Creating a <code>HConnection</code></title>
<programlisting>// Create a connection to the cluster.
<programlisting language="java">// Create a connection to the cluster.
HConnection connection = HConnectionManager.createConnection(Configuration);
HTableInterface table = connection.getTable("myTable");
// use table as needed, the table returned is lightweight
@ -1594,7 +1594,7 @@ connection.close();</programlisting>
represents a list of Filters with a relationship of <code>FilterList.Operator.MUST_PASS_ALL</code> or
<code>FilterList.Operator.MUST_PASS_ONE</code> between the Filters. The following example shows an 'or' between two
Filters (checking for either 'my value' or 'my other value' on the same attribute).</para>
<programlisting>
<programlisting language="java">
FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ONE);
SingleColumnValueFilter filter1 = new SingleColumnValueFilter(
cf,
@ -1627,7 +1627,7 @@ scan.setFilter(list);
</code>), inequality (<code>CompareOp.NOT_EQUAL</code>), or ranges (e.g.,
<code>CompareOp.GREATER</code>). The following is an example of testing equivalence of a
column to a String value "my value"...</para>
<programlisting>
<programlisting language="java">
SingleColumnValueFilter filter = new SingleColumnValueFilter(
cf,
column,
@ -1650,7 +1650,7 @@ scan.setFilter(filter);
<para><link
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/RegexStringComparator.html">RegexStringComparator</link>
supports regular expressions for value comparisons.</para>
<programlisting>
<programlisting language="java">
RegexStringComparator comp = new RegexStringComparator("my."); // any value that starts with 'my'
SingleColumnValueFilter filter = new SingleColumnValueFilter(
cf,
@ -1671,7 +1671,7 @@ scan.setFilter(filter);
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SubstringComparator.html">SubstringComparator</link>
can be used to determine if a given substring exists in a value. The comparison is
case-insensitive. </para>
<programlisting>
<programlisting language="java">
SubstringComparator comp = new SubstringComparator("y val"); // looking for 'my value'
SingleColumnValueFilter filter = new SingleColumnValueFilter(
cf,
@ -1728,7 +1728,7 @@ scan.setFilter(filter);
<para>Note: The same column qualifier can be used in different column families. This
filter returns all matching columns. </para>
<para>Example: Find all columns in a row and family that start with "abc"</para>
<programlisting>
<programlisting language="java">
HTableInterface t = ...;
byte[] row = ...;
byte[] family = ...;
@ -1758,7 +1758,7 @@ rs.close();
prefixes. It can be used to efficiently get discontinuous sets of columns from very wide
rows. </para>
<para>Example: Find all columns in a row and family that start with "abc" or "xyz"</para>
<programlisting>
<programlisting language="java">
HTableInterface t = ...;
byte[] row = ...;
byte[] family = ...;
@ -1791,7 +1791,7 @@ rs.close();
filter returns all matching columns. </para>
<para>Example: Find all columns in a row and family between "bbbb" (inclusive) and "bbdd"
(inclusive)</para>
<programlisting>
<programlisting language="java">
HTableInterface t = ...;
byte[] row = ...;
byte[] family = ...;
@ -2018,7 +2018,7 @@ rs.close();
was accessed. Catalog tables are configured like this. This group is the last one
considered during evictions.</para>
<para>To mark a column family as in-memory, call
<programlisting>HColumnDescriptor.setInMemory(true);</programlisting> if creating a table from java,
<programlisting language="java">HColumnDescriptor.setInMemory(true);</programlisting> if creating a table from java,
or set <command>IN_MEMORY => true</command> when creating or altering a table in
the shell: e.g. <programlisting>hbase(main):003:0> create 't', {NAME => 'f', IN_MEMORY => 'true'}</programlisting></para>
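<para>A slightly fuller Java sketch (an illustrative assumption, mirroring the shell example
    above) that creates a table with an in-memory column family might look like this:</para>
<programlisting language="java">
// Hypothetical sketch: create table 't' with in-memory family 'f' through the admin API
HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("t"));
HColumnDescriptor fam = new HColumnDescriptor("f");
fam.setInMemory(true);
desc.addFamily(fam);
admin.createTable(desc);
admin.close();
</programlisting>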
</listitem>
@ -2218,7 +2218,7 @@ rs.close();
<step>
<para>Next, add the following configuration to the RegionServer's
<filename>hbase-site.xml</filename>.</para>
<programlisting>
<programlisting language="xml">
<![CDATA[<property>
<name>hbase.bucketcache.ioengine</name>
<value>offheap</value>
@ -2461,7 +2461,7 @@ rs.close();
ZooKeeper splitlog node (<filename>/hbase/splitlog</filename>) as tasks. You can
view the contents of the splitlog by issuing the following
<command>zkcli</command> command. Example output is shown.</para>
<screen>ls /hbase/splitlog
<screen language="bourne">ls /hbase/splitlog
[hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2F.logs%2Fhost8.sample.com%2C57020%2C1340474893275-splitting%2Fhost8.sample.com%253A57020.1340474893900,
hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2F.logs%2Fhost3.sample.com%2C57020%2C1340474893299-splitting%2Fhost3.sample.com%253A57020.1340474893931,
hdfs%3A%2F%2Fhost2.sample.com%3A56020%2Fhbase%2F.logs%2Fhost4.sample.com%2C57020%2C1340474893287-splitting%2Fhost4.sample.com%253A57020.1340474893946]
@ -2846,7 +2846,7 @@ ctime = Sat Jun 23 11:13:40 PDT 2012
Typically a custom split policy should extend HBase's default split policy: <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/ConstantSizeRegionSplitPolicy.html">ConstantSizeRegionSplitPolicy</link>.
</para>
<para>The policy can be set globally through the HBaseConfiguration used, or on a per-table basis:
<programlisting>
<programlisting language="java">
HTableDescriptor myHtd = ...;
myHtd.setValue(HTableDescriptor.SPLIT_POLICY, MyCustomSplitPolicy.class.getName());
</programlisting>
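<para>What such a custom policy might look like is sketched below. This is an assumption for
    illustration only, not an example from the guide: the class simply extends
    <classname>ConstantSizeRegionSplitPolicy</classname> and overrides
    <code>shouldSplit()</code>.</para>
<programlisting language="java">
// Hypothetical sketch of a custom split policy that never splits regions
public class MyCustomSplitPolicy extends ConstantSizeRegionSplitPolicy {
  @Override
  protected boolean shouldSplit() {
    return false; // rely on manual or pre-splitting instead
  }
}
</programlisting>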
@ -2867,7 +2867,7 @@ myHtd.setValue(HTableDescriptor.SPLIT_POLICY, MyCustomSplitPolicy.class.getName(
opens the merged region on the RegionServer, and finally reports the merge to the Master.
</para>
<para>An example of region merges in the hbase shell
<programlisting>$ hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME'
<programlisting language="bourne">$ hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME'
hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME', true
</programlisting>
It is an asynchronous operation, and the call returns immediately without waiting for the merge to complete.
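<para>A merge can also be requested through the Java client. The following is a hedged
    sketch using <classname>HBaseAdmin</classname>; the encoded region names are
    placeholders.</para>
<programlisting language="java">
// Sketch (assumption): merging two regions by their encoded names
HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
admin.mergeRegions(Bytes.toBytes("ENCODED_REGIONNAME_A"),
    Bytes.toBytes("ENCODED_REGIONNAME_B"),
    false); // pass true to force merging regions that are not adjacent
admin.close();
</programlisting>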
@ -2969,10 +2969,10 @@ myHtd.setValue(HTableDescriptor.SPLIT_POLICY, MyCustomSplitPolicy.class.getName(
<para>To view a textualized version of hfile content, you can use
the <classname>org.apache.hadoop.hbase.io.hfile.HFile
</classname>tool. Type the following to see usage:<programlisting><code>$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile </code> </programlisting>For
</classname>tool. Type the following to see usage:<programlisting language="bourne"><code>$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile </code> </programlisting>For
example, to view the content of the file
<filename>hdfs://10.81.47.41:8020/hbase/TEST/1418428042/DSMP/4759508618286845475</filename>,
type the following:<programlisting> <code>$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -v -f hdfs://10.81.47.41:8020/hbase/TEST/1418428042/DSMP/4759508618286845475 </code> </programlisting>If
type the following:<programlisting language="bourne"> <code>$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -v -f hdfs://10.81.47.41:8020/hbase/TEST/1418428042/DSMP/4759508618286845475 </code> </programlisting>If
you leave off the option -v, you see just a summary of the hfile. See
usage for other things to do with the <classname>HFile</classname>
tool.</para>
@ -3818,7 +3818,7 @@ myHtd.setValue(HTableDescriptor.SPLIT_POLICY, MyCustomSplitPolicy.class.getName(
<step>
<para>Run one of the following commands in the HBase shell. Replace the table name
<literal>orders_table</literal> with the name of your table.</para>
<screen>
<screen language="sql">
<userinput>alter 'orders_table', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.StripeStoreEngine', 'hbase.hstore.blockingStoreFiles' => '100'}</userinput>
<userinput>alter 'orders_table', {NAME => 'blobs_cf', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.StripeStoreEngine', 'hbase.hstore.blockingStoreFiles' => '100'}}</userinput>
<userinput>create 'orders_table', 'blobs_cf', CONFIGURATION => {'hbase.hstore.engine.class' => 'org.apache.hadoop.hbase.regionserver.StripeStoreEngine', 'hbase.hstore.blockingStoreFiles' => '100'}</userinput>
@ -3842,7 +3842,7 @@ myHtd.setValue(HTableDescriptor.SPLIT_POLICY, MyCustomSplitPolicy.class.getName(
<para>Set the <varname>hbase.hstore.engine.class</varname> option to either nil or
<literal>org.apache.hadoop.hbase.regionserver.DefaultStoreEngine</literal>.
Either option has the same effect.</para>
<screen>
<screen language="sql">
<userinput>alter 'orders_table', CONFIGURATION => {'hbase.hstore.engine.class' => ''}</userinput>
</screen>
</step>
@ -3861,7 +3861,7 @@ myHtd.setValue(HTableDescriptor.SPLIT_POLICY, MyCustomSplitPolicy.class.getName(
column family, after disabling the table. If you use the HBase shell, the general
command pattern is as follows:</para>
<programlisting>
<programlisting language="sql">
alter 'orders_table', CONFIGURATION => {'key' => 'value', ..., 'key' => 'value'}}
</programlisting>
<section
@ -4061,7 +4061,7 @@ alter 'orders_table', CONFIGURATION => {'key' => 'value', ..., 'key' => 'value'}
where <code>importtsv</code> or your MapReduce job put its results, and
the table name to import into. For example:
</para>
<screen>$ hadoop jar hbase-VERSION.jar completebulkload [-c /path/to/hbase/config/hbase-site.xml] /user/todd/myoutput mytable</screen>
<screen language="bourne">$ hadoop jar hbase-VERSION.jar completebulkload [-c /path/to/hbase/config/hbase-site.xml] /user/todd/myoutput mytable</screen>
<para>
The <code>-c config-file</code> option can be used to specify a file
containing the appropriate hbase parameters (e.g., hbase-site.xml) if
@ -4143,7 +4143,7 @@ alter 'orders_table', CONFIGURATION => {'key' => 'value', ..., 'key' => 'value'}
<title>Timeline Consistency </title>
<para>
With this feature, HBase introduces a Consistency definition, which can be provided per read operation (get or scan).
<programlisting>
<programlisting language="java">
public enum Consistency {
STRONG,
TIMELINE
@ -4254,7 +4254,7 @@ public enum Consistency {
</para>
<section>
<title>Server side properties</title>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.regionserver.storefile.refresh.period</name>
<value>0</value>
@ -4274,7 +4274,7 @@ public enum Consistency {
<title>Client side properties</title>
<para> Ensure that you set the following for all clients (and servers) that will use region
replicas. </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.ipc.client.allowsInterrupt</name>
<value>true</value>
@ -4325,7 +4325,7 @@ flush 't1'
</section>
<section><title>Java</title>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
HTableDescriptor htd = new HTableDescriptor(TableName.valueOf("test_table"));
htd.setRegionReplication(2);
...
@ -4362,7 +4362,7 @@ hbase(main):001:0> get 't1','r6', {CONSISTENCY => "TIMELINE"}
]]></programlisting>
<para> You can simulate a region server pausing or becoming unavailable and do a read from
the secondary replica: </para>
<programlisting><![CDATA[
<programlisting language="bourne"><![CDATA[
$ kill -STOP <pid of primary region server>
hbase(main):001:0> get 't1','r6', {CONSISTENCY => "TIMELINE"}
@ -4376,14 +4376,14 @@ hbase> scan 't1', {CONSISTENCY => 'TIMELINE'}
<title>Java</title>
<para>You can set the consistency for Gets and Scans and do requests as
follows.</para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
Get get = new Get(row);
get.setConsistency(Consistency.TIMELINE);
...
Result result = table.get(get);
]]></programlisting>
<para>You can also pass multiple gets: </para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
Get get1 = new Get(row);
get1.setConsistency(Consistency.TIMELINE);
...
@ -4393,7 +4393,7 @@ gets.add(get1);
Result[] results = table.get(gets);
]]></programlisting>
<para>And Scans: </para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
Scan scan = new Scan();
scan.setConsistency(Consistency.TIMELINE);
...
@ -4402,7 +4402,7 @@ ResultScanner scanner = table.getScanner(scan);
<para>You can inspect whether the results are coming from the primary region or not by calling
the Result.isStale() method: </para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
Result result = table.get(get);
if (result.isStale()) {
...
@ -4649,7 +4649,7 @@ identifying mode and a multi-phase read-write repair mode.
<section>
<title>Running hbck to identify inconsistencies</title>
<para>To check whether your HBase cluster has corruptions, run hbck against your HBase cluster:</para>
<programlisting>
<programlisting language="bourne">
$ ./bin/hbase hbck
</programlisting>
<para>
@ -4661,13 +4661,13 @@ A run of hbck will report a list of inconsistencies along with a brief descripti
tables affected. Using the <code>-details</code> option will report more details, including a representative
listing of all the splits present in all the tables.
</para>
<programlisting>
<programlisting language="bourne">
$ ./bin/hbase hbck -details
</programlisting>
<para>If you just want to know if some tables are corrupted, you can limit hbck to identify inconsistencies
in only specific tables. For example, the following command would only attempt to check tables
TableFoo and TableBar. The benefit is that hbck will run in less time.</para>
<programlisting>
<programlisting language="bourne">
$ ./bin/hbase hbck TableFoo TableBar
</programlisting>
</section>
@ -4726,12 +4726,12 @@ assigned or multiply assigned regions.</para>
</itemizedlist>
To fix deployment and assignment problems you can run this command:
</para>
<programlisting>
<programlisting language="bourne">
$ ./bin/hbase hbck -fixAssignments
</programlisting>
<para>To fix deployment and assignment problems as well as repairing incorrect meta rows you can
run this command:</para>
<programlisting>
<programlisting language="bourne">
$ ./bin/hbase hbck -fixAssignments -fixMeta
</programlisting>
<para>There are a few classes of table integrity problems that are low risk repairs. The first two are
@ -4743,12 +4743,12 @@ The third low-risk class is hdfs region holes. This can be repaired by using the
If holes are detected you can use -fixHdfsHoles and should include -fixMeta and -fixAssignments to make the new region consistent.</para>
</listitem>
</itemizedlist>
<programlisting>
<programlisting language="bourne">
$ ./bin/hbase hbck -fixAssignments -fixMeta -fixHdfsHoles
</programlisting>
<para>Since this is a common operation, we've added the <code>-repairHoles</code> flag that is equivalent to the
previous command:</para>
<programlisting>
<programlisting language="bourne">
$ ./bin/hbase hbck -repairHoles
</programlisting>
<para>If inconsistencies still remain after these steps, you most likely have table integrity problems
@ -4800,14 +4800,14 @@ integrity options.</para>
</itemizedlist>
<para>Finally, there are safeguards to limit repairs to only specific tables. For example, the following
command would only attempt to check and repair tables TableFoo and TableBar.</para>
<screen>
<screen language="bourne">
$ ./bin/hbase hbck -repair TableFoo TableBar
</screen>
<section><title>Special cases: Meta is not properly assigned</title>
<para>There are a few special cases that hbck can handle as well.
Sometimes the meta table's only region is inconsistently assigned or deployed. In this case
there is a special <code>-fixMetaOnly</code> option that can try to fix meta assignments.</para>
<screen>
<screen language="bourne">
$ ./bin/hbase hbck -fixMetaOnly -fixAssignments
</screen>
</section>
@ -4825,7 +4825,7 @@ directory, loads as much information from region metadata files (.regioninfo fil
from the file system. If the region metadata has proper table integrity, it sidelines the original root
and meta table directories, and builds new ones with pointers to the region directories and their
data.</para>
<screen>
<screen language="bourne">
$ ./bin/hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
</screen>
<para>NOTE: This tool is not as clever as uberhbck but can be used to bootstrap repairs that uberhbck
@ -5085,7 +5085,7 @@ This option should not normally be used, and it is not in <code>-fixAll</code>.
linkend="hbase.native.platform" />), you can make a symbolic link from HBase to the native Hadoop
libraries. This assumes the two software installs are colocated. For example, if my
'platform' is Linux-amd64-64:
<programlisting>$ cd $HBASE_HOME
<programlisting language="bourne">$ cd $HBASE_HOME
$ mkdir lib/native
$ ln -s $HADOOP_HOME/lib/native lib/native/Linux-amd64-64</programlisting>
Use the compression tool to check that LZ4 is installed on all nodes. Start up (or restart)
@ -5128,7 +5128,7 @@ hbase(main):003:0> <userinput>alter 'TestTable', {NAME => 'info', COMPRESSION =>
<title>CompressionTest</title>
<para>You can use the CompressionTest tool to verify that your compressor is available to
HBase:</para>
<screen>
<screen language="bourne">
$ hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://<replaceable>host/path/to/hbase</replaceable> snappy
</screen>
</section>
@ -5192,7 +5192,7 @@ DESCRIPTION ENABLED
parameter, usage advice is printed for each option.</para>
<example>
<title><command>LoadTestTool</command> Usage</title>
<screen><![CDATA[
<screen language="bourne"><![CDATA[
$ bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -h
usage: bin/hbase org.apache.hadoop.hbase.util.LoadTestTool <options>
Options:
@ -5248,7 +5248,7 @@ Options:
</example>
<example>
<title>Example Usage of LoadTestTool</title>
<screen>
<screen language="bourne">
$ hbase org.apache.hadoop.hbase.util.LoadTestTool -write 1:10:100 -num_keys 1000000
-read 100:30 -num_tables 1 -data_block_encoding NONE -tn load_test_tool_NONE
</screen>

View File

@ -145,7 +145,7 @@
some unusual anomalies, namely interface errors, overruns, framing errors. While not
unheard of, these kinds of errors are exceedingly rare on modern hardware which is
operating as it should: </para>
<screen>
<screen language="bourne">
$ /sbin/ifconfig bond0
bond0 Link encap:Ethernet HWaddr 00:00:00:00:00:00
inet addr:10.x.x.x Bcast:10.x.x.255 Mask:255.255.255.0
@ -160,7 +160,7 @@ RX bytes:2416328868676 (2.4 TB) TX bytes:3464991094001 (3.4 TB)
running an ICMP ping from an external host and observing round-trip-time in excess of
700ms, and by running <code>ethtool(8)</code> on the members of the bond interface and
discovering that the active interface was operating at 100Mb/s, full duplex. </para>
<screen>
<screen language="bourne">
$ sudo ethtool eth0
Settings for eth0:
Supported ports: [ TP ]

View File

@ -520,16 +520,16 @@ Index: pom.xml
<listitem>
<para>Type the following commands:</para>
<para>
<programlisting><![CDATA[$ protoc -Isrc/main/protobuf --java_out=src/main/java src/main/protobuf/hbase.proto]]></programlisting>
<programlisting language="bourne"><![CDATA[$ protoc -Isrc/main/protobuf --java_out=src/main/java src/main/protobuf/hbase.proto]]></programlisting>
</para>
<para>
<programlisting><![CDATA[$ protoc -Isrc/main/protobuf --java_out=src/main/java src/main/protobuf/ErrorHandling.proto]]></programlisting>
<programlisting language="bourne"><![CDATA[$ protoc -Isrc/main/protobuf --java_out=src/main/java src/main/protobuf/ErrorHandling.proto]]></programlisting>
</para>
</listitem>
</itemizedlist>
<para> Build against the hadoop 2 profile by running something like the
following command: </para>
<screen>$ mvn clean install assembly:single -Dhadoop.profile=2.0 -DskipTests</screen>
<screen language="bourne">$ mvn clean install assembly:single -Dhadoop.profile=2.0 -DskipTests</screen>
</footnote></entry>
<entry>S</entry>
<entry>S</entry>
@ -615,7 +615,7 @@ Index: pom.xml
<filename>hbase-site.xml</filename> -- and on the server side in
<filename>hdfs-site.xml</filename> (The sync facility HBase needs is a subset of the
append code path).</para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>dfs.support.append</name>
<value>true</value>
@ -644,7 +644,7 @@ Index: pom.xml
Hadoop's <filename>conf/hdfs-site.xml</filename>, setting the
<varname>dfs.datanode.max.transfer.threads</varname> value to at least the following:
</para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>dfs.datanode.max.transfer.threads</name>
<value>4096</value>
@ -779,7 +779,7 @@ Index: pom.xml
configuration parameters. Most HBase configuration directives have default values, which
are used unless the value is overridden in the <filename>hbase-site.xml</filename>. See <xref
linkend="config.files" /> for more information.</para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<configuration>
<property>
<name>hbase.rootdir</name>
@ -891,7 +891,7 @@ node-c.example.com
finally disable and drop your tables.</para>
<para>To stop HBase after exiting the HBase shell, enter</para>
<screen>$ ./bin/stop-hbase.sh
<screen language="bourne">$ ./bin/stop-hbase.sh
stopping hbase...............</screen>
<para>Shutdown can take a moment to complete. It can take longer if your cluster comprises
many machines. If you are running a distributed operation, be sure to wait until HBase
@ -1063,7 +1063,7 @@ slf4j-log4j (slf4j-log4j12-1.5.8.jar)
zookeeper (zookeeper-3.4.2.jar)</programlisting>
</para>
<para> An example basic <filename>hbase-site.xml</filename> for client only might look as
follows: <programlisting><![CDATA[
follows: <programlisting language="xml"><![CDATA[
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
@ -1090,7 +1090,7 @@ zookeeper (zookeeper-3.4.2.jar)</programlisting>
<filename>hbase.X.X.X.jar</filename>). It is also possible to specify configuration
directly without having to read from a <filename>hbase-site.xml</filename>. For example,
to set the ZooKeeper ensemble for the cluster programmatically do as follows:
<programlisting>Configuration config = HBaseConfiguration.create();
<programlisting language="java">Configuration config = HBaseConfiguration.create();
config.set("hbase.zookeeper.quorum", "localhost"); // Here we are running zookeeper locally</programlisting>
If multiple ZooKeeper instances make up your ZooKeeper ensemble, they may be specified in
a comma-separated list (just as in the <filename>hbase-site.xml</filename> file). This
@ -1126,7 +1126,7 @@ config.set("hbase.zookeeper.quorum", "localhost"); // Here we are running zooke
xml:id="hbase_site">
<title><filename>hbase-site.xml</filename></title>
<programlisting>
<programlisting language="bourne">
<![CDATA[
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
@ -1140,7 +1140,7 @@ config.set("hbase.zookeeper.quorum", "localhost"); // Here we are running zooke
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/export/zookeeper</value>
<description>Property from ZooKeeper's config zoo.cfg.
<description>Property from ZooKeeper config zoo.cfg.
The directory where the snapshot is stored.
</description>
</property>
@ -1191,7 +1191,7 @@ example9
<filename>hbase-env.sh</filename> file. Here we are setting the HBase heap to be 4G
instead of the default 1G.</para>
<screen>
<screen language="bourne">
$ git diff hbase-env.sh
diff --git a/conf/hbase-env.sh b/conf/hbase-env.sh
@ -1476,7 +1476,7 @@ index e70ebc6..96f8c27 100644
running on a late-version HDFS so you have the fixes he refers to and himself adds to HDFS that help HBase MTTR
(e.g. HDFS-3703, HDFS-3712, and HDFS-4791 -- hadoop 2 for sure has them and late hadoop 1 has some).
Set the following in the RegionServer.</para>
<programlisting>
<programlisting language="xml">
<![CDATA[<property>
<property>
<name>hbase.lease.recovery.dfs.timeout</name>
@ -1493,7 +1493,7 @@ index e70ebc6..96f8c27 100644
<para>And on the namenode/datanode side, set the following to enable 'staleness' introduced
in HDFS-3703, HDFS-3912. </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>dfs.client.socket-timeout</name>
<value>10000</value>
@ -1550,7 +1550,7 @@ index e70ebc6..96f8c27 100644
<para>As an alternative, you can use the coprocessor-based JMX implementation provided
by HBase. To enable it in 0.99 or above, add the following property in
<filename>hbase-site.xml</filename>:
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.coprocessor.regionserver.classes</name>
<value>org.apache.hadoop.hbase.JMXListener</value>
@ -1566,7 +1566,7 @@ index e70ebc6..96f8c27 100644
By default, JMX listens on TCP port 10102. You can further configure the port
using the properties below:
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>regionserver.rmi.registry.port</name>
<value>61130</value>
@ -1584,7 +1584,7 @@ index e70ebc6..96f8c27 100644
<para>By default, password authentication and SSL communication are disabled.
To enable password authentication, you need to update <filename>hbase-env.sh</filename>
as shown below:
<screen>
<screen language="bourne">
export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.authenticate=true \
-Dcom.sun.management.jmxremote.password.file=your_password_file \
-Dcom.sun.management.jmxremote.access.file=your_access_file"
@ -1596,7 +1596,7 @@ export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE "
</para>
<para>To enable SSL communication with password authentication, follow the steps below:
<screen>
<screen language="bourne">
#1. generate a key pair, stored in myKeyStore
keytool -genkey -alias jconsole -keystore myKeyStore
@ -1607,10 +1607,10 @@ keytool -export -alias jconsole -keystore myKeyStore -file jconsole.cert
keytool -import -alias jconsole -keystore jconsoleKeyStore -file jconsole.cert
</screen>
Then update <filename>hbase-env.sh</filename> as below:
<screen>
<screen language="bourne">
export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=true \
-Djavax.net.ssl.keyStore=/home/tianq/myKeyStore \
-Djavax.net.ssl.keyStorePassword=your_password_in_step_#1 \
-Djavax.net.ssl.keyStorePassword=your_password_in_step_1 \
-Dcom.sun.management.jmxremote.authenticate=true \
-Dcom.sun.management.jmxremote.password.file=your_password_file \
-Dcom.sun.management.jmxremote.access.file=your_access_file"
@ -1620,13 +1620,13 @@ export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE "
</screen>
Finally, start jconsole on the client using the key store:
<screen>
<screen language="bourne">
jconsole -J-Djavax.net.ssl.trustStore=/home/tianq/jconsoleKeyStore
</screen>
</para>
<para>NOTE: For HBase 0.98, to enable the HBase JMX implementation on the Master, you also
need to add the following property in <filename>hbase-site.xml</filename>:
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.coprocessor.master.classes</name>
<value>org.apache.hadoop.hbase.JMXListener</value>

View File

@ -265,7 +265,7 @@
<example>
<title>Example RegionObserver Configuration</title>
<para>In this example, one RegionObserver is configured for all the HBase tables.</para>
<screen><![CDATA[
<screen language="xml"><![CDATA[
<property>
<name>hbase.coprocessor.region.classes</name>
<value>org.apache.hadoop.hbase.coprocessor.AggregateImplementation</value>

View File

@ -22,6 +22,7 @@
*/
-->
<xsl:import href="urn:docbkx:stylesheet"/>
<xsl:import href="urn:docbkx:stylesheet/highlight.xsl"/>
<xsl:output method="html" encoding="UTF-8" indent="no"/>
<xsl:template name="user.header.content">

View File

@ -90,9 +90,9 @@
<section xml:id="eclipse.commandline">
<title>Import into eclipse with the command line</title>
<para>For those not inclined to use m2eclipse, you can generate the Eclipse files from the command line. First, run (you should only have to do this once):
<programlisting>mvn clean install -DskipTests</programlisting>
<programlisting language="bourne">mvn clean install -DskipTests</programlisting>
and then close Eclipse and execute...
<programlisting>mvn eclipse:eclipse</programlisting>
<programlisting language="bourne">mvn eclipse:eclipse</programlisting>
... from your local HBase project directory in your workspace to generate some new <filename>.project</filename>
and <filename>.classpath</filename> files. Then reopen Eclipse, or refresh your Eclipse project (F5), and import
the .project file in the HBase directory to a workspace.
@ -136,11 +136,11 @@ Access restriction: The method getLong(Object, long) from the type Unsafe is not
<title>Basic Compile</title>
<para>Thanks to maven, building HBase is pretty easy. You can read about the various maven commands in <xref linkend="maven.build.commands"/>,
but the simplest command to compile HBase from its java source code is:
<programlisting>
<programlisting language="bourne">
mvn package -DskipTests
</programlisting>
Or, to clean up before compiling:
<programlisting>
<programlisting language="bourne">
mvn clean package -DskipTests
</programlisting>
With Eclipse set up as explained above in <xref linkend="eclipse"/>, you can also simply use the build command in Eclipse.
@ -152,14 +152,14 @@ mvn clean package -DskipTests
<para>
The protobuf files are located in <filename>hbase-protocol/src/main/protobuf</filename>.
For the change to be effective, you will need to regenerate the classes. You can use the maven profile compile-protobuf to do this.
<programlisting>
<programlisting language="bourne">
mvn compile -Dcompile-protobuf
or
mvn compile -Pcompile-protobuf
</programlisting>
You may also want to define protoc.path for the protoc binary
<programlisting>
<programlisting language="bourne">
mvn compile -Dcompile-protobuf -Dprotoc.path=/opt/local/bin/protoc
</programlisting> Read the <filename>hbase-protocol/README.txt</filename> for more details.
</para>
@ -212,7 +212,7 @@ mvn compile -Dcompile-protobuf -Dprotoc.path=/opt/local/bin/protoc
build do this for you, you need to make sure you have a properly configured
<filename>settings.xml</filename> in your local repository under <filename>.m2</filename>.
Here is my <filename>~/.m2/settings.xml</filename>.
<programlisting><![CDATA[<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
<programlisting language="xml"><![CDATA[<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
http://maven.apache.org/xsd/settings-1.0.0.xsd">
@ -287,7 +287,7 @@ under the respective release documentation folders.
publish a SNAPSHOT, you must keep the <emphasis>-SNAPSHOT</emphasis> suffix on the hbase version.
The <link xlink:href="http://mojo.codehaus.org/versions-maven-plugin/">Versions Maven Plugin</link> can be of use here. To
set a version in all the many poms of the hbase multi-module project, do something like this:
<programlisting>$ mvn clean org.codehaus.mojo:versions-maven-plugin:1.3.1:set -DnewVersion=0.96.0</programlisting>
<programlisting language="bourne">$ mvn clean org.codehaus.mojo:versions-maven-plugin:1.3.1:set -DnewVersion=0.96.0</programlisting>
Check in the <filename>CHANGES.txt</filename> and any version changes.
</para>
<para>
@ -296,7 +296,7 @@ under the respective release documentation folders.
</para>
<para>Now, build the src tarball. This tarball is hadoop version independent. It is just the pure src code and documentation without a particular hadoop taint, etc.
Add the <varname>-Prelease</varname> profile when building; it checks files for licenses and will fail the build if unlicensed files are present.
<programlisting>$ MAVEN_OPTS="-Xmx2g" mvn clean install -DskipTests assembly:single -Dassembly.file=hbase-assembly/src/main/assembly/src.xml -Prelease</programlisting>
<programlisting language="bourne">$ MAVEN_OPTS="-Xmx2g" mvn clean install -DskipTests assembly:single -Dassembly.file=hbase-assembly/src/main/assembly/src.xml -Prelease</programlisting>
Extract the tarball and make sure it looks good. A good test for the src tarball being 'complete' is to see if
you can build new tarballs from this source bundle.
If the source tarball is good, save it off to a <emphasis>version directory</emphasis>, i.e. a directory somewhere where you are collecting
@ -309,7 +309,7 @@ under the respective release documentation folders.
Do it in two steps. First install into the local repository and then generate documentation and assemble the tarball
(Otherwise the build complains that hbase modules are not in the maven repo when we try to do it all in one go, especially on a fresh repo).
It seems that you need the install goal in both steps.
<programlisting>$ MAVEN_OPTS="-Xmx3g" mvn clean install -DskipTests -Prelease
<programlisting language="bourne">$ MAVEN_OPTS="-Xmx3g" mvn clean install -DskipTests -Prelease
$ MAVEN_OPTS="-Xmx3g" mvn install -DskipTests site assembly:single -Prelease</programlisting>
Extract the generated tarball and check it out. Look at the documentation and see if it runs, etc.
If good, copy the tarball to the above mentioned <emphasis>version directory</emphasis>.
@ -320,7 +320,7 @@ If good, copy the tarball to the above mentioned <emphasis>version directory</em
This time we use the <varname>apache-release</varname> profile instead of just the <varname>release</varname> profile when doing mvn deploy;
it will invoke the apache pom referenced by our poms. It will also sign your artifacts published to mvn as long as your settings.xml in your local <filename>.m2</filename>
repository is configured correctly (your <filename>settings.xml</filename> adds your gpg password property to the apache profile).
<programlisting>$ MAVEN_OPTS="-Xmx3g" mvn deploy -DskipTests -Papache-release</programlisting>
<programlisting language="bourne">$ MAVEN_OPTS="-Xmx3g" mvn deploy -DskipTests -Papache-release</programlisting>
The last command above copies all artifacts up to a temporary staging apache mvn repo in an 'open' state.
We'll need to do more work on these maven artifacts to make them generally available.
</para>
@ -379,7 +379,7 @@ or borked, just delete the 'open' staged artifacts.
<para>
If all checks out, next put the <emphasis>version directory</emphasis> up on people.apache.org. You will need to sign and fingerprint them before you
push them up. In the <emphasis>version directory</emphasis> do this:
<programlisting>$ for i in *.tar.gz; do echo $i; gpg --print-mds $i > $i.mds ; done
<programlisting language="bourne">$ for i in *.tar.gz; do echo $i; gpg --print-mds $i > $i.mds ; done
$ for i in *.tar.gz; do echo $i; gpg --armor --output $i.asc --detach-sig $i ; done
$ cd ..
# Presuming our 'version directory' is named 0.96.0RC0, now copy it up to people.apache.org.
@ -396,7 +396,7 @@ $ rsync -av 0.96.0RC0 people.apache.org:public_html
<para>Make sure your <filename>settings.xml</filename> is set up properly (see above for how).
Make sure the hbase version includes <varname>-SNAPSHOT</varname> as a suffix. Here is how I published SNAPSHOTS of
a release that had an hbase version of 0.96.0 in its poms.
<programlisting>$ MAVEN_OPTS="-Xmx3g" mvn clean install -DskipTests javadoc:aggregate site assembly:single -Prelease
<programlisting language="bourne">$ MAVEN_OPTS="-Xmx3g" mvn clean install -DskipTests javadoc:aggregate site assembly:single -Prelease
$ MAVEN_OPTS="-Xmx3g" mvn -DskipTests deploy -Papache-release</programlisting>
</para>
<para>The <filename>make_rc.sh</filename> script mentioned above in the
@ -436,7 +436,7 @@ $ rsync -av 0.96.0RC0 people.apache.org:public_html
(see <xref linkend="submitting.patches"/>). Your Jira should contain a summary of the changes in each
section (see <link xlink:href="https://issues.apache.org/jira/browse/HBASE-6081">HBASE-6081</link> for an example).</para>
<para>To generate the site locally while you're working on it, run:
<programlisting>mvn site</programlisting>
<programlisting language="bourne">mvn site</programlisting>
Then you can load up the generated HTML files in your browser (files are under <filename>/target/site</filename>).</para>
</section>
<section xml:id="hbase.org.site.publishing">
@ -446,14 +446,14 @@ $ rsync -av 0.96.0RC0 people.apache.org:public_html
Finally, check it in. For example, if trunk is checked out at <filename>/Users/stack/checkouts/trunk</filename>
and the hbase website, hbase.apache.org, is checked out at <filename>/Users/stack/checkouts/hbase.apache.org/trunk</filename>, to update
the site, do the following:
<programlisting>
<programlisting language="bourne">
# Build the site and deploy it to the checked out directory
# Getting the javadoc into site is a little tricky. You have to build it before you invoke 'site'.
$ MAVEN_OPTS=" -Xmx3g" mvn clean install -DskipTests javadoc:aggregate site site:stage -DstagingDirectory=/Users/stack/checkouts/hbase.apache.org/trunk
</programlisting>
Now check the deployed site by viewing it in a browser: browse to file:////Users/stack/checkouts/hbase.apache.org/trunk/index.html and check that all is good.
If all checks out, commit it and your new build will show up immediately at http://hbase.apache.org
<programlisting>
<programlisting language="bourne">
$ cd /Users/stack/checkouts/hbase.apache.org/trunk
$ svn status
# Do an svn add of any new content...
@ -500,16 +500,16 @@ HBase have a character not usually seen in other projects.</para>
<title>Running Tests in other Modules</title>
<para>If the module you are developing in has no other dependencies on other HBase modules, then
you can cd into that module and just run:</para>
<programlisting>mvn test</programlisting>
<programlisting language="bourne">mvn test</programlisting>
<para>which will just run the tests IN THAT MODULE. If there are other dependencies on other modules,
then you will have to run the command from the ROOT HBASE DIRECTORY. This will run the tests in the other
modules, unless you specify to skip the tests in that module. For instance, to skip the tests in the hbase-server module,
you would run:</para>
<programlisting>mvn clean test -PskipServerTests</programlisting>
<programlisting language="bourne">mvn clean test -PskipServerTests</programlisting>
<para>from the top level directory to run all the tests in modules other than hbase-server. Note that you
can specify to skip tests in multiple modules as well as just for a single module. For example, to skip
the tests in <classname>hbase-server</classname> and <classname>hbase-common</classname>, you would run:</para>
<programlisting>mvn clean test -PskipServerTests -PskipCommonTests</programlisting>
<programlisting language="bourne">mvn clean test -PskipServerTests -PskipCommonTests</programlisting>
<para>Also, keep in mind that if you are running tests in the <classname>hbase-server</classname> module you will need to
apply the maven profiles discussed in <xref linkend="hbase.unittests.cmds"/> to get the tests to run properly.</para>
</section>
@ -522,7 +522,7 @@ integration with corresponding JUnit <link xlink:href="http://www.junit.org/node
<classname>SmallTests</classname>, <classname>MediumTests</classname>,
<classname>LargeTests</classname>, <classname>IntegrationTests</classname>.
JUnit categories are denoted using java annotations and look like this in your unit test code.</para>
<programlisting>...
<programlisting language="java">...
@Category(SmallTests.class)
public class TestHRegionInfo {
@Test
@ -589,7 +589,7 @@ public class TestHRegionInfo {
<section
xml:id="hbase.unittests.cmds.test">
<title>Default: small and medium category tests </title>
<para>Running <programlisting>mvn test</programlisting> will execute all small tests
<para>Running <programlisting language="bourne">mvn test</programlisting> will execute all small tests
in a single JVM (no fork) and then medium tests in a separate JVM for each test
instance. Medium tests are NOT executed if there is an error in a small test.
Large tests are NOT executed. There is one report for small tests, and one
@ -599,7 +599,7 @@ public class TestHRegionInfo {
<section
xml:id="hbase.unittests.cmds.test.runAllTests">
<title>Running all tests</title>
<para>Running <programlisting>mvn test -P runAllTests</programlisting> will execute
<para>Running <programlisting language="bourne">mvn test -P runAllTests</programlisting> will execute
small tests in a single JVM then medium and large tests in a separate JVM for
each test. Medium and large tests are NOT executed if there is an error in a
small test. Large tests are NOT executed if there is an error in a small or
@ -611,11 +611,11 @@ public class TestHRegionInfo {
xml:id="hbase.unittests.cmds.test.localtests.mytest">
<title>Running a single test or all tests in a package</title>
<para>To run an individual test, e.g. <classname>MyTest</classname>, do
<programlisting>mvn test -Dtest=MyTest</programlisting> You can also pass
<programlisting language="bourne">mvn test -Dtest=MyTest</programlisting> You can also pass
multiple, individual tests as a comma-delimited list:
<programlisting>mvn test -Dtest=MyTest1,MyTest2,MyTest3</programlisting> You can
<programlisting language="bourne">mvn test -Dtest=MyTest1,MyTest2,MyTest3</programlisting> You can
also pass a package, which will run all tests under the package:
<programlisting>mvn test '-Dtest=org.apache.hadoop.hbase.client.*'</programlisting>
<programlisting language="bourne">mvn test '-Dtest=org.apache.hadoop.hbase.client.*'</programlisting>
</para>
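<para>Depending on the version of the Surefire plugin in use, you may also be able to run a
single test method (a sketch; confirm that your Surefire version supports the <code>#</code>
syntax):
<programlisting language="bourne">mvn test -Dtest=MyTest1#testSomething</programlisting></para>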
<para> When <code>-Dtest</code> is specified, the <code>localTests</code> profile will
@ -656,10 +656,10 @@ public class TestHRegionInfo {
can as well use a ramdisk. You will need 2Gb of memory to run all tests. You
will also need to delete the files between two test runs. The typical way to
configure a ramdisk on Linux is:</para>
<screen>$ sudo mkdir /ram2G
<screen language="bourne">$ sudo mkdir /ram2G
sudo mount -t tmpfs -o size=2048M tmpfs /ram2G</screen>
<para>You can then use it to run all HBase tests with the command: </para>
<screen>mvn test
<screen language="bourne">mvn test
-P runAllTests -Dsurefire.secondPartThreadCount=12
-Dtest.build.data.basedirectory=/ram2G</screen>
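<para>To clear the ramdisk between runs, something like the following is usually enough (shown
only as a sketch; double-check the path before deleting anything):</para>
<screen language="bourne">$ rm -rf /ram2G/*</screen>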
</section>
@ -848,7 +848,7 @@ ConnectionCount=1 (was 1) </screen>
tests that are in the HBase integration test group. After you have completed
<command>mvn install -DskipTests</command>, you can run just the integration
tests by invoking:</para>
<programlisting>
<programlisting language="bourne">
cd hbase-it
mvn verify</programlisting>
<para>If you just want to run the integration tests at the top level, you need to run
@ -890,9 +890,9 @@ mvn verify</programlisting>
<para> If you have an already-setup HBase cluster, you can launch the integration
tests by invoking the class <code>IntegrationTestsDriver</code>. You may have to
run test-compile first. The configuration will be picked up by the bin/hbase
script. <programlisting>mvn test-compile</programlisting> Then launch the tests
script. <programlisting language="bourne">mvn test-compile</programlisting> Then launch the tests
with:</para>
<programlisting>bin/hbase [--config config_dir] org.apache.hadoop.hbase.IntegrationTestsDriver</programlisting>
<programlisting language="bourne">bin/hbase [--config config_dir] org.apache.hadoop.hbase.IntegrationTestsDriver</programlisting>
<para>Pass <code>-h</code> to get usage on this sweet tool. Running the
IntegrationTestsDriver without any argument will launch tests found under
<code>hbase-it/src/test</code>, having
@ -968,7 +968,7 @@ mvn verify</programlisting>
ChaosMonkey uses the configuration from the bin/hbase script, thus no extra
configuration needs to be done. You can invoke the ChaosMonkey by
running:</para>
<programlisting>bin/hbase org.apache.hadoop.hbase.util.ChaosMonkey</programlisting>
<programlisting language="bourne">bin/hbase org.apache.hadoop.hbase.util.ChaosMonkey</programlisting>
<para> This will output something like: </para>
<screen>
12/11/19 23:21:57 INFO util.ChaosMonkey: Using ChaosMonkey Policy: class org.apache.hadoop.hbase.util.ChaosMonkey$PeriodicRandomActionPolicy, period:60000
@ -1021,7 +1021,7 @@ As you can see from the log, ChaosMonkey started the default PeriodicRandomActio
<classname>org.apache.hadoop.hbase.chaos.factories.MonkeyConstants</classname> class.
If any chaos monkey configuration is missing from the property file, then the default values are assumed.
For example:</para>
<programlisting>
<programlisting language="bourne">
$<userinput>bin/hbase org.apache.hadoop.hbase.IntegrationTestIngest -m slowDeterministic -monkeyProps monkey.properties</userinput>
</programlisting>
<para>The above command will start the integration tests and the chaos monkey, passing the properties file <filename>monkey.properties</filename>.
@ -1046,7 +1046,7 @@ batch.restart.rs.ratio=0.4f
</para>
<section xml:id="maven.build.commands.compile">
<title>Compile</title>
<programlisting>
<programlisting language="bourne">
mvn compile
</programlisting>
</section>
@ -1063,7 +1063,7 @@ mvn compile
By default, in 0.96 and earlier, we will build with Hadoop-1.0.x.
As of 0.98, Hadoop 1.x is deprecated and Hadoop 2.x is the default.
To change the version to build against, add a hadoop.profile property when you invoke <command>mvn</command>:</para>
<programlisting>mvn -Dhadoop.profile=1.0 ...</programlisting>
<programlisting language="bourne">mvn -Dhadoop.profile=1.0 ...</programlisting>
<para>
The above will build against whatever explicit hadoop 1.x version we have in our <filename>pom.xml</filename> as our '1.0' version.
Tests may not all pass so you may need to pass <code>-DskipTests</code> unless you are inclined to fix the failing tests.</para>
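<para>For example, a full build and install against the Hadoop 1 profile, skipping tests, might
look like the following (an illustrative combination of the options above):</para>
<programlisting language="bourne">mvn clean install -DskipTests -Dhadoop.profile=1.0</programlisting>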
@ -1083,7 +1083,7 @@ pecularity that is probably fixable but we've not spent the time trying to figur
<para> In earlier versions of Apache HBase, you can build against older versions of Apache
Hadoop, notably Hadoop 0.22.x and 0.23.x. If you are running, for example,
HBase 0.94 and want to build against Hadoop 0.23.x, you would run with:</para>
<programlisting>mvn -Dhadoop.profile=22 ...</programlisting>
<programlisting language="bourne">mvn -Dhadoop.profile=22 ...</programlisting>
</section>
</section>
@ -1154,7 +1154,7 @@ pecularity that is probably fixable but we've not spent the time trying to figur
<para>HBase uses <link
xlink:href="http://junit.org">JUnit</link> 4 for unit tests</para>
<para>This example will add unit tests to the following example class:</para>
<programlisting>
<programlisting language="java">
public class MyHBaseDAO {
public static void insertRecord(HTableInterface table, HBaseTestObj obj)
@ -1174,7 +1174,7 @@ public class MyHBaseDAO {
}
</programlisting>
<para>The first step is to add JUnit dependencies to your Maven POM file:</para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
@ -1184,7 +1184,7 @@ public class MyHBaseDAO {
]]></programlisting>
<para>Next, add some unit tests to your code. Tests are annotated with
<literal>@Test</literal>. Here, the unit tests are in bold.</para>
<programlisting>
<programlisting language="java">
public class TestMyHbaseDAOData {
@Test
public void testCreatePut() throws Exception {
@ -1222,7 +1222,7 @@ public class TestMyHbaseDAOData {
linkend="unit.tests" />, to test the <code>insertRecord</code>
method.</para>
<para>First, add a dependency for Mockito to your Maven POM file.</para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-all</artifactId>
@ -1232,7 +1232,7 @@ public class TestMyHbaseDAOData {
]]></programlisting>
<para>Next, add a <code>@RunWith</code> annotation to your test class, to direct it
to use Mockito.</para>
<programlisting>
<programlisting language="java">
<userinput>@RunWith(MockitoJUnitRunner.class)</userinput>
public class TestMyHBaseDAO{
@Mock
@ -1283,7 +1283,7 @@ public class TestMyHBaseDAO{
<literal>MyTest</literal>, which has one column family called
<literal>CF</literal>, the reducer of such a job could look like the
following:</para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
public class MyReducer extends TableReducer<Text, Text, ImmutableBytesWritable> {
public static final byte[] CF = "CF".getBytes();
public static final byte[] QUALIFIER = "CQ-1".getBytes();
@ -1304,7 +1304,7 @@ public class MyReducer extends TableReducer<Text, Text, ImmutableBytesWritable>
</programlisting>
<para>To test this code, the first step is to add a dependency to MRUnit to your
Maven POM file. </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<dependency>
<groupId>org.apache.mrunit</groupId>
<artifactId>mrunit</artifactId>
@ -1313,7 +1313,7 @@ public class MyReducer extends TableReducer<Text, Text, ImmutableBytesWritable>
</dependency>
]]></programlisting>
<para>Next, use the ReducerDriver provided by MRUnit, in your Reducer job.</para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
public class MyReducerTest {
ReduceDriver<Text, Text, ImmutableBytesWritable, Writable> reduceDriver;
byte[] CF = "CF".getBytes();
@ -1367,7 +1367,7 @@ strValue2 = "DATA2";
tests using a <firstterm>mini-cluster</firstterm>. The first step is to add some
dependencies to your Maven POM file. Check the versions to be sure they are
appropriate.</para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
@ -1401,7 +1401,7 @@ strValue2 = "DATA2";
]]></programlisting>
<para>This code represents an integration test for the MyDAO insert shown in <xref
linkend="unit.tests" />.</para>
<programlisting>
<programlisting language="java">
public class MyHBaseIntegrationTest {
private static HBaseTestingUtility utility;
byte[] CF = "CF".getBytes();
@ -1567,12 +1567,12 @@ public class MyHBaseIntegrationTest {
<para>If you are developing Apache HBase, frequently it is useful to test your changes against a more-real cluster than what you find in unit tests. In this case, HBase can be run directly from the source in local-mode.
All you need to do is run:
</para>
<programlisting>${HBASE_HOME}/bin/start-hbase.sh</programlisting>
<programlisting language="bourne">${HBASE_HOME}/bin/start-hbase.sh</programlisting>
<para>
This will spin up a full local-cluster, just as if you had packaged up HBase and installed it on your machine.
</para>
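<para>
A quick way to confirm the in-situ cluster came up is to connect with the shell and ask for its status (a sketch; the exact output will vary).
</para>
<screen language="bourne">$ ${HBASE_HOME}/bin/hbase shell
hbase&gt; status 'simple'</screen>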
<para>Keep in mind that you will need to have installed HBase into your local maven repository for the in-situ cluster to work properly. That is, you will need to run:</para>
<programlisting>mvn clean install -DskipTests</programlisting>
<programlisting language="bourne">mvn clean install -DskipTests</programlisting>
<para>to ensure that maven can find the correct classpath and dependencies. Generally, the above command
is just a good thing to try running first, if maven is acting oddly.</para>
</section> <!-- run.insitu -->
@ -1638,7 +1638,7 @@ public class MyHBaseIntegrationTest {
selected resource when generating the patch is a directory. Patch files can reflect changes in multiple files. </para>
<para>
Generating patches using git:</para>
<screen>$ git diff --no-prefix > HBASE_XXXX.patch</screen>
<screen language="bourne">$ git diff --no-prefix > HBASE_XXXX.patch</screen>
<para>
Don't forget the 'no-prefix' option, and generate the diff from the root directory of the project
</para>
@ -1686,20 +1686,20 @@ public class MyHBaseIntegrationTest {
<section xml:id="common.patch.feedback.space.invaders">
<title>Space Invaders</title>
<para>Rather than do this...
<programlisting>
<programlisting language="java">
if ( foo.equals( bar ) ) { // don't do this
</programlisting>
... do this instead...
<programlisting>
<programlisting language="java">
if (foo.equals(bar)) {
</programlisting>
</para>
<para>Also, rather than do this...
<programlisting>
<programlisting language="java">
foo = barArray[ i ]; // don't do this
</programlisting>
... do this instead...
<programlisting>
<programlisting language="java">
foo = barArray[i];
</programlisting>
</para>
@ -1707,12 +1707,12 @@ foo = barArray[i];
<section xml:id="common.patch.feedback.autogen">
<title>Auto Generated Code</title>
<para>Auto-generated code in Eclipse often looks like this...
<programlisting>
<programlisting language="java">
public void readFields(DataInput arg0) throws IOException { // don't do this
foo = arg0.readUTF(); // don't do this
</programlisting>
... do this instead ...
<programlisting>
<programlisting language="java">
public void readFields(DataInput di) throws IOException {
foo = di.readUTF();
</programlisting>
@ -1723,11 +1723,11 @@ foo = barArray[i];
<title>Long Lines</title>
<para>
Keep lines less than 100 characters.
<programlisting>
<programlisting language="java">
Bar bar = foo.veryLongMethodWithManyArguments(argument1, argument2, argument3, argument4, argument5, argument6, argument7, argument8, argument9); // don't do this
</programlisting>
... do something like this instead ...
<programlisting>
<programlisting language="java">
Bar bar = foo.veryLongMethodWithManyArguments(
argument1, argument2, argument3,argument4, argument5, argument6, argument7, argument8, argument9);
</programlisting>
@ -1737,8 +1737,8 @@ Bar bar = foo.veryLongMethodWithManyArguments(
<title>Trailing Spaces</title>
<para>
This happens more than people would imagine.
<programlisting>
Bar bar = foo.getBar(); &lt;--- imagine there's an extra space(s) after the semicolon instead of a line break.
<programlisting language="java">
Bar bar = foo.getBar(); &lt;--- imagine there is an extra space(s) after the semicolon instead of a line break.
</programlisting>
Make sure there's a line-break after the end of your code, and also avoid lines that have nothing
but whitespace.
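One way to catch trailing whitespace before you attach a patch (a suggestion, not an HBase requirement)
is to let git scan the diff for whitespace errors:
<programlisting language="bourne">$ git diff --check</programlisting>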
@ -1772,7 +1772,7 @@ Bar bar = foo.getBar(); &lt;--- imagine there's an extra space(s) after the
findbugs files locally. Sometimes, you may have to write code smarter than
Findbugs. You can annotate your code to tell Findbugs you know what you're
doing, by annotating your class with:
<programlisting>@edu.umd.cs.findbugs.annotations.SuppressWarnings(
<programlisting language="java">@edu.umd.cs.findbugs.annotations.SuppressWarnings(
value="HE_EQUALS_USE_HASHCODE",
justification="I know what I'm doing")</programlisting>
</para>
@ -1785,7 +1785,7 @@ Bar bar = foo.getBar(); &lt;--- imagine there's an extra space(s) after the
<title>Javadoc - Useless Defaults</title>
<para>Don't just leave the @param arguments the way your IDE generated them. Don't do
this...</para>
<programlisting>
<programlisting language="java">
/**
*
* @param bar &lt;---- don't do this!!!!
@ -1853,7 +1853,7 @@ Bar bar = foo.getBar(); &lt;--- imagine there's an extra space(s) after the
<para>patch 1:</para>
<itemizedlist>
<listitem>
<screen>$ git diff --no-prefix > HBASE_XXXX-1.patch</screen>
<screen language="bourne">$ git diff --no-prefix > HBASE_XXXX-1.patch</screen>
</listitem>
</itemizedlist>
</listitem>
@ -1862,12 +1862,12 @@ Bar bar = foo.getBar(); &lt;--- imagine there's an extra space(s) after the
<itemizedlist>
<listitem>
<para>create a new git branch</para>
<screen>$ git checkout -b my_branch</screen>
<screen language="bourne">$ git checkout -b my_branch</screen>
</listitem>
<listitem>
<para>save your work</para>
<screen>$ git add file1 file2 </screen>
<screen>$ git commit -am 'saved after HBASE_XXXX-1.patch'</screen>
<screen language="bourne">$ git add file1 file2 </screen>
<screen language="bourne">$ git commit -am 'saved after HBASE_XXXX-1.patch'</screen>
<para>Now you have your own branch, which is different from the remote
master branch.</para>
</listitem>
@ -1876,7 +1876,7 @@ Bar bar = foo.getBar(); &lt;--- imagine there's an extra space(s) after the
</listitem>
<listitem>
<para>create second patch</para>
<screen>$ git diff --no-prefix > HBASE_XXXX-2.patch</screen>
<screen language="bourne">$ git diff --no-prefix > HBASE_XXXX-2.patch</screen>
</listitem>
</itemizedlist>

View File

@ -111,7 +111,7 @@
</step>
<step>
<para>Extract the downloaded file, and change to the newly-created directory.</para>
<screen>
<screen language="bourne">
$ tar xzvf hbase-<![CDATA[<?eval ${project.version}?>]]>-hadoop2-bin.tar.gz
$ cd hbase-<![CDATA[<?eval ${project.version}?>]]>-hadoop2/
</screen>
@ -127,7 +127,7 @@ $ cd hbase-<![CDATA[<?eval ${project.version}?>]]>-hadoop2/
<markup>&lt;configuration&gt;</markup> tags, which should be empty in a new HBase install.</para>
<example>
<title>Example <filename>hbase-site.xml</filename> for Standalone HBase</title>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<configuration>
<property>
<name>hbase.rootdir</name>
@ -168,7 +168,7 @@ $ cd hbase-<![CDATA[<?eval ${project.version}?>]]>-hadoop2/
install. In this example, some usage and version information that is printed when you
start HBase Shell has been omitted. The HBase Shell prompt ends with a
<literal>&gt;</literal> character.</para>
<screen>
<screen language="bourne">
$ <userinput>./bin/hbase shell</userinput>
hbase(main):001:0&gt;
</screen>
@ -283,7 +283,7 @@ hbase&gt; drop 'test'
<para>In the same way that the <filename>bin/start-hbase.sh</filename> script is provided
to conveniently start all HBase daemons, the <filename>bin/stop-hbase.sh</filename>
script stops them.</para>
<screen>
<screen language="bourne">
$ ./bin/stop-hbase.sh
stopping hbase....................
$
@ -335,7 +335,7 @@ $
property <code>hbase.master.wait.on.regionservers.mintostart</code> should be set to
<code>1</code> (its default has been changed to <code>2</code> since version 1.0.0).
</para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
@ -348,7 +348,7 @@ $
<para>Next, change the <code>hbase.rootdir</code> from the local filesystem to the address
of your HDFS instance, using the <code>hdfs://</code> URI syntax. In this example,
HDFS is running on the localhost at port 8020.</para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:8020/hbase</value>
@ -371,7 +371,7 @@ $
configuration above, it is stored in <filename>/hbase/</filename> on HDFS. You can use
the <command>hadoop fs</command> command in Hadoop's <filename>bin/</filename> directory
to list this directory.</para>
<screen>
<screen language="bourne">
$ <userinput>./bin/hadoop fs -ls /hbase</userinput>
Found 7 items
drwxr-xr-x - hbase users 0 2014-06-25 18:58 /hbase/.tmp
@ -404,7 +404,7 @@ drwxr-xr-x - hbase users 0 2014-06-25 21:49 /hbase/oldWALs
using an offset of 2, the backup HMaster would use ports 16012, 16022, and 16032. The
following command starts 3 backup servers using ports 16012/16022/16032, 16013/16023/16033,
and 16015/16025/16035.</para>
<screen>
<screen language="bourne">
$ ./bin/local-master-backup.sh 2 3 5
</screen>
<para>To kill a backup master without killing the entire cluster, you need to find its
@ -413,7 +413,7 @@ $ ./bin/local-master-backup.sh 2 3 5
The only contents of the file are the PID. You can use the <command>kill -9</command>
command to kill that PID. The following command will kill the master with port offset 1,
but leave the cluster running:</para>
<screen>
<screen language="bourne">
$ cat /tmp/hbase-testuser-1-master.pid |xargs kill -9
</screen>
</step>
@ -432,13 +432,13 @@ $ cat /tmp/hbase-testuser-1-master.pid |xargs kill -9
You can run 99 additional RegionServers that are not an HMaster or backup HMaster,
on a server. The following command starts four additional RegionServers, running on
sequential ports starting at 16202/16302 (base ports 16200/16300 plus 2).</para>
<screen>
<screen language="bourne">
$ ./bin/local-regionservers.sh start 2 3 4 5
</screen>
<para>To stop a RegionServer manually, use the <command>local-regionservers.sh</command>
command with the <literal>stop</literal> parameter and the offset of the server to
stop.</para>
<screen>$ .bin/local-regionservers.sh stop 3</screen>
<screen language="bourne">$ .bin/local-regionservers.sh stop 3</screen>
</step>
<step>
<title>Stop HBase.</title>
@ -510,7 +510,7 @@ $ .bin/local-regionservers.sh start 2 3 4 5
<para>While logged in as the user who will run HBase, generate an SSH key pair, using the
following command:
</para>
<screen>$ ssh-keygen -t rsa</screen>
<screen language="bourne">$ ssh-keygen -t rsa</screen>
<para>If the command succeeds, the location of the key pair is printed to standard output.
The default name of the public key is <filename>id_rsa.pub</filename>.</para>
</step>
@ -528,7 +528,7 @@ $ .bin/local-regionservers.sh start 2 3 4 5
not already exist</emphasis>, and append the contents of the
<filename>id_rsa.pub</filename> file to the end of it. Note that you also need to do
this for <code>node-a</code> itself.</para>
<screen>$ cat id_rsa.pub &gt;&gt; ~/.ssh/authorized_keys</screen>
<screen language="bourne">$ cat id_rsa.pub &gt;&gt; ~/.ssh/authorized_keys</screen>
</step>
<step>
<title>Test password-less login.</title>
@ -574,7 +574,7 @@ $ .bin/local-regionservers.sh start 2 3 4 5
ZooKeeper instance on each node of the cluster.</para>
<para>On <code>node-a</code>, edit <filename>conf/hbase-site.xml</filename> and add the
following properties.</para>
<programlisting><![CDATA[
<programlisting language="bourne"><![CDATA[
<property>
<name>hbase.zookeeper.quorum</name>
<value>node-a.example.com,node-b.example.com,node-c.example.com</value>
@ -623,7 +623,7 @@ $ .bin/local-regionservers.sh start 2 3 4 5
<title>Start the cluster.</title>
<para>On <code>node-a</code>, issue the <command>start-hbase.sh</command> command. Your
output will be similar to that below.</para>
<screen>
<screen language="bourne">
$ <userinput>bin/start-hbase.sh</userinput>
node-c.example.com: starting zookeeper, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-zookeeper-node-c.example.com.out
node-a.example.com: starting zookeeper, logging to /home/hbuser/hbase-0.98.3-hadoop2/bin/../logs/hbase-hbuser-zookeeper-node-a.example.com.out
@ -643,7 +643,7 @@ node-b.example.com: starting master, logging to /home/hbuser/hbase-0.98.3-hadoop
running on your servers as well, if they are used for other purposes.</para>
<example>
<title><code>node-a</code> <command>jps</command> Output</title>
<screen>
<screen language="bourne">
$ <userinput>jps</userinput>
20355 Jps
20071 HQuorumPeer
@ -652,7 +652,7 @@ $ <userinput>jps</userinput>
</example>
<example>
<title><code>node-b</code> <command>jps</command> Output</title>
<screen>
<screen language="bourne">
$ <userinput>jps</userinput>
15930 HRegionServer
16194 Jps
@ -662,7 +662,7 @@ $ <userinput>jps</userinput>
</example>
<example>
<title><code>node-c</code> <command>jps</command> Output</title>
<screen>
<screen language="bourne">
$ <userinput>jps</userinput>
13901 Jps
13639 HQuorumPeer

View File

@ -40,7 +40,7 @@
<example>
<title>Create a Table Using Java</title>
<para>This example has been tested on HBase 0.96.1.1.</para>
<programlisting>
<programlisting language="java">
package com.example.hbase.admin;
import java.io.IOException;
@ -90,7 +90,7 @@ public class CreateSchema {
<example>
<title>Add, Modify, and Delete a Table</title>
<para>This example has been tested on HBase 0.96.1.1.</para>
<programlisting>
<programlisting language="java">
public static void upgradeFrom0 (Configuration config) {
try {

View File

@ -46,7 +46,7 @@
<para> There is a Canary class which can help users to canary-test the HBase cluster status, at the
granularity of every column family of every region, or of every RegionServer. To see the usage, use
the <literal>--help</literal> parameter. </para>
<screen>$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -help
<screen language="bourne">$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -help
Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table1 [table2]...] | [regionserver1 [regionserver2]..]
where [opts] are:
@ -61,7 +61,7 @@ Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table1 [table2]...]
-t &lt;N> timeout for a check, default is 600000 (milliseconds)</screen>
<para> This tool will return non-zero error codes to the user so that it can cooperate with other
monitoring tools, such as Nagios. The error code definitions are: </para>
<programlisting>private static final int USAGE_EXIT_CODE = 1;
<programlisting language="java">private static final int USAGE_EXIT_CODE = 1;
private static final int INIT_ERROR_EXIT_CODE = 2;
private static final int TIMEOUT_ERROR_EXIT_CODE = 3;
private static final int ERROR_EXIT_CODE = 4;</programlisting>
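<para>For example, a simple monitoring wrapper could just run the Canary and inspect the shell exit
status afterwards (an illustrative sketch, not shipped with HBase):</para>
<screen language="bourne">$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary
$ echo $?</screen>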
@ -113,7 +113,7 @@ private static final int ERROR_EXIT_CODE = 4;</programlisting>
<para> The following are some examples based on the case described previously. </para>
<section>
<title>Canary test for every column family (store) of every region of every table</title>
<screen>$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary
<screen language="bourne">$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary
3/12/09 03:26:32 INFO tool.Canary: read from region test-01,,1386230156732.0e3c7d77ffb6361ea1b996ac1042ca9a. column family cf1 in 2ms
13/12/09 03:26:32 INFO tool.Canary: read from region test-01,,1386230156732.0e3c7d77ffb6361ea1b996ac1042ca9a. column family cf2 in 2ms
@ -134,14 +134,14 @@ private static final int ERROR_EXIT_CODE = 4;</programlisting>
<title>Canary test for every column family (store) of every region of specific
table(s)</title>
<para> You can also test one or more specific tables.</para>
<screen>$ ${HBASE_HOME}/bin/hbase orghapache.hadoop.hbase.tool.Canary test-01 test-02</screen>
<screen language="bourne">$ ${HBASE_HOME}/bin/hbase orghapache.hadoop.hbase.tool.Canary test-01 test-02</screen>
</section>
<section>
<title>Canary test with regionserver granularity</title>
<para> This will pick one small piece of data from each RegionServer. You can also pass specific
RegionServer names as input options in order to canary-test specific RegionServers.</para>
<screen>$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -regionserver
<screen language="bourne">$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -regionserver
13/12/09 06:05:17 INFO tool.Canary: Read from table:test-01 on region server:rs2 in 72ms
13/12/09 06:05:17 INFO tool.Canary: Read from table:test-02 on region server:rs3 in 34ms
@ -150,7 +150,7 @@ private static final int ERROR_EXIT_CODE = 4;</programlisting>
<section>
<title>Canary test with regular expression pattern</title>
<para> This will test both table test-01 and test-02.</para>
<screen>$ ${HBASE_HOME}/bin/hbase orghapache.hadoop.hbase.tool.Canary -e test-0[1-2]</screen>
<screen language="bourne">$ ${HBASE_HOME}/bin/hbase orghapache.hadoop.hbase.tool.Canary -e test-0[1-2]</screen>
</section>
<section>
@ -158,10 +158,10 @@ private static final int ERROR_EXIT_CODE = 4;</programlisting>
<para> Run repeatedly with the interval defined by the -interval option, whose default value is 6
seconds. This daemon will stop itself and return a non-zero error code if any error occurs,
because the default value of the -f option is true.</para>
<screen>$ ${HBASE_HOME}/bin/hbase orghapache.hadoop.hbase.tool.Canary -daemon</screen>
<screen language="bourne">$ ${HBASE_HOME}/bin/hbase orghapache.hadoop.hbase.tool.Canary -daemon</screen>
<para>Run repeatedly with a 5 second interval and do not stop itself even if errors occur during
the test.</para>
<screen>$ ${HBASE_HOME}/bin/hbase orghapache.hadoop.hbase.tool.Canary -daemon -interval 50000 -f false</screen>
<screen language="bourne">$ ${HBASE_HOME}/bin/hbase orghapache.hadoop.hbase.tool.Canary -daemon -interval 50000 -f false</screen>
</section>
<section>
@ -171,7 +171,7 @@ private static final int ERROR_EXIT_CODE = 4;</programlisting>
Master, which would leave the clients hanging. So we provide a timeout option to kill the
canary test forcefully and return a non-zero error code as well. This run sets the timeout
value to 60 seconds; the default value is 600 seconds.</para>
<screen>$ ${HBASE_HOME}/bin/hbase orghapache.hadoop.hbase.tool.Canary -t 600000</screen>
<screen language="bourne">$ ${HBASE_HOME}/bin/hbase orghapache.hadoop.hbase.tool.Canary -t 600000</screen>
</section>
</section>
@ -194,7 +194,7 @@ private static final int ERROR_EXIT_CODE = 4;</programlisting>
<replaceable>UtilityName</replaceable> with the utility you want to run. This command
assumes you have set the environment variable <literal>HBASE_HOME</literal> to the directory
where HBase is unpacked on your server.</para>
<screen>
<screen language="bourne">
${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.mapreduce.<replaceable>UtilityName</replaceable>
</screen>
<para>The following utilities are available:</para>
@ -267,13 +267,13 @@ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.mapreduce.<replaceable>UtilityNa
<filename>recovered.edits</filename>. directory.</para>
<para>You can get a textual dump of a WAL file's content by doing the following:</para>
<screen> $ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog --dump hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/10.10.21.10%3A60020.1283973724012 </screen>
<screen language="bourne"> $ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog --dump hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/10.10.21.10%3A60020.1283973724012 </screen>
<para>The return code will be non-zero if there are issues with the file, so you can test the wholesomeness
of a file by redirecting <varname>STDOUT</varname> to <code>/dev/null</code> and testing the
program return code.</para>
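<para>For example (an illustrative check, reusing the WAL path from above):</para>
<screen language="bourne"> $ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog --dump hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/10.10.21.10%3A60020.1283973724012 > /dev/null
$ echo $?</screen>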
<para>Similarly you can force a split of a log file directory by doing:</para>
<screen> $ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog --split hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/</screen>
<screen language="bourne"> $ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog --split hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/</screen>
<section
xml:id="hlog_tool.prettyprint">
@ -297,7 +297,7 @@ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.mapreduce.<replaceable>UtilityNa
cluster or another cluster. The target table must first exist. The usage is as
follows:</para>
<screen>
<screen language="bourne">
$ <userinput>./bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --help </userinput>
/bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --help
Usage: CopyTable [general options] [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] &lt;tablename&gt;
@ -355,7 +355,7 @@ For performance consider the following general options:
<title>Export</title>
<para>Export is a utility that will dump the contents of a table to HDFS in a sequence file.
Invoke via:</para>
<screen>$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export &lt;tablename&gt; &lt;outputdir&gt; [&lt;versions&gt; [&lt;starttime&gt; [&lt;endtime&gt;]]]
<screen language="bourne">$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export &lt;tablename&gt; &lt;outputdir&gt; [&lt;versions&gt; [&lt;starttime&gt; [&lt;endtime&gt;]]]
</screen>
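<para>For example, to export a single version of a table named <literal>mytable</literal> to the HDFS
directory <filename>/export/mytable</filename> (the table and directory names here are illustrative
only):</para>
<screen language="bourne">$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export mytable /export/mytable 1</screen>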
<para>Note: caching for the input Scan is configured via
@ -366,11 +366,11 @@ For performance consider the following general options:
<title>Import</title>
<para>Import is a utility that will load data that has been exported back into HBase. Invoke
via:</para>
<screen>$ bin/hbase org.apache.hadoop.hbase.mapreduce.Import &lt;tablename&gt; &lt;inputdir&gt;
<screen language="bourne">$ bin/hbase org.apache.hadoop.hbase.mapreduce.Import &lt;tablename&gt; &lt;inputdir&gt;
</screen>
<para>To import 0.94 exported files in a 0.96 cluster or onwards, you need to set system
property "hbase.import.version" when running the import command as below:</para>
<screen>$ bin/hbase -Dhbase.import.version=0.94 org.apache.hadoop.hbase.mapreduce.Import &lt;tablename&gt; &lt;inputdir&gt;
<screen language="bourne">$ bin/hbase -Dhbase.import.version=0.94 org.apache.hadoop.hbase.mapreduce.Import &lt;tablename&gt; &lt;inputdir&gt;
</screen>
</section>
<section
@ -380,11 +380,11 @@ For performance consider the following general options:
usages: loading data from TSV format in HDFS into HBase via Puts, and preparing StoreFiles
to be loaded via the <code>completebulkload</code>. </para>
<para>To load data via Puts (i.e., non-bulk loading):</para>
<screen>$ bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=a,b,c &lt;tablename&gt; &lt;hdfs-inputdir&gt;
<screen language="bourne">$ bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=a,b,c &lt;tablename&gt; &lt;hdfs-inputdir&gt;
</screen>
<para>To generate StoreFiles for bulk-loading:</para>
<programlisting>$ bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=a,b,c -Dimporttsv.bulk.output=hdfs://storefile-outputdir &lt;tablename&gt; &lt;hdfs-data-inputdir&gt;
<programlisting language="bourne">$ bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=a,b,c -Dimporttsv.bulk.output=hdfs://storefile-outputdir &lt;tablename&gt; &lt;hdfs-data-inputdir&gt;
</programlisting>
<para>These generated StoreFiles can be loaded into HBase via <xref
linkend="completebulkload" />. </para>
@ -438,7 +438,7 @@ row10 c1 c2
</screen>
</para>
<para>For ImportTsv to use this input file, the command line needs to look like this:</para>
<screen>
<screen language="bourne">
HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-VERSION.jar importtsv -Dimporttsv.columns=HBASE_ROW_KEY,d:c1,d:c2 -Dimporttsv.bulk.output=hdfs://storefileoutput datatsv hdfs://inputfile
</screen>
<para> ... and in this example the first column is the rowkey, which is why the
@ -467,10 +467,10 @@ row10 c1 c2
linkend="importtsv" />. </para>
<para>There are two ways to invoke this utility, with explicit classname and via the
driver:</para>
<screen>$ bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles &lt;hdfs://storefileoutput&gt; &lt;tablename&gt;
<screen language="bourne">$ bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles &lt;hdfs://storefileoutput&gt; &lt;tablename&gt;
</screen>
<para> ... and via the driver:</para>
<screen>HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-VERSION.jar completebulkload &lt;hdfs://storefileoutput&gt; &lt;tablename&gt;
<screen language="bourne">HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-VERSION.jar completebulkload &lt;hdfs://storefileoutput&gt; &lt;tablename&gt;
</screen>
<section
xml:id="completebulkload.warning">
@ -493,10 +493,10 @@ row10 c1 c2
<para>WALPlayer can also generate HFiles for later bulk importing, in that case only a single
table and no mapping can be specified. </para>
<para>Invoke via:</para>
<screen>$ bin/hbase org.apache.hadoop.hbase.mapreduce.WALPlayer [options] &lt;wal inputdir&gt; &lt;tables&gt; [&lt;tableMappings>]&gt;
<screen language="bourne">$ bin/hbase org.apache.hadoop.hbase.mapreduce.WALPlayer [options] &lt;wal inputdir&gt; &lt;tables&gt; [&lt;tableMappings>]&gt;
</screen>
<para>For example:</para>
<screen>$ bin/hbase org.apache.hadoop.hbase.mapreduce.WALPlayer /backuplogdir oldTable1,oldTable2 newTable1,newTable2
<screen language="bourne">$ bin/hbase org.apache.hadoop.hbase.mapreduce.WALPlayer /backuplogdir oldTable1,oldTable2 newTable1,newTable2
</screen>
<para> WALPlayer, by default, runs as a mapreduce job. To NOT run WALPlayer as a mapreduce job
on your cluster, force it to run all in the local process by adding the flags
@ -511,7 +511,7 @@ row10 c1 c2
sanity check to ensure that HBase can read all the blocks of a table if there are any
concerns of metadata inconsistency. It will run the mapreduce all in a single process but it
will run faster if you have a MapReduce cluster in place for it to exploit.</para>
<screen>$ bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter &lt;tablename&gt; [&lt;column1&gt; &lt;column2&gt;...]
<screen language="bourne">$ bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter &lt;tablename&gt; [&lt;column1&gt; &lt;column2&gt;...]
</screen>
<para>Note: caching for the input Scan is configured via
<code>hbase.client.scanner.caching</code> in the job configuration. </para>
@ -542,7 +542,7 @@ row10 c1 c2
<para>The program allows you to limit the scope of the run. Provide a row regex or prefix to
limit the rows to analyze. Use <code>hbase.mapreduce.scan.column.family</code> to specify
scanning a single column family.</para>
<screen>$ bin/hbase org.apache.hadoop.hbase.mapreduce.CellCounter &lt;tablename&gt; &lt;outputDir&gt; [regex or prefix]</screen>
<screen language="bourne">$ bin/hbase org.apache.hadoop.hbase.mapreduce.CellCounter &lt;tablename&gt; &lt;outputDir&gt; [regex or prefix]</screen>
<para>Note: just like RowCounter, caching for the input Scan is configured via
<code>hbase.client.scanner.caching</code> in the job configuration. </para>
</section>
@ -585,7 +585,7 @@ row10 c1 c2
<title>Merge</title>
<para>Merge is a utility that can merge adjoining regions in the same table (see
org.apache.hadoop.hbase.util.Merge).</para>
<programlisting>$ bin/hbase org.apache.hadoop.hbase.util.Merge &lt;tablename&gt; &lt;region1&gt; &lt;region2&gt;
<programlisting language="bourne">$ bin/hbase org.apache.hadoop.hbase.util.Merge &lt;tablename&gt; &lt;region1&gt; &lt;region2&gt;
</programlisting>
<para>If you feel you have too many regions and want to consolidate them, Merge is the utility
you need. Merge must be run when the cluster is down. See the <link
@ -609,7 +609,7 @@ row10 c1 c2
<title>Node Decommission</title>
<para>You can stop an individual RegionServer by running the following script in the HBase
directory on the particular node:</para>
<screen>$ ./bin/hbase-daemon.sh stop regionserver</screen>
<screen language="bourne">$ ./bin/hbase-daemon.sh stop regionserver</screen>
<para> The RegionServer will first close all regions and then shut itself down. On shutdown,
the RegionServer's ephemeral node in ZooKeeper will expire. The master will notice the
RegionServer gone and will treat it as a 'crashed' server; it will reassign the regions the
@ -627,7 +627,7 @@ row10 c1 c2
the RegionServer's znode gone. In Apache HBase 0.90.2, we added a facility for having a node
gradually shed its load and then shut itself down. Apache HBase 0.90.2 added the
<filename>graceful_stop.sh</filename> script. Here is its usage:</para>
<screen>$ ./bin/graceful_stop.sh
<screen language="bourne">$ ./bin/graceful_stop.sh
Usage: graceful_stop.sh [--config &lt;conf-dir&gt;] [--restart] [--reload] [--thrift] [--rest] &lt;hostname&gt;
thrift If we should stop/start thrift before/after the hbase stop/start
rest If we should stop/start rest before/after the hbase stop/start
@ -729,7 +729,7 @@ false
<para> You can also ask this script to restart a RegionServer after the shutdown AND move its
old regions back into place. The latter you might do to retain data locality. A primitive
rolling restart might be effected by running something like the following:</para>
<screen>$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &amp;> /tmp/log.txt &amp;</screen>
<screen language="bourne">$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &amp;> /tmp/log.txt &amp;</screen>
<para> Tail the output of <filename>/tmp/log.txt</filename> to follow the script's progress.
The above does RegionServers only. The script will also disable the load balancer before
moving the regions. You'd need to do the master update separately. Do it before you run the
@ -741,18 +741,18 @@ false
</listitem>
<listitem>
<para>Run hbck to ensure the cluster is consistent
<programlisting>$ ./bin/hbase hbck</programlisting> Effect repairs if inconsistent.
<programlisting language="bourne">$ ./bin/hbase hbck</programlisting> Effect repairs if inconsistent.
</para>
</listitem>
<listitem>
<para>Restart the Master:
<programlisting>$ ./bin/hbase-daemon.sh stop master; ./bin/hbase-daemon.sh start master</programlisting>
<programlisting language="bourne">$ ./bin/hbase-daemon.sh stop master; ./bin/hbase-daemon.sh start master</programlisting>
</para>
</listitem>
<listitem>
<para>Run the <filename>graceful_stop.sh</filename> script per RegionServer. For
example:</para>
<programlisting>$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &amp;> /tmp/log.txt &amp;
<programlisting language="bourne">$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &amp;> /tmp/log.txt &amp;
</programlisting>
<para> If you are running thrift or rest servers on the RegionServer, pass --thrift or
--rest options (See usage for <filename>graceful_stop.sh</filename> script). </para>
@ -1678,7 +1678,7 @@ false
<para>To turn on snapshot support, just set the <varname>hbase.snapshot.enabled</varname>
property to true. (Snapshots are enabled by default in 0.95+ and off by default in
0.94.6+.)</para>
<programlisting>
<programlisting language="java">
&lt;property>
&lt;name>hbase.snapshot.enabled&lt;/name>
&lt;value>true&lt;/value>
@ -1690,7 +1690,7 @@ false
<title>Take a Snapshot</title>
<para>You can take a snapshot of a table regardless of whether it is enabled or disabled. The
snapshot operation doesn't involve any data copying.</para>
<screen>
<screen language="bourne">
$ ./bin/hbase shell
hbase> snapshot 'myTable', 'myTableSnapshot-122112'
</screen>
@ -1699,7 +1699,7 @@ hbase> snapshot 'myTable', 'myTableSnapshot-122112'
xml:id="ops.snapshots.list">
<title>Listing Snapshots</title>
<para>List all snapshots taken (by printing the names and related information).</para>
<screen>
<screen language="bourne">
$ ./bin/hbase shell
hbase> list_snapshots
</screen>
@ -1709,7 +1709,7 @@ hbase> list_snapshots
<title>Deleting Snapshots</title>
<para>You can remove a snapshot, and the files retained for that snapshot will be removed if
no longer needed.</para>
<screen>
<screen language="bourne">
$ ./bin/hbase shell
hbase> delete_snapshot 'myTableSnapshot-122112'
</screen>
@ -1720,7 +1720,7 @@ hbase> delete_snapshot 'myTableSnapshot-122112'
<para>From a snapshot you can create a new table (clone operation) with the same data that you
had when the snapshot was taken. The clone operation doesn't involve data copies, and a
change to the cloned table doesn't impact the snapshot or the original table.</para>
<screen>
<screen language="bourne">
$ ./bin/hbase shell
hbase> clone_snapshot 'myTableSnapshot-122112', 'myNewTestTable'
</screen>
@ -1731,7 +1731,7 @@ hbase> clone_snapshot 'myTableSnapshot-122112', 'myNewTestTable'
<para>The restore operation requires the table to be disabled, and the table will be restored
to the state at the time when the snapshot was taken, changing both data and schema if
required.</para>
<screen>
<screen language="bourne">
$ ./bin/hbase shell
hbase> disable 'myTable'
hbase> restore_snapshot 'myTableSnapshot-122112'
@ -1763,14 +1763,14 @@ hbase> restore_snapshot 'myTableSnapshot-122112'
hbase cluster does not have to be online.</para>
<para>To copy a snapshot called MySnapshot to an HBase cluster srv2 (hdfs://srv2:8082/hbase)
using 16 mappers:</para>
<programlisting>$ bin/hbase class org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot MySnapshot -copy-to hdfs://srv2:8082/hbase -mappers 16</programlisting>
<programlisting language="bourne">$ bin/hbase class org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot MySnapshot -copy-to hdfs://srv2:8082/hbase -mappers 16</programlisting>
<formalpara>
<title>Limiting Bandwidth Consumption</title>
<para>You can limit the bandwidth consumption when exporting a snapshot, by specifying the
<code>-bandwidth</code> parameter, which expects an integer representing megabytes per
second. The following example limits the above example to 200 MB/sec.</para>
</formalpara>
<programlisting>$ bin/hbase class org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot MySnapshot -copy-to hdfs://srv2:8082/hbase -mappers 16 -bandwidth 200</programlisting>
<programlisting language="bourne">$ bin/hbase class org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot MySnapshot -copy-to hdfs://srv2:8082/hbase -mappers 16 -bandwidth 200</programlisting>
</section>
</section>
<!-- snapshots -->
@ -2035,7 +2035,7 @@ hbase shell> clone_snapshot 'tableSnapshot', 'newTableName'
hbase shell> delete_snapshot 'tableSnapshot'
hbase shell> drop 'tableName']]></screen>
<para>or in code it would be as follows:</para>
<programlisting role="Java" language="Java">void rename(HBaseAdmin admin, String oldTableName, String newTableName) {
<programlisting language="Java">void rename(HBaseAdmin admin, String oldTableName, String newTableName) {
String snapshotName = randomName();
admin.disableTable(oldTableName);
admin.snapshot(snapshotName, oldTableName);

View File

@ -475,7 +475,7 @@ hbase> <userinput>create 'mytable',{NAME => 'colfam1', BLOOMFILTER => 'ROWCOL'}<
<title>Constants</title>
<para>When people get started with HBase they have a tendency to write code that looks like
this:</para>
<programlisting>
<programlisting language="java">
Get get = new Get(rowkey);
Result r = htable.get(get);
byte[] b = r.getValue(Bytes.toBytes("cf"), Bytes.toBytes("attr")); // returns current version of value
@ -483,7 +483,7 @@ byte[] b = r.getValue(Bytes.toBytes("cf"), Bytes.toBytes("attr")); // returns c
<para>But especially when inside loops (and MapReduce jobs), converting the columnFamily and
column-names to byte-arrays repeatedly is surprisingly expensive. It's better to use
constants for the byte-arrays, like this:</para>
<programlisting>
<programlisting language="java">
public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR = "attr".getBytes();
...
@ -517,14 +517,14 @@ byte[] b = r.getValue(CF, ATTR); // returns current version of value
<para>There are two different approaches to pre-creating splits. The first approach is to rely
on the default <code>HBaseAdmin</code> strategy (which is implemented in
<code>Bytes.split</code>)... </para>
<programlisting>
byte[] startKey = ...; // your lowest keuy
<programlisting language="java">
byte[] startKey = ...; // your lowest key
byte[] endKey = ...; // your highest key
int numberOfRegions = ...; // # of regions to create
admin.createTable(table, startKey, endKey, numberOfRegions);
</programlisting>
<para>And the other approach is to define the splits yourself... </para>
<programlisting>
<programlisting language="java">
byte[][] splits = ...; // create your own splits
admin.createTable(table, splits);
</programlisting>
@ -676,7 +676,7 @@ admin.createTable(table, splits);
<code>Scan.HINT_LOOKAHEAD</code> can be set the on Scan object. The following code
instructs the RegionServer to attempt two iterations of next before a seek is
scheduled:</para>
<programlisting>
<programlisting language="java">
Scan scan = new Scan();
scan.addColumn(...);
scan.setAttribute(Scan.HINT_LOOKAHEAD, Bytes.toBytes(2));
@ -701,7 +701,7 @@ table.getScanner(scan);
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/ResultScanner.html">ResultScanners</link>
you can cause problems on the RegionServers. Always have ResultScanner processing enclosed
in try/catch blocks...</para>
<programlisting>
<programlisting language="java">
Scan scan = new Scan();
// set attrs...
ResultScanner rs = htable.getScanner(scan);
@ -907,7 +907,7 @@ htable.close();
shortcircuit reads configuration page</link> for how to enable the latter, better version
of shortcircuit. For example, here is a minimal config. enabling short-circuit reads added
to <filename>hbase-site.xml</filename>: </para>
<programlisting><![CDATA[<property>
<programlisting language="xml"><![CDATA[<property>
<name>dfs.client.read.shortcircuit</name>
<value>true</value>
<description>

View File

@ -45,7 +45,7 @@
the <filename>src/main/docbkx</filename> directory of the HBase source. This reference
guide is marked up using <link xlink:href="http://www.docbook.com/">DocBook</link> from
which the the finished guide is generated as part of the 'site' build target. Run
<programlisting>mvn site</programlisting> to generate this documentation. Amendments and
<programlisting language="bourne">mvn site</programlisting> to generate this documentation. Amendments and
improvements to the documentation are welcomed. Click <link
xlink:href="https://issues.apache.org/jira/secure/CreateIssueDetails!init.jspa?pid=12310753&amp;issuetype=1&amp;components=12312132&amp;summary=SHORT+DESCRIPTION"
>this link</link> to file a new documentation bug against Apache HBase with some

View File

@ -44,7 +44,7 @@
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html">HBaseAdmin</link>
in the Java API. </para>
<para>Tables must be disabled when making ColumnFamily modifications, for example:</para>
<programlisting>
<programlisting language="java">
Configuration config = HBaseConfiguration.create();
HBaseAdmin admin = new HBaseAdmin(conf);
String table = "myTable";
@ -184,7 +184,7 @@ admin.enableTable(table);
in those eight bytes. If you stored this number as a String -- presuming a byte per
character -- you need nearly 3x the bytes. </para>
<para>Not convinced? Below is some sample code that you can run on your own.</para>
<programlisting>
<programlisting language="java">
// long
//
long l = 1234567890L;
@ -307,7 +307,7 @@ COLUMN CELL
are accessible in the keyspace. </para>
<para>To conclude this example, the following is an example of how appropriate splits can be
pre-created for hex-keys: </para>
<programlisting><![CDATA[public static boolean createTable(HBaseAdmin admin, HTableDescriptor table, byte[][] splits)
<programlisting language="java"><![CDATA[public static boolean createTable(HBaseAdmin admin, HTableDescriptor table, byte[][] splits)
throws IOException {
try {
admin.createTable( table, splits );
@ -580,7 +580,7 @@ public static byte[][] getHexSplits(String startKey, String endKey, int numRegio
timestamps, by performing a mod operation on the timestamp. If time-oriented scans are
important, this could be a useful approach. Attention must be paid to the number of
buckets, because this will require the same number of scans to return results.</para>
<programlisting>
<programlisting language="java">
long bucket = timestamp % numBuckets;
</programlisting>
<para>… to construct:</para>
@ -1041,13 +1041,13 @@ long bucket = timestamp % numBuckets;
]]></programlisting>
<para>The other option we had was to do this entirely using:</para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<FixedWidthUserName><FixedWidthPageNum0>:<FixedWidthLength><FixedIdNextPageNum><ValueId1><ValueId2><ValueId3>...
<FixedWidthUserName><FixedWidthPageNum1>:<FixedWidthLength><FixedIdNextPageNum><ValueId1><ValueId2><ValueId3>...
]]></programlisting>
<para> where each row would contain multiple values. So in one case reading the first thirty
values would be: </para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
scan { STARTROW => 'FixedWidthUsername' LIMIT => 30}
]]></programlisting>
<para>And in the second case it would be </para>

View File

@ -73,7 +73,7 @@
operation that must be added to the <code>hbase-site.xml</code> file on every server
machine in the cluster. Required for even the most basic interactions with a secure
Hadoop configuration, independent of HBase security. </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.regionserver.kerberos.principal</name>
<value>hbase/_HOST@YOUR-REALM.COM</value>
@ -117,7 +117,7 @@
underlying HDFS configuration is secure.</para>
<para> Add the following to the <code>hbase-site.xml</code> file on every server machine in
the cluster: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.security.authentication</name>
<value>kerberos</value>
@ -140,7 +140,7 @@
<para>First, refer to <xref linkend="security.prerequisites" /> and ensure that your
underlying HDFS configuration is secure.</para>
<para> Add the following to the <code>hbase-site.xml</code> file on every client: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.security.authentication</name>
<value>kerberos</value>
@ -154,7 +154,7 @@
<para> Once HBase is configured for secure RPC it is possible to optionally configure
encrypted communication. To do so, add the following to the <code>hbase-site.xml</code> file
on every client: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.rpc.protection</name>
<value>privacy</value>
@ -162,7 +162,7 @@
]]></programlisting>
<para> This configuration property can also be set on a per connection basis. Set it in the
<code>Configuration</code> supplied to <code>HTable</code>: </para>
<programlisting>
<programlisting language="java">
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.rpc.protection", "privacy");
HTable table = new HTable(conf, tablename);
@ -173,7 +173,7 @@ HTable table = new HTable(conf, tablename);
<section xml:id="security.client.thrift">
<title>Client-side Configuration for Secure Operation - Thrift Gateway</title>
<para> Add the following to the <code>hbase-site.xml</code> file for every Thrift gateway: <programlisting><![CDATA[
<para> Add the following to the <code>hbase-site.xml</code> file for every Thrift gateway: <programlisting language="xml"><![CDATA[
<property>
<name>hbase.thrift.keytab.file</name>
<value>/etc/hbase/conf/hbase.keytab</value>
@ -193,7 +193,7 @@ HTable table = new HTable(conf, tablename);
add the <code>hbase.thrift.kerberos.principal</code> to the <code>_acl_</code> table. For
example, to give the Thrift API principal, <code>thrift_server</code>, administrative
access, a command such as this one will suffice: </para>
<programlisting><![CDATA[
<programlisting language="sql"><![CDATA[
grant 'thrift_server', 'RWCA'
]]></programlisting>
<para>For more information about ACLs, please see the <link
@ -257,7 +257,7 @@ grant 'thrift_server', 'RWCA'
<section>
<title>Client-side Configuration for Secure Operation - REST Gateway</title>
<para> Add the following to the <code>hbase-site.xml</code> file for every REST gateway: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.rest.keytab.file</name>
<value>$KEYTAB</value>
@ -276,7 +276,7 @@ grant 'thrift_server', 'RWCA'
add the <code>hbase.rest.kerberos.principal</code> to the <code>_acl_</code> table. For
example, to give the REST API principal, <code>rest_server</code>, administrative access, a
command such as this one will suffice: </para>
<programlisting><![CDATA[
<programlisting language="sql"><![CDATA[
grant 'rest_server', 'RWCA'
]]></programlisting>
<para>For more information about ACLs, please see the <link
@ -298,7 +298,7 @@ grant 'rest_server', 'RWCA'
region servers) to allow proxy users; configure REST gateway to enable impersonation. </para>
<para> To allow proxy users, add the following to the <code>hbase-site.xml</code> file for
every HBase server: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
@ -316,7 +316,7 @@ grant 'rest_server', 'RWCA'
$GROUPS. </para>
<para> To enable REST gateway impersonation, add the following to the
<code>hbase-site.xml</code> file for every REST gateway. </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.rest.authentication.type</name>
<value>kerberos</value>
@ -367,7 +367,7 @@ grant 'rest_server', 'RWCA'
<title>Server-side Configuration for Simple User Access Operation</title>
<para> Add the following to the <code>hbase-site.xml</code> file on every server machine
in the cluster: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.security.authentication</name>
<value>simple</value>
@ -387,7 +387,7 @@ grant 'rest_server', 'RWCA'
]]></programlisting>
<para> For 0.94, add the following to the <code>hbase-site.xml</code> file on every server
machine in the cluster: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.rpc.engine</name>
<value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
@ -408,7 +408,7 @@ grant 'rest_server', 'RWCA'
<section>
<title>Client-side Configuration for Simple User Access Operation</title>
<para> Add the following to the <code>hbase-site.xml</code> file on every client: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.security.authentication</name>
<value>simple</value>
@ -416,7 +416,7 @@ grant 'rest_server', 'RWCA'
]]></programlisting>
<para> For 0.94, add the following to the <code>hbase-site.xml</code> file on every server
machine in the cluster: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.rpc.engine</name>
<value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
@ -432,7 +432,7 @@ grant 'rest_server', 'RWCA'
<para>The Thrift gateway user will need access. For example, to give the Thrift API user,
<code>thrift_server</code>, administrative access, a command such as this one will
suffice: </para>
<programlisting><![CDATA[
<programlisting language="sql"><![CDATA[
grant 'thrift_server', 'RWCA'
]]></programlisting>
<para>For more information about ACLs, please see the <link
@ -452,7 +452,7 @@ grant 'thrift_server', 'RWCA'
<para>The REST gateway user will need access. For example, to give the REST API user,
<code>rest_server</code>, administrative access, a command such as this one will
suffice: </para>
<programlisting><![CDATA[
<programlisting language="sql"><![CDATA[
grant 'rest_server', 'RWCA'
]]></programlisting>
<para>For more information about ACLs, please see the <link
@ -476,7 +476,7 @@ grant 'rest_server', 'RWCA'
format. Some of the use cases that use tags are visibility labels, cell-level ACLs, etc. </para>
<para> HFile version 3, available from 0.98 onwards, supports tags, and this feature can be turned
on using the following configuration: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hfile.format.version</name>
<value>3</value>
@ -488,7 +488,7 @@ grant 'rest_server', 'RWCA'
<para> Just as rowkeys, column families, qualifiers and values can be encoded using different
encoding algorithms, tags can also be encoded. Tag encoding can be turned on per column family,
and is ON by default. To turn on tag encoding for the HFiles of a column family, use </para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
HColumnDescriptor#setCompressTags(boolean compressTags)
]]></programlisting>
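<para> For illustration only, a minimal Java sketch might look like the following. The column
  family name <code>info</code> and the <code>FAST_DIFF</code> encoding are placeholder
  assumptions, not requirements: </para>
<programlisting language="java"><![CDATA[
// Classes are from org.apache.hadoop.hbase and org.apache.hadoop.hbase.io.encoding.
HColumnDescriptor cf = new HColumnDescriptor("info");
// Tag compression only takes effect when a DataBlockEncoder is enabled for the CF.
cf.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF);
cf.setCompressTags(true);
]]></programlisting>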
<para> Note that encoding of tags takes place only if the DataBlockEncoder is enabled for the
@ -496,14 +496,14 @@ HColumnDescriptor#setCompressTags(boolean compressTags)
<para> Just as WAL entries are compressed using a dictionary, the tags present in the WAL can also
be compressed using a dictionary. Each tag is compressed individually using the WAL dictionary. To
turn on tag compression in the WAL dictionary, enable the following property: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.regionserver.wal.tags.enablecompression</name>
<value>true</value>
</property>
]]></programlisting>
<para> To add tags to every cell during Puts, the following APIs are provided: </para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
Put#add(byte[] family, byte [] qualifier, byte [] value, Tag[] tag)
Put#add(byte[] family, byte[] qualifier, long ts, byte[] value, Tag[] tag)
]]></programlisting>
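<para> For illustration, a hedged sketch using the first signature listed above. The tag type
  (<code>(byte) 1</code>) and the tag value are hypothetical placeholders, and an open
  <code>HTable</code> instance named <code>table</code> is assumed: </para>
<programlisting language="java"><![CDATA[
byte[] family = Bytes.toBytes("f");
byte[] qualifier = Bytes.toBytes("q");
// Hypothetical application-defined tag type and value.
Tag[] tags = new Tag[] { new Tag((byte) 1, "my-tag-value") };
Put put = new Put(Bytes.toBytes("row1"));
put.add(family, qualifier, Bytes.toBytes("value"), tags);
table.put(put);
]]></programlisting>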
@ -1392,7 +1392,7 @@ Put#add(byte[] family, byte[] qualifier, long ts, byte[] value, Tag[] tag)
processes before setting up ACLs. </para>
<para> To enable the AccessController, modify the <code>hbase-site.xml</code> file on every
server machine in the cluster to look like: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.coprocessor.master.classes</name>
<value>org.apache.hadoop.hbase.security.access.AccessController</value>
@ -1413,21 +1413,21 @@ Put#add(byte[] family, byte[] qualifier, long ts, byte[] value, Tag[] tag)
on configuring it refer to <link
linkend="hbase.accesscontrol.configuration">Access Control</link> section. </para>
<para> The ACLs can be specified for every mutation using the APIs </para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
Mutation.setACL(String user, Permission perms)
Mutation.setACL(Map<String, Permission> perms)
]]></programlisting>
<para> For example, to give read permission to a user named user1: </para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
put.setACL("user1", new Permission(Permission.Action.READ))
]]></programlisting>
<para> Generally, the ACL applied at the table and column family level takes precedence over a
cell-level ACL. To make the cell-level ACL take precedence instead, use the following API: </para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
Mutation.setACLStrategy(boolean cellFirstStrategy)
]]></programlisting>
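<para> For illustration, a sketch combining the APIs described in this section, with placeholder
  row, column family, qualifier and value bytes: </para>
<programlisting language="java"><![CDATA[
Put put = new Put(Bytes.toBytes("row1"));
put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("value"));
put.setACL("user1", new Permission(Permission.Action.READ));
// Ask the AccessController to evaluate the cell-level ACL first, per the API above.
put.setACLStrategy(true);
]]></programlisting>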
<para> Please note that in order to use this feature, HFile version 3 must be turned on: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hfile.format.version</name>
<value>3</value>
@ -1445,7 +1445,7 @@ Mutation.setACLStrategy(boolean cellFirstStrategy)
<example>
<title>Grant</title>
<programlisting><![CDATA[
<programlisting language="sql"><![CDATA[
grant <user|@group> <permissions> [ <table> [ <column family> [ <column qualifier> ] ] ]
]]></programlisting>
</example>
@ -1463,7 +1463,7 @@ grant <user|@group> <permissions> [ <table> [ <column family> [ <column qualifie
<example>
<title>Revoke</title>
<programlisting><![CDATA[
<programlisting language="sql"><![CDATA[
revoke <user|@group> [ <table> [ <column family> [ <column qualifier> ] ] ]
]]></programlisting>
</example>
@ -1472,7 +1472,7 @@ revoke <user|@group> [ <table> [ <column family> [ <column qualifier> ] ] ]
<para> The <code>alter</code> command has been extended to allow ownership
assignment:</para>
<programlisting><![CDATA[
<programlisting language="sql"><![CDATA[
alter 'tablename', {OWNER => 'username|@group'}
]]></programlisting>
</example>
@ -1524,7 +1524,7 @@ user_permission <table>
<para> For secure bulk load to work properly, you must enable it. To do so, modify the
<code>hbase-site.xml</code> file on every server machine in the cluster and add the
SecureBulkLoadEndpoint class to the list of regionserver coprocessors: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.bulkload.staging.dir</name>
<value>/tmp/hbase-staging</value>
@ -1554,7 +1554,7 @@ user_permission <table>
the cell or even know of its existence. </para>
<para> Visibility expressions like the above can be added when storing or mutating a cell using
the API, </para>
<programlisting>Mutation#setCellVisibility(new CellVisibility(String labelExpession));</programlisting>
<programlisting language="java">Mutation#setCellVisibility(new CellVisibility(String labelExpession));</programlisting>
<para> Where the labelExpression could be &#39;( secret | topsecret ) &amp; !probationary&#39; </para>
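<para> For illustration, a minimal sketch attaching that expression to a Put. An open
  <code>HTable</code> instance named <code>table</code> and placeholder row and column bytes are
  assumed: </para>
<programlisting language="java"><![CDATA[
Put put = new Put(Bytes.toBytes("row1"));
put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("value"));
// Only users whose effective labels satisfy the expression can see this cell.
put.setCellVisibility(new CellVisibility("( secret | topsecret ) & !probationary"));
table.put(put);
]]></programlisting>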
<para> We build the user&#39;s label set in the RPC context when a request is first received by
the HBase RegionServer. How users are associated with labels is pluggable. The default plugin
@ -1609,7 +1609,7 @@ user_permission <table>
<para> HBase stores cell-level labels as cell tags. HFile version 3 adds support for cell
tags. Be sure to use HFile version 3 by setting this property in every server site
configuration file: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hfile.format.version</name>
<value>3</value>
@ -1618,7 +1618,7 @@ user_permission <table>
<para> You will also need to make sure the VisibilityController coprocessor is active on every
table you want to protect, by adding it to the list of system coprocessors in the server site
configuration files: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.coprocessor.master.classes</name>
<value>org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
@ -1680,7 +1680,7 @@ user_permission <table>
xml:id="hbase.encryption.server.configuration">
<title>Configuration</title>
<para> Create a secret key of appropriate length for AES. </para>
<screen><![CDATA[
<screen language="bourne"><![CDATA[
$ keytool -keystore /path/to/hbase/conf/hbase.jks \
-storetype jceks -storepass <password> \
-genseckey -keyalg AES -keysize 128 \
@ -1693,7 +1693,7 @@ $ keytool -keystore /path/to/hbase/conf/hbase.jks \
the HBase service account. </para>
<para> Configure HBase daemons to use a key provider backed by the KeyStore files for
retrieving the cluster master key as needed. </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.crypto.keyprovider</name>
<value>org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider</value>
@ -1705,7 +1705,7 @@ $ keytool -keystore /path/to/hbase/conf/hbase.jks \
]]></programlisting>
<para> By default the HBase service account name will be used to resolve the cluster master
key, but you can store it with any arbitrary alias and configure HBase appropriately: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.crypto.master.key.name</name>
<value>hbase</value>
@ -1715,14 +1715,14 @@ $ keytool -keystore /path/to/hbase/conf/hbase.jks \
should also have its permissions set to be readable only by the HBase service account. </para>
<para> Transparent encryption is a feature of HFile version 3. Be sure to use HFile version 3
by setting this property in every server site configuration file: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hfile.format.version</name>
<value>3</value>
</property>
]]></programlisting>
<para> Finally, configure the secure WAL in every server site configuration file: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.regionserver.hlog.reader.impl</name>
<value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value>
@ -1769,7 +1769,7 @@ $ keytool -keystore /path/to/hbase/conf/hbase.jks \
<para> Master key rotation can be achieved by updating the KeyStore to contain a new master
key, as described above, and also adding the old master key to the KeyStore under a
different alias. Then, configure fallback to the old master key in the HBase site file: </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.crypto.master.alternate.key.name</name>
<value>hbase.old</value>

View File

@ -348,13 +348,13 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>KeyOnlyFilter ()</programlisting>
<programlisting language="java">KeyOnlyFilter ()</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>KeyOnlyFilter ()"</programlisting>
<programlisting language="java">KeyOnlyFilter ()"</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -368,13 +368,13 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>FirstKeyOnlyFilter ()</programlisting>
<programlisting language="java">FirstKeyOnlyFilter ()</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>FirstKeyOnlyFilter ()</programlisting>
<programlisting language="java">FirstKeyOnlyFilter ()</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -388,13 +388,13 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>PrefixFilter (&lt;row_prefix>)</programlisting>
<programlisting language="java">PrefixFilter (&lt;row_prefix>)</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>PrefixFilter (Row)</programlisting>
<programlisting language="java">PrefixFilter (Row)</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -409,13 +409,13 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>ColumnPrefixFilter(&lt;column_prefix>)</programlisting>
<programlisting language="java">ColumnPrefixFilter(&lt;column_prefix>)</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>ColumnPrefixFilter(Col)</programlisting>
<programlisting language="java">ColumnPrefixFilter(Col)</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -430,13 +430,13 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>MultipleColumnPrefixFilter(&lt;column_prefix>, &lt;column_prefix>, …, &lt;column_prefix>)</programlisting>
<programlisting language="java">MultipleColumnPrefixFilter(&lt;column_prefix>, &lt;column_prefix>, …, &lt;column_prefix>)</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>MultipleColumnPrefixFilter(Col1, Col2)</programlisting>
<programlisting language="java">MultipleColumnPrefixFilter(Col1, Col2)</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -449,14 +449,14 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>ColumnCountGetFilter
<programlisting language="java">ColumnCountGetFilter
(&lt;limit>)</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>ColumnCountGetFilter (4)</programlisting>
<programlisting language="java">ColumnCountGetFilter (4)</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -469,13 +469,13 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>PageFilter (&lt;page_size&gt;)</programlisting>
<programlisting language="java">PageFilter (&lt;page_size&gt;)</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>PageFilter (2)</programlisting>
<programlisting language="java">PageFilter (2)</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -489,13 +489,13 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>ColumnPaginationFilter(&lt;limit>, &lt;offset>)</programlisting>
<programlisting language="java">ColumnPaginationFilter(&lt;limit>, &lt;offset>)</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>ColumnPaginationFilter (3, 5)</programlisting>
<programlisting language="java">ColumnPaginationFilter (3, 5)</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -509,13 +509,13 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>InclusiveStopFilter(&lt;stop_row_key>)</programlisting>
<programlisting language="java">InclusiveStopFilter(&lt;stop_row_key>)</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>InclusiveStopFilter ('Row2')</programlisting>
<programlisting language="java">InclusiveStopFilter ('Row2')</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -528,13 +528,13 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>TimeStampsFilter (&lt;timestamp>, &lt;timestamp>, ... ,&lt;timestamp>)</programlisting>
<programlisting language="java">TimeStampsFilter (&lt;timestamp>, &lt;timestamp>, ... ,&lt;timestamp>)</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>TimeStampsFilter (5985489, 48895495, 58489845945)</programlisting>
<programlisting language="java">TimeStampsFilter (5985489, 48895495, 58489845945)</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -549,13 +549,13 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>RowFilter (&lt;compareOp>, &lt;row_comparator>)</programlisting>
<programlisting language="java">RowFilter (&lt;compareOp>, &lt;row_comparator>)</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>RowFilter (&lt;=, xyz)</programlisting>
<programlisting language="java">RowFilter (&lt;=, xyz)</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -570,13 +570,13 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>QualifierFilter (&lt;compareOp&gt;, &lt;qualifier_comparator>)</programlisting>
<programlisting language="java">QualifierFilter (&lt;compareOp&gt;, &lt;qualifier_comparator>)</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>QualifierFilter (=, Column1)</programlisting>
<programlisting language="java">QualifierFilter (=, Column1)</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -591,13 +591,13 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>QualifierFilter (&lt;compareOp>,&lt;qualifier_comparator>)</programlisting>
<programlisting language="java">QualifierFilter (&lt;compareOp>,&lt;qualifier_comparator>)</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>QualifierFilter (=,Column1)</programlisting>
<programlisting language="java">QualifierFilter (=,Column1)</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -611,13 +611,13 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>ValueFilter (&lt;compareOp>,&lt;value_comparator>) </programlisting>
<programlisting language="java">ValueFilter (&lt;compareOp>,&lt;value_comparator>) </programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>ValueFilter (!=, Value)</programlisting>
<programlisting language="java">ValueFilter (!=, Value)</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -640,26 +640,26 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting><![CDATA[DependentColumnFilter (<family>,<qualifier>, <boolean>, <compare operator>, <value
<programlisting language="java"><![CDATA[DependentColumnFilter (<family>,<qualifier>, <boolean>, <compare operator>, <value
comparator>)]]></programlisting>
</listitem>
<listitem>
<programlisting><![CDATA[DependentColumnFilter (<family>,<qualifier>, <boolean>)]]></programlisting>
<programlisting language="java"><![CDATA[DependentColumnFilter (<family>,<qualifier>, <boolean>)]]></programlisting>
</listitem>
<listitem>
<programlisting>DependentColumnFilter (&lt;family>,&lt;qualifier>)</programlisting>
<programlisting language="java">DependentColumnFilter (&lt;family>,&lt;qualifier>)</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>DependentColumnFilter (conf, blacklist, false, >=, zebra)</programlisting>
<programlisting language="java">DependentColumnFilter (conf, blacklist, false, >=, zebra)</programlisting>
</listitem>
<listitem>
<programlisting>DependentColumnFilter (conf, 'blacklist', true)</programlisting>
<programlisting language="java">DependentColumnFilter (conf, 'blacklist', true)</programlisting>
</listitem>
<listitem>
<programlisting>DependentColumnFilter (conf, 'blacklist')</programlisting>
<programlisting language="java">DependentColumnFilter (conf, 'blacklist')</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -683,16 +683,16 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>SingleColumnValueFilter(&lt;family>,&lt;qualifier>, &lt;compare operator>, &lt;comparator>, &lt;filterIfColumnMissing_boolean>, &lt;latest_version_boolean>)</programlisting>
<programlisting language="java">SingleColumnValueFilter(&lt;family>,&lt;qualifier>, &lt;compare operator>, &lt;comparator>, &lt;filterIfColumnMissing_boolean>, &lt;latest_version_boolean>)</programlisting>
</listitem>
<listitem>
<programlisting>SingleColumnValueFilter(&lt;family>, &lt;qualifier>, &lt;compare operator>, &lt;comparator>)</programlisting>
<programlisting language="java">SingleColumnValueFilter(&lt;family>, &lt;qualifier>, &lt;compare operator>, &lt;comparator>)</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>SingleColumnValueFilter (FamilyA, Column1, &lt;=, abc, true, false)</programlisting>
<programlisting language="java">SingleColumnValueFilter (FamilyA, Column1, &lt;=, abc, true, false)</programlisting>
</listitem>
<listitem>
<programlisting>SingleColumnValueFilter (FamilyA, Column1, &lt;=, abc)</programlisting>
@ -710,19 +710,19 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>SingleColumnValueExcludeFilter('&lt;family>', '&lt;qualifier>', &lt;compare operator>, '&lt;comparator>', &lt;latest_version_boolean>, &lt;filterIfColumnMissing_boolean>)</programlisting>
<programlisting language="java">SingleColumnValueExcludeFilter('&lt;family>', '&lt;qualifier>', &lt;compare operator>, '&lt;comparator>', &lt;latest_version_boolean>, &lt;filterIfColumnMissing_boolean>)</programlisting>
</listitem>
<listitem>
<programlisting>SingleColumnValueExcludeFilter('&lt;family>', '&lt;qualifier>', &lt;compare operator>, '&lt;comparator>')</programlisting>
<programlisting language="java">SingleColumnValueExcludeFilter('&lt;family>', '&lt;qualifier>', &lt;compare operator>, '&lt;comparator>')</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>SingleColumnValueExcludeFilter (FamilyA, Column1, &lt;=, abc, false, true)</programlisting>
<programlisting language="java">SingleColumnValueExcludeFilter (FamilyA, Column1, &lt;=, abc, false, true)</programlisting>
</listitem>
<listitem>
<programlisting>SingleColumnValueExcludeFilter (FamilyA, Column1, &lt;=, abc)</programlisting>
<programlisting language="java">SingleColumnValueExcludeFilter (FamilyA, Column1, &lt;=, abc)</programlisting>
</listitem>
</itemizedlist>
</listitem>
@ -739,13 +739,13 @@ is evaluated as
<itemizedlist>
<title>Syntax</title>
<listitem>
<programlisting>ColumnRangeFilter (&lt;minColumn>, &lt;minColumnInclusive_bool>, &lt;maxColumn>, &lt;maxColumnInclusive_bool>)</programlisting>
<programlisting language="java">ColumnRangeFilter (&lt;minColumn>, &lt;minColumnInclusive_bool>, &lt;maxColumn>, &lt;maxColumnInclusive_bool>)</programlisting>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Example</title>
<listitem>
<programlisting>ColumnRangeFilter (abc, true, xyz, false)</programlisting>
<programlisting language="java">ColumnRangeFilter (abc, true, xyz, false)</programlisting>
</listitem>
</itemizedlist>
</listitem>

View File

@ -81,7 +81,7 @@ public void receiveSpan(Span span);
change your config to use zipkin receiver, distribute the new configuration and then (rolling)
restart. </para>
<para> Here is an example of the manual setup procedure. </para>
<screen><![CDATA[
<screen language="bourne"><![CDATA[
$ git clone https://github.com/cloudera/htrace
$ cd htrace/htrace-zipkin
$ mvn compile assembly:single
@ -92,7 +92,7 @@ $ cp target/htrace-zipkin-*-jar-with-dependencies.jar $HBASE_HOME/lib/
for a <varname>hbase.zipkin.collector-hostname</varname> and
<varname>hbase.zipkin.collector-port</varname> property with a value describing the Zipkin
collector server to which span information is sent. </para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<property>
<name>hbase.trace.spanreceiver.classes</name>
<value>org.htrace.impl.ZipkinSpanReceiver</value>
@ -118,7 +118,7 @@ $ cp target/htrace-zipkin-*-jar-with-dependencies.jar $HBASE_HOME/lib/
<title>Client Modifications</title>
<para> In order to turn on tracing in your client code, you must initialize the module that sends
spans to the receiver once per client process. </para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
private SpanReceiverHost spanReceiverHost;
...
@ -129,13 +129,13 @@ private SpanReceiverHost spanReceiverHost;
<para>Then you simply start a tracing span before requests you think are interesting, and close it
when the request is done. For example, if you wanted to trace all of your get operations, you
change this: </para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
HTable table = new HTable(conf, "t1");
Get get = new Get(Bytes.toBytes("r1"));
Result res = table.get(get);
]]></programlisting>
<para>into: </para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
TraceScope ts = Trace.startSpan("Gets", Sampler.ALWAYS);
try {
HTable table = new HTable(conf, "t1");
@ -146,7 +146,7 @@ try {
}
]]></programlisting>
<para>If you wanted to trace half of your 'get' operations, you would pass in: </para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
new ProbabilitySampler(0.5)
]]></programlisting>
<para>in lieu of <varname>Sampler.ALWAYS</varname> to <classname>Trace.startSpan()</classname>.

View File

@ -128,7 +128,7 @@
this or confirm this is happening GC logging can be turned on in the Java virtual machine. </para>
<para> To enable, in <filename>hbase-env.sh</filename>, uncomment one of the below lines:</para>
<programlisting>
<programlisting language="bourne">
# This enables basic gc logging to the .out file.
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
@ -194,13 +194,13 @@
collections take but if it's too small, objects are promoted to old gen too quickly). In the
below we constrain new gen size to 64m. </para>
<para> Add the below line in <filename>hbase-env.sh</filename>:
<programlisting>
<programlisting language="bourne">
export SERVER_GC_OPTS="$SERVER_GC_OPTS -XX:NewSize=64m -XX:MaxNewSize=64m"
</programlisting>
</para>
<para> Similarly, to enable GC logging for client processes, uncomment one of the below lines
in <filename>hbase-env.sh</filename>:</para>
<programlisting>
<programlisting language="bourne">
# This enables basic gc logging to the .out file.
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
@ -293,7 +293,7 @@ export SERVER_GC_OPTS="$SERVER_GC_OPTS -XX:NewSize=64m -XX:MaxNewSize=64m"
<title>zkcli</title>
<para><code>zkcli</code> is a very useful tool for investigating ZooKeeper-related issues.
To invoke:
<programlisting>
<programlisting language="bourne">
./hbase zkcli -server host:port &lt;cmd&gt; &lt;args&gt;
</programlisting>
The commands (and arguments) are:</para>
@ -377,7 +377,7 @@ Swap: 16008732k total, 14348k used, 15994384k free, 11106908k cached
<para>
<code>jps</code> is shipped with every JDK and gives the java process ids for the current
user (if root, then it gives the ids for all users). Example:</para>
<programlisting>
<programlisting language="bourne">
hadoop@sv4borg12:~$ jps
1322 TaskTracker
17789 HRegionServer
@ -421,7 +421,7 @@ hadoop@sv4borg12:~$ jps
</itemizedlist>
<para> You can then do stuff like checking out the full command line that started the
process:</para>
<programlisting>
<programlisting language="bourne">
hadoop@sv4borg12:~$ ps aux | grep HRegionServer
hadoop 17789 155 35.2 9067824 8604364 ? S&lt;l Mar04 9855:48 /usr/java/jdk1.6.0_14/bin/java -Xmx8000m -XX:+DoEscapeAnalysis -XX:+AggressiveOpts -XX:+UseConcMarkSweepGC -XX:NewSize=64m -XX:MaxNewSize=64m -XX:CMSInitiatingOccupancyFraction=88 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/export1/hadoop/logs/gc-hbase.log -Dcom.sun.management.jmxremote.port=10102 -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.password.file=/home/hadoop/hbase/conf/jmxremote.password -Dcom.sun.management.jmxremote -Dhbase.log.dir=/export1/hadoop/logs -Dhbase.log.file=hbase-hadoop-regionserver-sv4borg12.log -Dhbase.home.dir=/home/hadoop/hbase -Dhbase.id.str=hadoop -Dhbase.root.logger=INFO,DRFA -Djava.library.path=/home/hadoop/hbase/lib/native/Linux-amd64-64 -classpath /home/hadoop/hbase/bin/../conf:[many jars]:/home/hadoop/hadoop/conf org.apache.hadoop.hbase.regionserver.HRegionServer start
</programlisting>
@ -791,7 +791,7 @@ at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
<code>HADOOP_CLASSPATH</code> set to include the HBase dependencies. The "hbase classpath"
utility can be used to do this easily. For example (substitute VERSION with your HBase
version):</para>
<programlisting>
<programlisting language="bourne">
HADOOP_CLASSPATH=`hbase classpath` hadoop jar $HBASE_HOME/hbase-VERSION.jar rowcounter usertable
</programlisting>
<para>See <link
@ -817,11 +817,11 @@ at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
<title>HDFS Utilization of Tables and Regions</title>
<para>To determine how much space HBase is using on HDFS use the <code>hadoop</code> shell
commands from the NameNode. For example... </para>
<para><programlisting>hadoop fs -dus /hbase/</programlisting> ...returns the summarized disk
<para><programlisting language="bourne">hadoop fs -dus /hbase/</programlisting> ...returns the summarized disk
utilization for all HBase objects. </para>
<para><programlisting>hadoop fs -dus /hbase/myTable</programlisting> ...returns the summarized
<para><programlisting language="bourne">hadoop fs -dus /hbase/myTable</programlisting> ...returns the summarized
disk utilization for the HBase table 'myTable'. </para>
<para><programlisting>hadoop fs -du /hbase/myTable</programlisting> ...returns a list of the
<para><programlisting language="bourne">hadoop fs -du /hbase/myTable</programlisting> ...returns a list of the
regions under the HBase table 'myTable' and their disk utilization. </para>
<para>For more information on HDFS shell commands, see the <link
xlink:href="http://hadoop.apache.org/common/docs/current/file_system_shell.html">HDFS
@ -1061,7 +1061,7 @@ ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: ZooKeeper session expi
<para>If you wish to increase the session timeout, add the following to your
<filename>hbase-site.xml</filename> to increase the timeout from the default of 60
seconds to 120 seconds. </para>
<programlisting>
<programlisting language="xml">
<![CDATA[<property>
<name>zookeeper.session.timeout</name>
<value>1200000</value>
@ -1345,13 +1345,13 @@ security.provider.1=sun.security.pkcs11.SunPKCS11 ${java.home}/lib/security/nss.
detail at <link
xlink:href="http://www.centos.org/docs/5/html/5.1/Deployment_Guide/s3-proc-sys-vm.html" />. </para>
<para>To find the current value on your system, run the following command:</para>
<screen>[user@host]# <userinput>cat /proc/sys/vm/min_free_kbytes</userinput></screen>
<screen language="bourne">[user@host]# <userinput>cat /proc/sys/vm/min_free_kbytes</userinput></screen>
<para>Next, raise the value. Try doubling, then quadrupling the value. Note that setting the
value too low or too high could have detrimental effects on your system. Consult your
operating system vendor for specific recommendations.</para>
<para>Use the following command to modify the value of <code>min_free_kbytes</code>,
substituting <replaceable>&lt;value&gt;</replaceable> with your intended value:</para>
<screen>[user@host]# <userinput>echo &lt;value&gt; > /proc/sys/vm/min_free_kbytes</userinput></screen>
<screen language="bourne">[user@host]# <userinput>echo &lt;value&gt; > /proc/sys/vm/min_free_kbytes</userinput></screen>
</section>
</section>

View File

@ -158,7 +158,7 @@
<para>HDFS and ZooKeeper should be up and running during the upgrade process.</para>
</note>
<para>hbase-0.96.0 comes with an upgrade script. Run
<programlisting>$ bin/hbase upgrade</programlisting> to see its usage. The script
<programlisting language="bourne">$ bin/hbase upgrade</programlisting> to see its usage. The script
has two main modes: -check, and -execute. </para>
<section>
<title>check</title>
@ -205,7 +205,7 @@ There are some HFileV1, or corrupt files (files with incorrect major version)
<para>By default, the check step scans the hbase root directory (defined as
hbase.rootdir in the configuration). To scan a specific directory only, pass the
<emphasis>-dir</emphasis> option.</para>
<screen>$ bin/hbase upgrade -check -dir /myHBase/testTable</screen>
<screen language="bourne">$ bin/hbase upgrade -check -dir /myHBase/testTable</screen>
<para>The above command would detect HFileV1s in the /myHBase/testTable directory. </para>
<para> Once the check step reports all the HFileV1 files have been rewritten, it is
safe to proceed with the upgrade. </para>
@ -246,7 +246,7 @@ There are some HFileV1, or corrupt files (files with incorrect major version)
<para> To run the <emphasis>execute</emphasis> step, first make sure that you have
copied the hbase-0.96.0 binaries everywhere, to servers and to clients. Make
sure the 0.94.0 cluster is down. Then do as follows:</para>
<screen>$ bin/hbase upgrade -execute</screen>
<screen language="bourne">$ bin/hbase upgrade -execute</screen>
<para>Here is some sample output.</para>
<programlisting>
Starting Namespace upgrade
@ -265,7 +265,7 @@ Successfully completed Log splitting
</programlisting>
<para> If the output from the execute step looks good, stop the zookeeper instance
you started to do the upgrade:
<programlisting>$ ./hbase/bin/hbase-daemon.sh stop zookeeper</programlisting>
<programlisting language="bourne">$ ./hbase/bin/hbase-daemon.sh stop zookeeper</programlisting>
Now start up hbase-0.96.0. </para>
</section>
<section

View File

@ -92,7 +92,7 @@
has ZooKeeper persist data under <filename>/tmp</filename> which is often cleared on system
restart. In the example below we have ZooKeeper persist to
<filename>/user/local/zookeeper</filename>.</para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
<configuration>
...
<property>
@ -146,7 +146,7 @@
<para>To point HBase at an existing ZooKeeper cluster, one that is not managed by HBase, set
<varname>HBASE_MANAGES_ZK</varname> in <filename>conf/hbase-env.sh</filename> to
false</para>
<screen>
<screen language="bourne">
...
# Tell HBase whether it should manage its own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=false</screen>
@ -160,7 +160,7 @@
regular start/stop scripts. If you would like to run ZooKeeper yourself, independent of HBase
start/stop, you would do the following</para>
<screen>
<screen language="bourne">
${HBASE_HOME}/bin/hbase-daemons.sh {start,stop} zookeeper
</screen>
@ -225,7 +225,7 @@ ${HBASE_HOME}/bin/hbase-daemons.sh {start,stop} zookeeper
<para>On each host that will run an HBase client (e.g. <code>hbase shell</code>), add the
following file to the HBase home directory's <filename>conf</filename> directory:</para>
<programlisting>
<programlisting language="java">
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=false
@ -244,7 +244,7 @@ Client {
configuration file in the conf directory of the node's <filename>HBASE_HOME</filename>
directory that looks like the following:</para>
<programlisting>
<programlisting language="java">
Server {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
@ -276,7 +276,7 @@ Client {
<para>Modify your <filename>hbase-env.sh</filename> to include the following:</para>
<programlisting>
<programlisting language="bourne">
export HBASE_OPTS="-Djava.security.auth.login.config=$CLIENT_CONF"
export HBASE_MANAGES_ZK=true
export HBASE_ZOOKEEPER_OPTS="-Djava.security.auth.login.config=$HBASE_SERVER_CONF"
@ -290,7 +290,7 @@ export HBASE_REGIONSERVER_OPTS="-Djava.security.auth.login.config=$HBASE_SERVER_
<para>Modify your <filename>hbase-site.xml</filename> on each node that will run zookeeper,
master or regionserver to contain:</para>
<programlisting><![CDATA[
<programlisting language="java"><![CDATA[
<configuration>
<property>
<name>hbase.zookeeper.quorum</name>
@ -332,7 +332,7 @@ bin/hbase regionserver start
<section>
<title>External Zookeeper Configuration</title>
<para>Add a JAAS configuration file that looks like:</para>
<programlisting>
<programlisting language="java">
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
@ -348,7 +348,7 @@ Client {
<para>Modify your hbase-env.sh to include the following:</para>
<programlisting>
<programlisting language="bourne">
export HBASE_OPTS="-Djava.security.auth.login.config=$CLIENT_CONF"
export HBASE_MANAGES_ZK=false
export HBASE_MASTER_OPTS="-Djava.security.auth.login.config=$HBASE_SERVER_CONF"
@ -359,7 +359,7 @@ export HBASE_REGIONSERVER_OPTS="-Djava.security.auth.login.config=$HBASE_SERVER_
<para>Modify your <filename>hbase-site.xml</filename> on each node that will run a master or
regionserver to contain:</para>
<programlisting><![CDATA[
<programlisting language="xml"><![CDATA[
<configuration>
<property>
<name>hbase.zookeeper.quorum</name>
@ -377,13 +377,13 @@ export HBASE_REGIONSERVER_OPTS="-Djava.security.auth.login.config=$HBASE_SERVER_
Quorum hosts.</para>
<para> Add a <filename>zoo.cfg</filename> for each Zookeeper Quorum host containing:</para>
<programlisting>
<programlisting language="java">
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
kerberos.removeHostFromPrincipal=true
kerberos.removeRealmFromPrincipal=true
</programlisting>
<para>Also on each of these hosts, create a JAAS configuration file containing:</para>
<programlisting>
<programlisting language="java">
Server {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
@ -397,7 +397,7 @@ Server {
pathname of this file as <filename>$ZK_SERVER_CONF</filename> below. </para>
<para> Start your Zookeepers on each Zookeeper Quorum host with:</para>
<programlisting>
<programlisting language="bourne">
SERVER_JVMFLAGS="-Djava.security.auth.login.config=$ZK_SERVER_CONF" bin/zkServer start
</programlisting>
@ -482,7 +482,7 @@ bin/hbase regionserver &amp;
<para> You must override the standard hadoop-core jar file from the
<code>target/cached_classpath.txt</code> file with the version containing the
HADOOP-7070 fix. You can use the following script to do this:</para>
<screen>
<screen language="bourne">
echo `find ~/.m2 -name "*hadoop-core*7070*SNAPSHOT.jar"` ':' `cat target/cached_classpath.txt` | sed 's/ //g' > target/tmp.txt
mv target/tmp.txt target/cached_classpath.txt
</screen>