HBASE-11190 Fix easy typos in documentation
git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1595318 13f79535-47bb-0310-9956-ffa450edef68
parent ccf33cdb42
commit 3f0f2a2bfe
@@ -68,8 +68,8 @@
 <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="preface.xml" />
 <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="getting_started.xml" />
 <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="configuration.xml" />
-<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="upgrading.xml" />
-<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="shell.xml" />
+<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="upgrading.xml"/>
+<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="shell.xml"/>
 
 <chapter xml:id="datamodel">
 <title>Data Model</title>

@@ -658,7 +658,7 @@ htable.put(put);
 </chapter> <!-- data model -->
 
 <!-- schema design -->
-<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="schema_design.xml" />
+<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="schema_design.xml"/>
 
 <chapter xml:id="mapreduce">
 <title>HBase and MapReduce</title>

@@ -1319,11 +1319,16 @@ scan.setFilter(list);
 </section>
 <section xml:id="client.filter.cv"><title>Column Value</title>
 <section xml:id="client.filter.cv.scvf"><title>SingleColumnValueFilter</title>
-<para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.html">SingleColumnValueFilter</link>
-can be used to test column values for equivalence (<code><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/CompareFilter.CompareOp.html">CompareOp.EQUAL</link>
-</code>), inequality (<code>CompareOp.NOT_EQUAL</code>), or ranges
-(e.g., <code>CompareOp.GREATER</code>). The folowing is example of testing equivalence a column to a String value "my value"...
-<programlisting>
+<para><link
+xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.html"
+>SingleColumnValueFilter</link> can be used to test column values for equivalence
+(<code><link
+xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/CompareFilter.CompareOp.html"
+>CompareOp.EQUAL</link>
+</code>), inequality (<code>CompareOp.NOT_EQUAL</code>), or ranges (e.g.,
+<code>CompareOp.GREATER</code>). The following is example of testing equivalence a
+column to a String value "my value"...
+<programlisting>
 SingleColumnValueFilter filter = new SingleColumnValueFilter(
 cf,
 column,

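For reference, a minimal sketch of the filter this hunk documents, completed against the HBase client API of this era; the family/qualifier names and the open htable are illustrative assumptions, not from the patch:

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Match rows whose cf:qualifier cell equals the string "my value".
SingleColumnValueFilter filter = new SingleColumnValueFilter(
    Bytes.toBytes("cf"),        // column family (illustrative)
    Bytes.toBytes("qualifier"), // column qualifier (illustrative)
    CompareOp.EQUAL,
    Bytes.toBytes("my value"));
Scan scan = new Scan();
scan.setFilter(filter);
ResultScanner scanner = htable.getScanner(scan); // htable assumed open
for (Result r : scanner) {
  // process matching rows
}
scanner.close();
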
@@ -1604,9 +1609,9 @@ rs.close();
 <listitem>Single access priority: The first time a block is loaded from HDFS it normally has this priority and it will be part of the first group to be considered
 during evictions. The advantage is that scanned blocks are more likely to get evicted than blocks that are getting more usage.
 </listitem>
-<listitem>Mutli access priority: If a block in the previous priority group is accessed again, it upgrades to this priority. It is thus part of the second group
-considered during evictions.
-</listitem>
+<listitem>Multi access priority: If a block in the previous priority group is accessed
+again, it upgrades to this priority. It is thus part of the second group considered
+during evictions. </listitem>
 <listitem>In-memory access priority: If the block's family was configured to be "in-memory", it will be part of this priority disregarding the number of times it
 was accessed. Catalog tables are configured like this. This group is the last one considered during evictions.
 </listitem>

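The in-memory priority in the last list item is a per-family flag; a hedged sketch of setting it with the admin API of this era (the table name, family name, and admin instance are assumptions):

import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;

HTableDescriptor desc = new HTableDescriptor("myTable"); // illustrative table
HColumnDescriptor family = new HColumnDescriptor("cf");  // illustrative family
family.setInMemory(true); // blocks from this family get in-memory access priority
desc.addFamily(family);
admin.createTable(desc);  // assumes an HBaseAdmin instance named admin
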
@@ -2418,13 +2423,13 @@ All the settings that apply to normal compactions (file size limits, etc.) apply
 
 </chapter> <!-- architecture -->
 
-<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="external_apis.xml" />
+<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="external_apis.xml"/>
 <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="cp.xml" />
-<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="performance.xml" />
-<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="troubleshooting.xml" />
+<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="performance.xml"/>
+<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="troubleshooting.xml"/>
 <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="case_studies.xml" />
-<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ops_mgt.xml" />
-<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="developer.xml" />
+<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ops_mgt.xml"/>
+<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="developer.xml"/>
 <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="zookeeper.xml" />
 <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="cp.xml" />
 <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="community.xml" />

@@ -2533,9 +2538,8 @@ All the settings that apply to normal compactions (file size limits, etc.) apply
 <qandaentry xml:id="faq.changing.rowkeys">
 <question><para>Can I change a table's rowkeys?</para></question>
 <answer>
-<para>
-This is a very common quesiton. You can't. See <xref linkend="changing.rowkeys" />.
-</para>
+<para> This is a very common question. You can't. See <xref
+linkend="changing.rowkeys"/>. </para>
 </answer>
 </qandaentry>
 <qandaentry xml:id="faq.apis">

@@ -29,7 +29,7 @@
 <title>Apache HBase Configuration</title>
 <para>This chapter is the Not-So-Quick start guide to Apache HBase configuration. It goes
 over system requirements, Hadoop setup, the different Apache HBase run modes, and the
-various configurations in HBase. Please read this chapter carefully. At a mimimum
+various configurations in HBase. Please read this chapter carefully. At a minimum
 ensure that all <xref linkend="basic.prerequisites" /> have
 been satisfied. Failure to do so will cause you (and us) grief debugging strange errors
 and/or data loss.</para>

@@ -778,7 +778,7 @@ stopping hbase...............</programlisting> Shutdown can take a moment to
 <title><filename>hbase-env.sh</filename></title>
 <para>Set HBase environment variables in this file.
 Examples include options to pass the JVM on start of
-an HBase daemon such as heap size and garbarge collector configs.
+an HBase daemon such as heap size and garbage collector configs.
 You can also set configurations for HBase configuration, log directories,
 niceness, ssh options, where to locate process pid files,
 etc. Open the file at

@@ -492,12 +492,13 @@ HBase have a character not usually seen in other projects.</para>
 <section xml:id="hbase.moduletests">
 <title>Apache HBase Modules</title>
 <para>As of 0.96, Apache HBase is split into multiple modules which creates "interesting" rules for
-how and where tests are written. If you are writting code for <classname>hbase-server</classname>, see
-<xref linkend="hbase.unittests"/> for how to write your tests; these tests can spin
-up a minicluster and will need to be categorized. For any other module, for example
-<classname>hbase-common</classname>, the tests must be strict unit tests and just test the class
-under test - no use of the HBaseTestingUtility or minicluster is allowed (or even possible
-given the dependency tree).</para>
+how and where tests are written. If you are writing code for
+<classname>hbase-server</classname>, see <xref linkend="hbase.unittests"/> for
+how to write your tests; these tests can spin up a minicluster and will need to be
+categorized. For any other module, for example <classname>hbase-common</classname>,
+the tests must be strict unit tests and just test the class under test - no use of
+the HBaseTestingUtility or minicluster is allowed (or even possible given the
+dependency tree).</para>
 <section xml:id="hbase.moduletest.run">
 <title>Running Tests in other Modules</title>
 If the module you are developing in has no other dependencies on other HBase modules, then

@@ -643,22 +644,22 @@ error will be reported when a non-existent test case is specified.
 
 <section xml:id="hbase.unittests.test.faster">
 <title>Running tests faster</title>
-<para>
-By default, <code>$ mvn test -P runAllTests</code> runs 5 tests in parallel.
-It can be increased on a developer's machine. Allowing that you can have 2
-tests in parallel per core, and you need about 2Gb of memory per test (at the
-extreme), if you have an 8 core, 24Gb box, you can have 16 tests in parallel.
-but the memory available limits it to 12 (24/2), To run all tests with 12 tests
-in parallell, do this:
-<command>mvn test -P runAllTests -Dsurefire.secondPartThreadCount=12</command>.
-To increase the speed, you can as well use a ramdisk. You will need 2Gb of memory
-to run all tests. You will also need to delete the files between two test run.
-The typical way to configure a ramdisk on Linux is:
-<programlisting>$ sudo mkdir /ram2G
+<para> By default, <code>$ mvn test -P runAllTests</code> runs 5 tests in parallel. It can be
+increased on a developer's machine. Allowing that you can have 2 tests in
+parallel per core, and you need about 2Gb of memory per test (at the extreme),
+if you have an 8 core, 24Gb box, you can have 16 tests in parallel. but the
+memory available limits it to 12 (24/2), To run all tests with 12 tests in
+parallel, do this: <command>mvn test -P runAllTests
+-Dsurefire.secondPartThreadCount=12</command>. To increase the speed, you
+can as well use a ramdisk. You will need 2Gb of memory to run all tests. You
+will also need to delete the files between two test run. The typical way to
+configure a ramdisk on Linux is:
+<programlisting>$ sudo mkdir /ram2G
 sudo mount -t tmpfs -o size=2048M tmpfs /ram2G</programlisting>
-You can then use it to run all HBase tests with the command:
-<command>mvn test -P runAllTests -Dsurefire.secondPartThreadCount=12 -Dtest.build.data.basedirectory=/ram2G</command>
-</para>
+You can then use it to run all HBase tests with the command: <command>mvn test
+-P runAllTests -Dsurefire.secondPartThreadCount=12
+-Dtest.build.data.basedirectory=/ram2G</command>
+</para>
 </section>
 
 <section xml:id="hbase.unittests.cmds.test.hbasetests">

@@ -818,7 +819,7 @@ This actually runs ALL the integration tests.
 <programlisting>mvn failsafe:verify</programlisting>
 The above command basically looks at all the test results (so don't remove the 'target' directory) for test failures and reports the results.</para>
 
-<section xml:id="maven.build.commanas.integration.tests2">
+<section xml:id="maven.build.commands.integration.tests2">
 <title>Running a subset of Integration tests</title>
 <para>This is very similar to how you specify running a subset of unit tests (see above), but use the property
 <code>it.test</code> instead of <code>test</code>.

@@ -970,9 +971,9 @@ pecularity that is probably fixable but we've not spent the time trying to figur
 Similarly, for 3.0, you would just replace the profile value. Note that Hadoop-3.0.0-SNAPSHOT does not currently have a
 deployed maven artificat - you will need to build and install your own in your local maven repository if you want to run against this profile.
 </para>
-<para>
-In earilier verions of Apache HBase, you can build against older versions of Apache Hadoop, notably, Hadoop 0.22.x and 0.23.x.
-If you are running, for example HBase-0.94 and wanted to build against Hadoop 0.23.x, you would run with:</para>
+<para> In earilier versions of Apache HBase, you can build against older versions of Apache
+Hadoop, notably, Hadoop 0.22.x and 0.23.x. If you are running, for example
+HBase-0.94 and wanted to build against Hadoop 0.23.x, you would run with:</para>
 <programlisting>mvn -Dhadoop.profile=22 ...</programlisting>
 </section>
 </section>

@@ -1420,8 +1421,7 @@ Bar bar = foo.getBar(); <--- imagine there's an extra space(s) after the
 <para>
 Committers do this. See <link xlink:href="http://wiki.apache.org/hadoop/Hbase/HowToCommit">How To Commit</link> in the Apache HBase wiki.
 </para>
-<para>Commiters will also resolve the Jira, typically after the patch passes a build.
-</para>
+<para>Committers will also resolve the Jira, typically after the patch passes a build. </para>
 <section xml:id="committer.tests">
 <title>Committers are responsible for making sure commits do not break the build or tests</title>
 <para>

@@ -295,7 +295,8 @@
 <para><emphasis role="bold">Description:</emphasis> This filter takes two
 arguments – a limit and offset. It returns limit number of columns after offset number
 of columns. It does this for all the rows</para>
-<para><emphasis role="bold">Syntax:</emphasis> ColumnPaginationFilter(‘<limit>’, ‘<offest>’) </para>
+<para><emphasis role="bold">Syntax:</emphasis> ColumnPaginationFilter(‘<limit>’,
+‘<offset>’) </para>
 <para><emphasis role="bold">Example:</emphasis> "ColumnPaginationFilter (3, 5)" </para>
 </listitem>
 

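The shell syntax above maps directly onto the Java client; a small sketch of the "ColumnPaginationFilter (3, 5)" example (the Scan setup is assumed context, not part of the patch):

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;

// Return at most 3 columns per row, skipping the first 5 columns of each row.
Scan scan = new Scan();
scan.setFilter(new ColumnPaginationFilter(3, 5));
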
@@ -36,10 +36,11 @@
 
 <para>Here we list HBase tools for administration, analysis, fixup, and debugging.</para>
 <section xml:id="canary"><title>Canary</title>
-<para>There is a Canary class can help users to canary-test the HBase cluster status, with every column-family for every regions or regionservers granularity. To see the usage,
-<programlisting>$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -help</programlisting>
-Will output
-<programlisting>Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table1 [table2]...] | [regionserver1 [regionserver2]..]
+<para>There is a Canary class can help users to canary-test the HBase cluster status, with every
+column-family for every regions or regionservers granularity. To see the usage,
+<programlisting>$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -help</programlisting>
+Will output
+<programlisting>Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table1 [table2]...] | [regionserver1 [regionserver2]..]
 where [opts] are:
 -help Show this help and exit.
 -regionserver replace the table argument to regionserver,

@@ -49,25 +50,46 @@ Will output
 -e Use region/regionserver as regular expression
 which means the region/regionserver is regular expression pattern
 -f <B> stop whole program if first error occurs, default is true
--t <N> timeout for a check, default is 600000 (milisecs)</programlisting>
-This tool will return non zero error codes to user for collaborating with other monitoring tools, such as Nagios.
-The error code definitions are...
-<programlisting>private static final int USAGE_EXIT_CODE = 1;
+-t <N> timeout for a check, default is 600000 (milliseconds)</programlisting>
+This tool will return non zero error codes to user for collaborating with other monitoring
+tools, such as Nagios. The error code definitions are...
+<programlisting>private static final int USAGE_EXIT_CODE = 1;
 private static final int INIT_ERROR_EXIT_CODE = 2;
 private static final int TIMEOUT_ERROR_EXIT_CODE = 3;
 private static final int ERROR_EXIT_CODE = 4;</programlisting>
-Here are some examples based on the following given case. There are two HTable called test-01 and test-02, they have two column family cf1 and cf2 respectively, and deployed on the 3 regionservers. see following table.
-<table>
-<tgroup cols='3' align='center' colsep='1' rowsep='1'><colspec colname='regionserver' align='center'/><colspec colname='test-01' align='center'/><colspec colname='test-02' align='center'/>
-<thead>
-<row><entry>RegionServer</entry><entry>test-01</entry><entry>test-02</entry></row>
-</thead><tbody>
-<row><entry>rs1</entry><entry>r1</entry> <entry>r2</entry></row>
-<row><entry>rs2</entry><entry>r2</entry> <entry></entry></row>
-<row><entry>rs3</entry><entry>r2</entry> <entry>r1</entry></row>
-</tbody></tgroup></table>
-Following are some examples based on the previous given case.
-</para>
+Here are some examples based on the following given case. There are two HTable called
+test-01 and test-02, they have two column family cf1 and cf2 respectively, and deployed on
+the 3 regionservers. see following table. <table>
+<tgroup cols="3" align="center" colsep="1" rowsep="1">
+<colspec colname="regionserver" align="center"/>
+<colspec colname="test-01" align="center"/>
+<colspec colname="test-02" align="center"/>
+<thead>
+<row>
+<entry>RegionServer</entry>
+<entry>test-01</entry>
+<entry>test-02</entry>
+</row>
+</thead>
+<tbody>
+<row>
+<entry>rs1</entry>
+<entry>r1</entry>
+<entry>r2</entry>
+</row>
+<row>
+<entry>rs2</entry>
+<entry>r2</entry>
+<entry/>
+</row>
+<row>
+<entry>rs3</entry>
+<entry>r2</entry>
+<entry>r1</entry>
+</row>
+</tbody>
+</tgroup>
+</table> Following are some examples based on the previous given case. </para>
 <section><title>Canary test for every column family (store) of every region of every table</title>
 <para>
 <programlisting>$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary</programlisting>

@@ -45,10 +45,9 @@
 </section>
 <section xml:id="perf.network">
 <title>Network</title>
-<para>
-Perhaps the most important factor in avoiding network issues degrading Hadoop and HBbase performance is the switching hardware
-that is used, decisions made early in the scope of the project can cause major problems when you double or triple the size of your cluster (or more).
-</para>
+<para> Perhaps the most important factor in avoiding network issues degrading Hadoop and HBase
+performance is the switching hardware that is used, decisions made early in the scope of the
+project can cause major problems when you double or triple the size of your cluster (or more). </para>
 <para>
 Important items to consider:
 <itemizedlist>

@@ -400,7 +399,7 @@ Deferred log flush can be configured on tables via <link
 <section xml:id="perf.hbase.client.regiongroup">
 <title>HBase Client: Group Puts by RegionServer</title>
 <para>In addition to using the writeBuffer, grouping <classname>Put</classname>s by RegionServer can reduce the number of client RPC calls per writeBuffer flush.
-There is a utility <classname>HTableUtil</classname> currently on TRUNK that does this, but you can either copy that or implement your own verison for
+There is a utility <classname>HTableUtil</classname> currently on TRUNK that does this, but you can either copy that or implement your own version for
 those still on 0.90.x or earlier.
 </para>
 </section>

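A hedged sketch of the grouping idea this hunk describes, written against the 0.9x-era client API rather than HTableUtil itself; the puts list, the open table, and the hostname:port bucketing key are assumptions of the sketch:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.client.Put;

// Bucket Puts by the RegionServer hosting each row, then flush per bucket
// so each writeBuffer flush talks to a single server.
Map<String, List<Put>> putsByServer = new HashMap<String, List<Put>>();
for (Put put : puts) {
  HRegionLocation loc = table.getRegionLocation(put.getRow());
  String server = loc.getHostname() + ":" + loc.getPort();
  List<Put> bucket = putsByServer.get(server);
  if (bucket == null) {
    bucket = new ArrayList<Put>();
    putsByServer.put(server, bucket);
  }
  bucket.add(put);
}
for (List<Put> bucket : putsByServer.values()) {
  table.put(bucket);     // queue one server's worth of Puts
  table.flushCommits();  // flush them as a batch
}
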
@@ -141,7 +141,7 @@ admin.enableTable(table);
 </para>
 <para>See <xref linkend="keyvalue"/> for more information on HBase stores data internally to see why this is important.</para>
 </section>
-<section xml:id="keysize.atttributes"><title>Attributes</title>
+<section xml:id="keysize.attributes"><title>Attributes</title>
 <para>Although verbose attribute names (e.g., "myVeryImportantAttribute") are easier to read, prefer shorter attribute names (e.g., "via")
 to store in HBase.
 </para>

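Because qualifier bytes are stored with every cell, the advice above has a direct cost; a small illustrative sketch (the row key, family name, and value variable are assumptions):

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// "via" is repeated in every cell it names; "myVeryImportantAttribute" would be too.
Put put = new Put(Bytes.toBytes("rowkey-1"));             // illustrative row key
put.add(Bytes.toBytes("d"), Bytes.toBytes("via"), value); // value assumed in scope
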
@@ -335,10 +335,11 @@ public static byte[][] getHexSplits(String startKey, String endKey, int numRegio
 <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Result.html">Result</link>, so anything that can be
 converted to an array of bytes can be stored as a value. Input could be strings, numbers, complex objects, or even images as long as they can rendered as bytes.
 </para>
-<para>There are practical limits to the size of values (e.g., storing 10-50MB objects in HBase would probably be too much to ask);
-search the mailling list for conversations on this topic. All rows in HBase conform to the <xref linkend="datamodel">datamodel</xref>, and
-that includes versioning. Take that into consideration when making your design, as well as block size for the ColumnFamily.
-</para>
+<para>There are practical limits to the size of values (e.g., storing 10-50MB objects in HBase
+would probably be too much to ask); search the mailing list for conversations on this topic.
+All rows in HBase conform to the <xref linkend="datamodel">datamodel</xref>, and that includes
+versioning. Take that into consideration when making your design, as well as block size for
+the ColumnFamily. </para>
 <section xml:id="counters">
 <title>Counters</title>
 <para>

@@ -396,10 +397,11 @@ public static byte[][] getHexSplits(String startKey, String endKey, int numRegio
 ... and solutions are also influenced by the size of the cluster and how much processing power you have to throw at the solution.
 Common techniques are in sub-sections below. This is a comprehensive, but not exhaustive, list of approaches.
 </para>
-<para>It should not be a surprise that secondary indexes require additional cluster space and processing.
-This is precisely what happens in an RDBMS because the act of creating an alternate index requires both space and processing cycles to update. RBDMS products
-are more advanced in this regard to handle alternative index management out of the box. However, HBase scales better at larger data volumes, so this is a feature trade-off.
-</para>
+<para>It should not be a surprise that secondary indexes require additional cluster space and
+processing. This is precisely what happens in an RDBMS because the act of creating an
+alternate index requires both space and processing cycles to update. RDBMS products are more
+advanced in this regard to handle alternative index management out of the box. However, HBase
+scales better at larger data volumes, so this is a feature trade-off. </para>
 <para>Pay attention to <xref linkend="performance"/> when implementing any of these approaches.</para>
 <para>Additionally, see the David Butler response in this dist-list thread <link xlink:href="http://search-hadoop.com/m/nvbiBp2TDP/Stargate%252Bhbase&subj=Stargate+hbase">HBase, mail # user - Stargate+hbase</link>
 </para>

@@ -765,10 +767,11 @@ reasonable spread in the keyspace, similar options appear:
 tall and wide tables. These are general guidelines and not laws - each application must consider its own needs.
 </para>
 <section xml:id="schema.smackdown.rowsversions"><title>Rows vs. Versions</title>
-<para>A common question is whether one should prefer rows or HBase's built-in-versioning. The context is typically where there are
-"a lot" of versions of a row to be retained (e.g., where it is significantly above the HBase default of 1 max versions). The
-rows-approach would require storing a timstamp in some portion of the rowkey so that they would not overwite with each successive update.
-</para>
+<para>A common question is whether one should prefer rows or HBase's built-in-versioning. The
+context is typically where there are "a lot" of versions of a row to be retained (e.g.,
+where it is significantly above the HBase default of 1 max versions). The rows-approach
+would require storing a timestamp in some portion of the rowkey so that they would not
+overwite with each successive update. </para>
 <para>Preference: Rows (generally speaking).
 </para>
 </section>

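The rows-approach above amounts to a composite rowkey; a hedged sketch of one common layout (the entity id and the reversed-timestamp trick are illustrative, not prescribed by the patch):

import org.apache.hadoop.hbase.util.Bytes;

// Entity id plus a reversed timestamp: every update lands on a new row,
// and the most recent update sorts first within that entity's key range.
byte[] rowKey = Bytes.add(
    Bytes.toBytes("entity-42"), // hypothetical id
    Bytes.toBytes(Long.MAX_VALUE - System.currentTimeMillis()));
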
@@ -175,18 +175,16 @@ hbase(main):018:0>
 </section>
 
 <section><title><filename>irbrc</filename></title>
-<para>Create an <filename>.irbrc</filename> file for yourself in your
-home directory. Add customizations. A useful one is
-command history so commands are save across Shell invocations:
-<programlisting>
+<para>Create an <filename>.irbrc</filename> file for yourself in your home
+directory. Add customizations. A useful one is command history so commands are save
+across Shell invocations:
+<programlisting>
 $ more .irbrc
 require 'irb/ext/save-history'
 IRB.conf[:SAVE_HISTORY] = 100
 IRB.conf[:HISTORY_FILE] = "#{ENV['HOME']}/.irb-save-history"</programlisting>
-See the <application>ruby</application> documentation of
-<filename>.irbrc</filename> to learn about other possible
-confiurations.
-</para>
+See the <application>ruby</application> documentation of <filename>.irbrc</filename>
+to learn about other possible configurations. </para>
 </section>
 <section><title>LOG data to timestamp</title>
 <para>

@@ -620,18 +620,21 @@ Harsh J investigated the issue as part of the mailing list thread
 </section>
 <section xml:id="trouble.client.oome.directmemory.leak">
 <title>Client running out of memory though heap size seems to be stable (but the off-heap/direct heap keeps growing)</title>
-<para>
-You are likely running into the issue that is described and worked through in
-the mail thread <link xhref="http://search-hadoop.com/m/ubhrX8KvcH/Suspected+memory+leak&subj=Re+Suspected+memory+leak">HBase, mail # user - Suspected memory leak</link>
-and continued over in <link xhref="http://search-hadoop.com/m/p2Agc1Zy7Va/MaxDirectMemorySize+Was%253A+Suspected+memory+leak&subj=Re+FeedbackRe+Suspected+memory+leak">HBase, mail # dev - FeedbackRe: Suspected memory leak</link>.
-A workaround is passing your client-side JVM a reasonable value for <code>-XX:MaxDirectMemorySize</code>. By default,
-the <varname>MaxDirectMemorySize</varname> is equal to your <code>-Xmx</code> max heapsize setting (if <code>-Xmx</code> is set).
-Try seting it to something smaller (for example, one user had success setting it to <code>1g</code> when
-they had a client-side heap of <code>12g</code>). If you set it too small, it will bring on <code>FullGCs</code> so keep
-it a bit hefty. You want to make this setting client-side only especially if you are running the new experiemental
-server-side off-heap cache since this feature depends on being able to use big direct buffers (You may have to keep
-separate client-side and server-side config dirs).
-</para>
+<para> You are likely running into the issue that is described and worked through in the
+mail thread <link
+xhref="http://search-hadoop.com/m/ubhrX8KvcH/Suspected+memory+leak&subj=Re+Suspected+memory+leak"
+>HBase, mail # user - Suspected memory leak</link> and continued over in <link
+xhref="http://search-hadoop.com/m/p2Agc1Zy7Va/MaxDirectMemorySize+Was%253A+Suspected+memory+leak&subj=Re+FeedbackRe+Suspected+memory+leak"
+>HBase, mail # dev - FeedbackRe: Suspected memory leak</link>. A workaround is passing
+your client-side JVM a reasonable value for <code>-XX:MaxDirectMemorySize</code>. By
+default, the <varname>MaxDirectMemorySize</varname> is equal to your <code>-Xmx</code> max
+heapsize setting (if <code>-Xmx</code> is set). Try seting it to something smaller (for
+example, one user had success setting it to <code>1g</code> when they had a client-side heap
+of <code>12g</code>). If you set it too small, it will bring on <code>FullGCs</code> so keep
+it a bit hefty. You want to make this setting client-side only especially if you are running
+the new experimental server-side off-heap cache since this feature depends on being able to
+use big direct buffers (You may have to keep separate client-side and server-side config
+dirs). </para>
 </section>
 <section xml:id="trouble.client.slowdown.admin">
 <title>Client Slowdown When Calling Admin Methods (flush, compact, etc.)</title>

@@ -728,15 +731,17 @@ Caused by: java.io.FileNotFoundException: File _partition.lst does not exist.
 </section>
 <section xml:id="trouble.namenode.hbase.objects">
 <title>Browsing HDFS for HBase Objects</title>
-<para>Somtimes it will be necessary to explore the HBase objects that exist on HDFS. These objects could include the WALs (Write Ahead Logs), tables, regions, StoreFiles, etc.
-The easiest way to do this is with the NameNode web application that runs on port 50070. The NameNode web application will provide links to the all the DataNodes in the cluster so that
-they can be browsed seamlessly. </para>
+<para>Sometimes it will be necessary to explore the HBase objects that exist on HDFS.
+These objects could include the WALs (Write Ahead Logs), tables, regions, StoreFiles, etc.
+The easiest way to do this is with the NameNode web application that runs on port 50070. The
+NameNode web application will provide links to the all the DataNodes in the cluster so that
+they can be browsed seamlessly. </para>
 <para>The HDFS directory structure of HBase tables in the cluster is...
 <programlisting>
 <filename>/hbase</filename>
 <filename>/<Table></filename> (Tables in the cluster)
 <filename>/<Region></filename> (Regions for the table)
-<filename>/<ColumnFamiy></filename> (ColumnFamilies for the Region for the table)
+<filename>/<ColumnFamily></filename> (ColumnFamilies for the Region for the table)
 <filename>/<StoreFile></filename> (StoreFiles for the ColumnFamily for the Regions for the table)
 </programlisting>
 </para>

@@ -27,9 +27,8 @@
 */
 -->
 <title>Upgrading</title>
-<para>You cannot skip major verisons upgrading. If you are upgrading from
-version 0.90.x to 0.94.x, you must first go from 0.90.x to 0.92.x and then go
-from 0.92.x to 0.94.x.</para>
+<para>You cannot skip major versions upgrading. If you are upgrading from version 0.90.x to
+0.94.x, you must first go from 0.90.x to 0.92.x and then go from 0.92.x to 0.94.x.</para>
 <note><para>It may be possible to skip across versions -- for example go from
 0.92.2 straight to 0.98.0 just following the 0.96.x upgrade instructions --
 but we have not tried it so cannot say whether it works or not.</para>