Add note that can OOME if many regions and MSLAB on
git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1406306 13f79535-47bb-0310-9956-ffa450edef68
commit 1ed61755ad (parent 9b3e63dcb7)
@@ -47,7 +47,7 @@
<title>Network</title>
<para>
Perhaps the most important factor in avoiding network issues degrading Hadoop and HBase performance is the switching hardware
that is used; decisions made early in the scope of the project can cause major problems when you double or triple the size of your cluster (or more).
</para>
<para>
Important items to consider:
@@ -59,15 +59,15 @@
</para>
<section xml:id="perf.network.1switch">
<title>Single Switch</title>
<para>The single most important factor in this configuration is that the switching capacity of the hardware is capable of
handling the traffic that can be generated by all systems connected to the switch. Some lower-priced commodity hardware
can have a lower switching capacity than all of its ports could utilize at full speed.
</para>
</section>
<section xml:id="perf.network.2switch">
<title>Multiple Switches</title>
<para>Multiple switches are a potential pitfall in the architecture. The most common configuration of lower-priced hardware is a
simple 1Gbps uplink from one switch to another. This often-overlooked pinch point can easily become a bottleneck for cluster communication.
Especially with MapReduce jobs that are both reading and writing a lot of data, communication across this uplink can become saturated.
</para>
<para>Mitigation of this issue is fairly simple and can be accomplished in multiple ways:
@@ -85,10 +85,10 @@
<listitem>Poor switch capacity performance</listitem>
<listitem>Insufficient uplink to another rack</listitem>
</itemizedlist>
If the switches in your rack have appropriate switching capacity to handle all the hosts at full speed, the next most likely issue arises from homing
more of your cluster across racks. The easiest way to avoid issues when spanning multiple racks is to use port trunking to create a bonded uplink to other racks.
The downside of this method, however, is the overhead of ports that could otherwise serve hosts. For example, creating an 8Gbps port channel from rack
A to rack B uses 8 of your 24 ports for inter-rack communication, which is a poor ROI; using too few, however, can mean you're not getting the most out of your cluster.
</para>
<para>Using 10GbE links between racks will greatly increase performance, and assuming your switches support a 10GbE uplink or allow for an expansion card, this will let you
save your ports for machines as opposed to uplinks.
@@ -128,7 +128,14 @@
slides for background and detail<footnote><para>The latest JVMs do better with
regard to fragmentation, so make sure you are running a recent release.
Read down in the message,
<link xlink:href="http://osdir.com/ml/hotspot-gc-use/2011-11/msg00002.html">Identifying concurrent mode failures caused by fragmentation</link>.</para></footnote>.
Be aware that when enabled, each MemStore instance will occupy at least
one MSLAB chunk of memory. If you have thousands of regions, or lots
of regions each with many column families, this MSLAB allocation may
account for a good portion of your heap and in an extreme case cause
you to OOME. In that case, disable MSLAB, lower the amount of memory
it uses, or serve fewer regions per server.
</para>
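<para>As a rough sizing sketch, assuming the default 2MB MSLAB chunk size
(<varname>hbase.hregion.memstore.mslab.chunksize</varname>) and illustrative region counts:
a RegionServer hosting 1,000 regions with 3 column families each holds 3,000 MemStores,
so MSLAB alone pins at least 3,000 x 2MB = ~6GB of heap before any data is written.
</para>
<para>To disable MSLAB, a minimal <filename>hbase-site.xml</filename> entry would look like the
following (<varname>hbase.hregion.memstore.mslab.enabled</varname> defaults to <varname>true</varname> in recent releases):
<programlisting><![CDATA[<property>
  <name>hbase.hregion.memstore.mslab.enabled</name>
  <value>false</value>
</property>]]></programlisting>
</para>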
<para>For more information about GC logs, see <xref linkend="trouble.log.gc" />.
</para>
</section>
@@ -159,44 +166,44 @@
<section xml:id="perf.handlers">
<title><varname>hbase.regionserver.handler.count</varname></title>
<para>See <xref linkend="hbase.regionserver.handler.count"/>.
</para>
</section>
<section xml:id="perf.hfile.block.cache.size">
<title><varname>hfile.block.cache.size</varname></title>
<para>See <xref linkend="hfile.block.cache.size"/>.
A memory setting for the RegionServer process.
</para>
</section>
<section xml:id="perf.rs.memstore.upperlimit">
<title><varname>hbase.regionserver.global.memstore.upperLimit</varname></title>
<para>See <xref linkend="hbase.regionserver.global.memstore.upperLimit"/>.
This memory setting is often adjusted for the RegionServer process depending on needs.
</para>
</section>
<section xml:id="perf.rs.memstore.lowerlimit">
<title><varname>hbase.regionserver.global.memstore.lowerLimit</varname></title>
<para>See <xref linkend="hbase.regionserver.global.memstore.lowerLimit"/>.
This memory setting is often adjusted for the RegionServer process depending on needs.
</para>
</section>
<section xml:id="perf.hstore.blockingstorefiles">
<title><varname>hbase.hstore.blockingStoreFiles</varname></title>
<para>See <xref linkend="hbase.hstore.blockingStoreFiles"/>.
If there is blocking in the RegionServer logs, increasing this can help.
</para>
</section>
<section xml:id="perf.hregion.memstore.block.multiplier">
<title><varname>hbase.hregion.memstore.block.multiplier</varname></title>
<para>See <xref linkend="hbase.hregion.memstore.block.multiplier"/>.
If there is enough RAM, increasing this can help.
</para>
</section>
<section xml:id="hbase.regionserver.checksum.verify">
<title><varname>hbase.regionserver.checksum.verify</varname></title>
<para>Have HBase write the checksum into the datablock, saving
a checksum seek on every read. See the
release note on <link xlink:href="https://issues.apache.org/jira/browse/HBASE-5074">HBASE-5074 support checksums in HBase block cache</link>.
</para>
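<para>A minimal <filename>hbase-site.xml</filename> sketch enabling it (assumes a release that carries HBASE-5074; the property has historically been off by default in the 0.94 era):
<programlisting><![CDATA[<property>
  <name>hbase.regionserver.checksum.verify</name>
  <value>true</value>
</property>]]></programlisting>
</para>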
</section>
@@ -218,7 +225,7 @@ more discussion around short circuit reads.
</para>
<para>To enable "short circuit" reads, you must set two configurations.
First, the hdfs-site.xml needs to be amended. Set
the property <varname>dfs.block.local-path-access.user</varname>
to be the <emphasis>only</emphasis> user that can use the shortcut.
This has to be the user that started HBase. Then in hbase-site.xml,
set <varname>dfs.client.read.shortcircuit</varname> to be <varname>true</varname>
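<para>A minimal sketch of the two settings described above (the <varname>hbase</varname> user shown is an assumption; substitute whichever user starts HBase):
<programlisting><![CDATA[<!-- hdfs-site.xml -->
<property>
  <name>dfs.block.local-path-access.user</name>
  <value>hbase</value>
</property>

<!-- hbase-site.xml -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>]]></programlisting>
</para>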
@@ -241,19 +248,19 @@ the data will still be read.
</section>
<section xml:id="perf.schema">
<title>Schema Design</title>
<section xml:id="perf.number.of.cfs">
<title>Number of Column Families</title>
<para>See <xref linkend="number.of.cfs" />.</para>
</section>
<section xml:id="perf.schema.keys">
<title>Key and Attribute Lengths</title>
<para>See <xref linkend="keysize" />. See also <xref linkend="perf.compression.however" /> for
compression caveats.</para>
</section>
<section xml:id="schema.regionsize"><title>Table RegionSize</title>
<para>The regionsize can be set on a per-table basis via <code>setMaxFileSize</code> on
<link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HTableDescriptor.html">HTableDescriptor</link> in the
event where certain tables require different regionsizes than the configured default regionsize.
</para>
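<para>A minimal sketch (0.9x-era client API; the table name and size are illustrative):
<programlisting>HTableDescriptor desc = new HTableDescriptor("myTable");
// Split regions for this table at roughly 2GB instead of the cluster default.
desc.setMaxFileSize(2L * 1024 * 1024 * 1024);</programlisting>
</para>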
<para>See <xref linkend="perf.number.of.regions"/> for more information.
@@ -269,23 +276,23 @@ the data will still be read.
on each insert. If <varname>ROWCOL</varname>, the hash of the row +
column family + column family qualifier will be added to the bloom on
each key insert.</para>
<para>See <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</link> and
<xref linkend="blooms"/> for more information, or this answer on Quora,
<link xlink:href="http://www.quora.com/How-are-bloom-filters-used-in-HBase">How are bloom filters used in HBase?</link>.
</para>
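<para>A minimal sketch (0.9x-era API; the column family name is illustrative):
<programlisting>HColumnDescriptor cf = new HColumnDescriptor("cf");
// ROW blooms hash only the row key; ROWCOL also folds in the column qualifier.
cf.setBloomFilterType(StoreFile.BloomType.ROWCOL);</programlisting>
</para>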
</section>
<section xml:id="schema.cf.blocksize"><title>ColumnFamily BlockSize</title>
|
<section xml:id="schema.cf.blocksize"><title>ColumnFamily BlockSize</title>
|
||||||
<para>The blocksize can be configured for each ColumnFamily in a table, and this defaults to 64k. Larger cell values require larger blocksizes.
|
<para>The blocksize can be configured for each ColumnFamily in a table, and this defaults to 64k. Larger cell values require larger blocksizes.
|
||||||
There is an inverse relationship between blocksize and the resulting StoreFile indexes (i.e., if the blocksize is doubled then the resulting
|
There is an inverse relationship between blocksize and the resulting StoreFile indexes (i.e., if the blocksize is doubled then the resulting
|
||||||
indexes should be roughly halved).
|
indexes should be roughly halved).
|
||||||
</para>
|
</para>
|
||||||
<para>See <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</link>
|
<para>See <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</link>
|
||||||
and <xref linkend="store"/>for more information.
|
and <xref linkend="store"/>for more information.
|
||||||
</para>
|
</para>
|
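<para>For example, a sketch that doubles the default 64k blocksize for a scan-heavy family (the value is illustrative):
<programlisting>HColumnDescriptor cf = new HColumnDescriptor("cf");
// 128k blocks: roughly half the index entries, at the cost of coarser random reads.
cf.setBlocksize(128 * 1024);</programlisting>
</para>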
</section>
<section xml:id="cf.in.memory">
|
<section xml:id="cf.in.memory">
|
||||||
<title>In-Memory ColumnFamilies</title>
|
<title>In-Memory ColumnFamilies</title>
|
||||||
<para>ColumnFamilies can optionally be defined as in-memory. Data is still persisted to disk, just like any other ColumnFamily.
|
<para>ColumnFamilies can optionally be defined as in-memory. Data is still persisted to disk, just like any other ColumnFamily.
|
||||||
In-memory blocks have the highest priority in the <xref linkend="block.cache" />, but it is not a guarantee that the entire table
|
In-memory blocks have the highest priority in the <xref linkend="block.cache" />, but it is not a guarantee that the entire table
|
||||||
will be in memory.
|
will be in memory.
|
||||||
</para>
|
</para>
|
||||||
|
@@ -297,17 +304,17 @@ the data will still be read.
<para>Production systems should use compression with their ColumnFamily definitions. See <xref linkend="compression" /> for more information.
</para>
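<para>A minimal sketch (0.9x-era API; assumes the Snappy native libraries are installed on the cluster):
<programlisting>HColumnDescriptor cf = new HColumnDescriptor("cf");
// Compresses StoreFiles on disk only; see the caveat below about memory and the wire.
cf.setCompressionType(Compression.Algorithm.SNAPPY);</programlisting>
</para>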
<section xml:id="perf.compression.however"><title>However...</title>
<para>Compression deflates data <emphasis>on disk</emphasis>. When it's in-memory (e.g., in the
MemStore) or on the wire (e.g., transferring between RegionServer and Client) it's inflated.
So while using ColumnFamily compression is a best practice, it's not going to completely eliminate
the impact of over-sized Keys, over-sized ColumnFamily names, or over-sized Column names.
</para>
<para>See <xref linkend="keysize" /> for schema design tips, and <xref linkend="keyvalue"/> for more information on how HBase stores data internally.
</para>
</section>
</section>
</section> <!-- perf schema -->
<section xml:id="perf.writing">
<title>Writing to HBase</title>
@@ -335,7 +342,7 @@ throws IOException {
  } catch (TableExistsException e) {
    logger.info("table " + table.getNameAsString() + " already exists");
    // the table already exists...
    return false;
  }
}
@@ -360,7 +367,7 @@ public static byte[][] getHexSplits(String startKey, String endKey, int numRegio
Table Creation: Deferred Log Flush
</title>
<para>
The default behavior for Puts using the Write Ahead Log (WAL) is that <classname>HLog</classname> edits will be written immediately. If deferred log flush is used,
WAL edits are kept in memory until the flush period. The benefit is aggregated and asynchronous <classname>HLog</classname> writes, but the potential downside is that if
the RegionServer goes down the yet-to-be-flushed edits are lost. This is, however, safer than not using WAL at all with Puts.
</para>
@@ -368,7 +375,7 @@ WAL edits are kept in memory until the flush period. The benefit is aggregated
Deferred log flush can be configured on tables via <link
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HTableDescriptor.html">HTableDescriptor</link>. The default value of <varname>hbase.regionserver.optionallogflushinterval</varname> is 1000ms.
</para>
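<para>A minimal sketch (0.9x-era API; the table name is illustrative):
<programlisting>HTableDescriptor desc = new HTableDescriptor("myTable");
// Trade a window of edit loss on RegionServer crash for aggregated, asynchronous HLog writes.
desc.setDeferredLogFlush(true);</programlisting>
</para>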
</section>

<section xml:id="perf.hbase.client.autoflush">
<title>HBase Client: AutoFlush</title>
|
@ -394,25 +401,25 @@ Deferred log flush can be configured on tables via <link
|
||||||
it makes little difference if your load is well distributed across the cluster.
|
it makes little difference if your load is well distributed across the cluster.
|
||||||
</para>
|
</para>
|
||||||
<para>In general, it is best to use WAL for Puts, and where loading throughput
|
<para>In general, it is best to use WAL for Puts, and where loading throughput
|
||||||
is a concern to use <link linkend="perf.batch.loading">bulk loading</link> techniques instead.
|
is a concern to use <link linkend="perf.batch.loading">bulk loading</link> techniques instead.
|
||||||
</para>
|
</para>
|
||||||
</section>
<section xml:id="perf.hbase.client.regiongroup">
<title>HBase Client: Group Puts by RegionServer</title>
<para>In addition to using the writeBuffer, grouping <classname>Put</classname>s by RegionServer can reduce the number of client RPC calls per writeBuffer flush.
There is a utility <classname>HTableUtil</classname> currently on TRUNK that does this, but you can either copy that or implement your own version for
those still on 0.90.x or earlier.
</para>
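<para>A sketch of the grouped write (the method name follows the TRUNK <classname>HTableUtil</classname>; treat the exact signature as an assumption on older releases):
<programlisting><![CDATA[List<Put> puts = new ArrayList<Put>();
// ... add Puts for many rows across the table ...
// Buckets the Puts by hosting RegionServer, then flushes one group at a time.
HTableUtil.bucketRsPut(htable, puts);]]></programlisting>
</para>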
</section>
<section xml:id="perf.hbase.write.mr.reducer">
<title>MapReduce: Skip The Reducer</title>
<para>When writing a lot of data to an HBase table from a MR job (e.g., with <link
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.html">TableOutputFormat</link>), and specifically where Puts are being emitted
from the Mapper, skip the Reducer step. When a Reducer step is used, all of the output (Puts) from the Mapper will get spooled to disk, then sorted/shuffled to other
Reducers that will most likely be off-node. It's far more efficient to just write directly to HBase.
</para>
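<para>A minimal job-setup sketch (class and table names other than <classname>TableOutputFormat</classname> are illustrative):
<programlisting>TableMapReduceUtil.initTableMapperJob(sourceTable, scan,
    MyPutEmittingMapper.class, null, null, job);       // Mapper emits Puts directly
TableMapReduceUtil.initTableReducerJob(targetTable, null, job);
job.setNumReduceTasks(0);  // no Reducer: skip the spool/sort/shuffle entirely</programlisting>
</para>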
<para>For summary jobs where HBase is used as a source and a sink, writes will come from the Reducer step (e.g., summarize values and then write out the result).
This is a different processing problem from the above case.
</para>
</section>
@@ -421,16 +428,16 @@ Deferred log flush can be configured on tables via <link
<para>If all your data is being written to one region at a time, then re-read the
section on processing <link linkend="timeseries">timeseries</link> data.</para>
<para>Also, if you are pre-splitting regions and all your data is <emphasis>still</emphasis> winding up in a single region even though
your keys aren't monotonically increasing, confirm that your keyspace actually works with the split strategy. There are a
variety of reasons that regions may appear "well split" but won't work with your data. As
the HBase client communicates directly with the RegionServers, the region hosting a given row key can be obtained via
<link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#getRegionLocation%28byte[]%29">HTable.getRegionLocation</link>.
</para>
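<para>For example (the row key is illustrative; spot-check several keys across your keyspace to confirm they land in different regions):
<programlisting>HTable htable = new HTable(conf, "myTable");  // conf: your cluster Configuration
HRegionLocation loc = htable.getRegionLocation(Bytes.toBytes("row-0042"));
System.out.println(loc);  // prints the region name and its hosting server</programlisting>
</para>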
<para>See <xref linkend="precreate.regions"/>, as well as <xref linkend="perf.configurations"/>.</para>
</section>

</section> <!-- writing -->

<section xml:id="perf.reading">
<title>Reading from HBase</title>
@@ -452,7 +459,7 @@ Deferred log flush can be configured on tables via <link
<para>Scan settings in MapReduce jobs deserve special attention. Timeouts can result (e.g., UnknownScannerException)
in Map tasks if it takes longer to process a batch of records than the scanner lease allows before the client goes back to the RegionServer for the
next set of data. This problem can occur because there is non-trivial processing occurring per row. If you process
rows quickly, set caching higher. If you process rows more slowly (e.g., lots of transformations per row, writes),
then set caching lower.
</para>
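<para>For example, a sketch of setting caching on the Scan handed to a MR job (the value is illustrative; weigh it against the tradeoff above):
<programlisting>Scan scan = new Scan();
scan.setCaching(500);        // rows fetched per trip to the RegionServer; the default is 1
scan.setCacheBlocks(false);  // usually recommended for MR scans
TableMapReduceUtil.initTableMapperJob(tableName, scan, MyMapper.class, null, null, job);</programlisting>
</para>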
<para>Timeouts can also happen in a non-MapReduce use case (i.e., single threaded HBase client doing a Scan), but the
@@ -472,8 +479,8 @@ Deferred log flush can be configured on tables via <link
</section>
<section xml:id="perf.hbase.mr.input">
<title>MapReduce - Input Splits</title>
<para>For MapReduce jobs that use HBase tables as a source, if there is a pattern where the "slow" map tasks seem to
have the same Input Split (i.e., the RegionServer serving the data), see the
Troubleshooting Case Study in <xref linkend="casestudies.slownode"/>.
</para>
</section>
@@ -522,9 +529,9 @@ htable.close();</programlisting></para>
</section>
<section xml:id="perf.hbase.read.dist">
<title>Concurrency: Monitor Data Spread</title>
<para>When performing a high number of concurrent reads, monitor the data spread of the target tables. If the target table(s) have
too few regions, the reads are likely to be served by too few nodes.</para>
<para>See <xref linkend="precreate.regions"/>, as well as <xref linkend="perf.configurations"/>.</para>
</section>
<section xml:id="blooms">
<title>Bloom Filters</title>
@@ -554,7 +561,7 @@ htable.close();</programlisting></para>
</footnote></para>
<para>See also <xref linkend="schema.bloom" />.
</para>

<section xml:id="bloom_footprint">
<title>Bloom StoreFile footprint</title>
@@ -584,7 +591,7 @@ htable.close();</programlisting></para>
data. Obtained on-demand. Stored in the LRU cache, if it is enabled
(it's enabled by default).</para>
</section>
</section>
<section xml:id="config.bloom">
<title>Bloom Filter Configuration</title>
<section>
@@ -615,10 +622,10 @@ htable.close();</programlisting></para>
in HBase</link> for more on what this option means.</para>
</section>
</section>
</section> <!-- bloom -->

</section> <!-- reading -->

<section xml:id="perf.deleting">
<title>Deleting from HBase</title>
<section xml:id="perf.deleting.queue">
@@ -647,20 +654,20 @@ htable.close();</programlisting></para>
<section xml:id="perf.hdfs.curr"><title>Current Issues With Low-Latency Reads</title>
<para>The original use-case for HDFS was batch processing. As such, low-latency reads were historically not a priority.
With the increased adoption of Apache HBase this is changing, and several improvements are already in development.
See the
<link xlink:href="https://issues.apache.org/jira/browse/HDFS-1599">Umbrella Jira Ticket for HDFS Improvements for HBase</link>.
</para>
</section>
<section xml:id="perf.hdfs.comp"><title>Performance Comparisons of HBase vs. HDFS</title>
<para>A fairly common question on the dist-list is why HBase isn't as performant as HDFS files in a batch context (e.g., as
a MapReduce source or sink). The short answer is that HBase is doing a lot more than HDFS (e.g., reading the KeyValues,
returning the most current row or specified timestamps, etc.), and as such HBase is 4-5 times slower than HDFS in this
processing context. Not that there isn't room for improvement (and this gap will, over time, be reduced), but HDFS
will always be faster in this use-case.
</para>
</section>
</section>
<section xml:id="perf.ec2"><title>Amazon EC2</title>
<para>Performance questions are common on Amazon EC2 environments because it is a shared environment. You will
not see the same throughput as a dedicated server. In terms of running tests on EC2, run them several times for the same
@@ -670,7 +677,7 @@ htable.close();</programlisting></para>
because EC2 issues are practically a separate class of performance issues.
</para>
</section>

<section xml:id="perf.casestudy"><title>Case Studies</title>
<para>For Performance and Troubleshooting Case Studies, see <xref linkend="casestudies"/>.
</para>