Apply feedback to MOB docs
parent
b889339006
commit
c4437e2516
|
@ -29,15 +29,30 @@
|
||||||
:toc: left
|
:toc: left
|
||||||
:source-language: java
|
:source-language: java
|
||||||
|
|
||||||
Data comes in many sizes, and saving all of your data in HBase, including binary data such as images and documents, is ideal. HBase can technically handle binary objects with cells that are up to 10MB in size. However, HBase's normal read and write paths are optimized for values smaller than 100KB in size. When HBase deals with large numbers of values up to 10MB, referred to here as medium objects, or MOBs, performance is degraded due to write amplification caused by splits and compactions. HBase ***FIX_VERSION_NUMBER*** adds support for better managing large numbers of MOBs while maintaining performance, consistency, and low operational overhead. MOB support is provided by the work done in link:https://issues.apache.org/jira/browse/HBASE-11339[HBASE-11339].
|
Data comes in many sizes, and saving all of your data in HBase, including binary
|
||||||
|
data such as images and documents, is ideal. While HBase can technically handle
|
||||||
To take advantage of MOB, you need to use <<hfilev3,HFile version 3>>. Optionally, configure the MOB file reader's cache settings for each RegionServer (see <<mob.cache.configure>>), then configure specific columns to hold MOB data.
|
binary objects with cells that are larger than 100 KB in size, HBase's normal
|
||||||
|
read and write paths are optimized for values smaller than 100KB in size. When
|
||||||
Client code does not need to change to take advantage of HBase MOB support. The feature is transparent to the client.
|
HBase deals with large numbers of objects over this threshold, referred to here
|
||||||
|
as medium objects, or MOBs, performance is degraded due to write amplification
|
||||||
|
caused by splits and compactions. When using MOBs, ideally your objects will be between
|
||||||
|
100KB and 10MB. HBase ***FIX_VERSION_NUMBER*** adds support
|
||||||
|
for better managing large numbers of MOBs while maintaining performance,
|
||||||
|
consistency, and low operational overhead. MOB support is provided by the work
|
||||||
|
done in link:https://issues.apache.org/jira/browse/HBASE-11339[HBASE-11339]. To
|
||||||
|
take advantage of MOB, you need to use <<hfilev3,HFile version 3>>. Optionally,
|
||||||
|
configure the MOB file reader's cache settings for each RegionServer (see
|
||||||
|
<<mob.cache.configure>>), then configure specific columns to hold MOB data.
|
||||||
|
Client code does not need to change to take advantage of HBase MOB support. The
|
||||||
|
feature is transparent to the client.
|
||||||
|
|
||||||
=== Configuring Columns for MOB
|
=== Configuring Columns for MOB
|
||||||
|
|
||||||
You can configure columns to support MOB during table creation or alteration, either in HBase Shell or via the Java API. The two relevant properties are the boolean `IS_MOB` and the `MOB_THRESHOLD`, which is the number of bytes at which an object is considered to be a MOB. Only `IS_MOB` is required. If you do not specify the `MOB_THRESHOLD`, the default threshold value of 100 kb is used.
|
You can configure columns to support MOB during table creation or alteration,
|
||||||
|
either in HBase Shell or via the Java API. The two relevant properties are the
|
||||||
|
boolean `IS_MOB` and the `MOB_THRESHOLD`, which is the number of bytes at which
|
||||||
|
an object is considered to be a MOB. Only `IS_MOB` is required. If you do not
|
||||||
|
specify the `MOB_THRESHOLD`, the default threshold value of 100 KB is used.
|
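The threshold rule described above can be illustrated with a minimal, self-contained sketch. This is not HBase code: the class `MobThresholdSketch` and its `isMob` helper are hypothetical names introduced only for illustration. The sketch assumes that a value counts as a MOB only when it is strictly larger than the threshold, which is an assumption about the boundary case rather than something stated in this section.

```java
// Hypothetical illustration of the MOB_THRESHOLD rule; not part of the HBase API.
class MobThresholdSketch {
    // The documented default of 100 KB, expressed in bytes.
    static final long DEFAULT_MOB_THRESHOLD = 100L * 1024L;

    // Assumption: a value is treated as a MOB only when it is strictly
    // larger than the configured threshold.
    static boolean isMob(long valueLengthBytes, long thresholdBytes) {
        return valueLengthBytes > thresholdBytes;
    }

    // Convenience overload that applies the default threshold,
    // mirroring a column where only IS_MOB is set.
    static boolean isMob(long valueLengthBytes) {
        return isMob(valueLengthBytes, DEFAULT_MOB_THRESHOLD);
    }

    public static void main(String[] args) {
        long thumbnail = 20L * 1024L;        // 20 KB value
        long scannedPage = 2L * 1024L * 1024L; // 2 MB value
        System.out.println("20 KB value is MOB: " + isMob(thumbnail));
        System.out.println("2 MB value is MOB: " + isMob(scannedPage));
    }
}
```

In a real table the threshold is set through the `MOB_THRESHOLD` column property rather than in client code; the sketch only shows how a byte count compares against it.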
||||||
|
|
||||||
.Configure a Column for MOB Using HBase Shell
|
.Configure a Column for MOB Using HBase Shell
|
||||||
====
|
====
|
||||||
|
@ -102,17 +117,19 @@ Because there can be a large number of MOB files at any time, as compared to the
|
||||||
<name>hbase.mob.cache.evict.period</name>
|
<name>hbase.mob.cache.evict.period</name>
|
||||||
<value>3600</value>
|
<value>3600</value>
|
||||||
<description>
|
<description>
|
||||||
The amount of time in seconds before the mob cache evicts cached mob files.
|
The amount of time in seconds after which an unused file is evicted from the
|
||||||
The default value is 3600 seconds.
|
MOB cache. The default value is 3600 seconds.
|
||||||
</description>
|
</description>
|
||||||
</property>
|
</property>
|
||||||
<property>
|
<property>
|
||||||
<name>hbase.mob.cache.evict.remain.ratio</name>
|
<name>hbase.mob.cache.evict.remain.ratio</name>
|
||||||
<value>0.5f</value>
|
<value>0.5f</value>
|
||||||
<description>
|
<description>
|
||||||
The ratio (between 0.0 and 1.0) of files that remains cached after an eviction
|
A multiplier (between 0.0 and 1.0), which determines how many files remain cached
|
||||||
is triggered when the number of cached mob files exceeds the hbase.mob.file.cache.size.
|
after a cache eviction occurs,
|
||||||
The default value is 0.5f.
|
which is triggered by reaching the `hbase.mob.file.cache.size` threshold.
|
||||||
|
The default value is 0.5f, which means that half the files (the least-recently-used
|
||||||
|
ones) are evicted.
|
||||||
</description>
|
</description>
|
||||||
</property>
|
</property>
|
||||||
----
|
----
|
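The interaction between `hbase.mob.file.cache.size` and `hbase.mob.cache.evict.remain.ratio` can be sketched in plain Java. This is a simplified model, not the RegionServer's implementation: the class `MobFileCacheSketch` and its methods are hypothetical, and the sketch assumes that eviction only acts once the number of cached files exceeds the configured size and that the number of files kept is the size multiplied by the ratio, rounded down.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;

// Hypothetical model of the MOB file cache eviction policy; not HBase code.
class MobFileCacheSketch {
    private final int maxSize;        // stands in for hbase.mob.file.cache.size
    private final float remainRatio;  // stands in for hbase.mob.cache.evict.remain.ratio
    // accessOrder=true keeps entries ordered from least to most recently used.
    private final LinkedHashMap<String, Boolean> files =
        new LinkedHashMap<>(16, 0.75f, true);

    MobFileCacheSketch(int maxSize, float remainRatio) {
        if (maxSize < 0 || remainRatio < 0.0f || remainRatio > 1.0f) {
            throw new IllegalArgumentException(
                "maxSize must be >= 0 and remainRatio must be within [0.0, 1.0]");
        }
        this.maxSize = maxSize;
        this.remainRatio = remainRatio;
    }

    // Opening or re-reading a file marks it as most recently used.
    void access(String fileName) {
        files.put(fileName, Boolean.TRUE);
    }

    boolean isCached(String fileName) {
        return files.containsKey(fileName);
    }

    int size() {
        return files.size();
    }

    // Evicts least-recently-used files until only maxSize * remainRatio remain.
    // Does nothing while the cache holds no more than maxSize files (assumption).
    List<String> evict() {
        List<String> evicted = new ArrayList<>();
        if (files.size() <= maxSize) {
            return evicted;
        }
        int keep = (int) (maxSize * remainRatio);
        Iterator<String> it = files.keySet().iterator();
        while (files.size() > keep && it.hasNext()) {
            evicted.add(it.next());
            it.remove();
        }
        return evicted;
    }
}
```

With a size of 4 and a ratio of 0.5f, a fifth cached file pushes the cache over its limit, and an eviction pass then keeps only the two most recently used files.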
||||||
|
@ -122,7 +139,12 @@ Because there can be a large number of MOB files at any time, as compared to the
|
||||||
|
|
||||||
==== Manually Compacting MOB Files
|
==== Manually Compacting MOB Files
|
||||||
|
|
||||||
To manually compact MOB files, rather than waiting for the <<mob.cache.configure,configuration>> to trigger compaction, use the `compact_mob` or `major_compact_mob` HBase shell commands. These commands require the first argument to be the table name, and take an optional column family as the second argument. If the column family is omitted, all MOB-enabled column families are compacted.
|
To manually compact MOB files, rather than waiting for the
|
||||||
|
<<mob.cache.configure,configuration>> to trigger compaction, use the
|
||||||
|
`compact_mob` or `major_compact_mob` HBase shell commands. These commands
|
||||||
|
require the first argument to be the table name, and take an optional column
|
||||||
|
family as the second argument. If the column family is omitted, all MOB-enabled
|
||||||
|
column families are compacted.
|
||||||
|
|
||||||
----
|
----
|
||||||
hbase> compact_mob 't1', 'c1'
|
hbase> compact_mob 't1', 'c1'
|
||||||
|
@ -131,11 +153,17 @@ hbase> major_compact_mob 't1', 'c1'
|
||||||
hbase> major_compact_mob 't1'
|
hbase> major_compact_mob 't1'
|
||||||
----
|
----
|
||||||
|
|
||||||
These commands are also available via `Admin.compactMob` and `Admin.majorCompactMob` methods.
|
These commands are also available via `Admin.compactMob` and
|
||||||
|
`Admin.majorCompactMob` methods.
|
||||||
|
|
||||||
==== MOB Sweeper
|
==== MOB Sweeper
|
||||||
|
|
||||||
HBase MOB currently relies on a MapReduce job called the Sweeper tool for optimization. The Sweeper tool coalesces small MOB files or MOB files with many deletions or updates. A native MOB compaction tool is still in testing. To configure the Sweeper tool, set the following options:
|
HBase MOB relies on a MapReduce job called the Sweeper tool for
|
||||||
|
optimization. The Sweeper tool coalesces small MOB files or MOB files with many
|
||||||
|
deletions or updates. The Sweeper tool is not required if you use native MOB compaction, which
|
||||||
|
does not rely on MapReduce.
|
||||||
|
|
||||||
|
To configure the Sweeper tool, set the following options:
|
||||||
|
|
||||||
[source,xml]
|
[source,xml]
|
||||||
----
|
----
|
||||||
|
|