HBASE-23198 Update ref guide for distributed MOB compaction.
* add design doc for original MOB changes as they were when HBase 2.0 came out
* add design doc for distributed MOB compaction
* remove configuration and commands no longer relevant after distributed MOB compaction
* add in discussion of configuration options
* allow asciimath formulas since we use them in the discussion

closes #1232

Signed-off-by: Wellington Ramos Chevreuil <wchevreuil@apache.org>
parent 9bd39786df
commit aff0ff5d97
Binary file not shown.
Binary file not shown.
|
@ -36,22 +36,15 @@ read and write paths are optimized for values smaller than 100KB in size. When
|
|||
HBase deals with large numbers of objects over this threshold, referred to here
|
||||
as medium objects, or MOBs, performance is degraded due to write amplification
|
||||
caused by splits and compactions. When using MOBs, ideally your objects will be between
|
||||
100KB and 10MB (see the <<faq>>). HBase ***FIX_VERSION_NUMBER*** adds support
|
||||
for better managing large numbers of MOBs while maintaining performance,
|
||||
consistency, and low operational overhead. MOB support is provided by the work
|
||||
done in link:https://issues.apache.org/jira/browse/HBASE-11339[HBASE-11339]. To
|
||||
take advantage of MOB, you need to use <<hfilev3,HFile version 3>>. Optionally,
|
||||
100KB and 10MB (see the <<faq>>). HBase 2 added special internal handling of MOBs
|
||||
to maintain performance, consistency, and low operational overhead. MOB support is
|
||||
provided by the work done in link:https://issues.apache.org/jira/browse/HBASE-11339[HBASE-11339].
|
||||
To take advantage of MOB, you need to use <<hfilev3,HFile version 3>>. Optionally,
|
||||
configure the MOB file reader's cache settings for each RegionServer (see
|
||||
<<mob.cache.configure>>), then configure specific columns to hold MOB data.
|
||||
Client code does not need to change to take advantage of HBase MOB support. The
|
||||
feature is transparent to the client.
|
||||
|
||||
MOB compaction
|
||||
|
||||
MOB data is flushed into MOB files after MemStore flush. There will be lots of MOB files
|
||||
after some time. To reduce MOB file count, there is a periodic task which compacts
|
||||
small MOB files into a large one (MOB compaction).
|
||||
|
||||
=== Configuring Columns for MOB
|
||||
|
||||
You can configure columns to support MOB during table creation or alteration,
|
||||
|
@ -79,41 +72,6 @@ hcd.setMobThreshold(102400L);
|
|||
----
|
||||
====
|
||||
|
||||
=== Configure MOB Compaction Policy
|
||||
|
||||
By default, MOB files for one specific day are compacted into one large MOB file.
|
||||
To reduce MOB file count more, there are other MOB Compaction policies supported.
|
||||
|
||||
daily policy - compact MOB Files for one day into one large MOB file (default policy)
|
||||
weekly policy - compact MOB Files for one week into one large MOB file
|
||||
monthly policy - compact MOB Files for one month into one large MOB file
|
||||
|
||||
.Configure MOB compaction policy Using HBase Shell
|
||||
----
|
||||
hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400, MOB_COMPACT_PARTITION_POLICY => 'daily'}
|
||||
hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400, MOB_COMPACT_PARTITION_POLICY => 'weekly'}
|
||||
hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400, MOB_COMPACT_PARTITION_POLICY => 'monthly'}
|
||||
|
||||
hbase> alter 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400, MOB_COMPACT_PARTITION_POLICY => 'daily'}
|
||||
hbase> alter 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400, MOB_COMPACT_PARTITION_POLICY => 'weekly'}
|
||||
hbase> alter 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400, MOB_COMPACT_PARTITION_POLICY => 'monthly'}
|
||||
----
|
||||
|
||||
=== Configure MOB Compaction mergeable threshold
|
||||
|
||||
If the size of a mob file is less than this value, it's regarded as a small file and needs to
|
||||
be merged in mob compaction. The default value is 1280MB.
|
||||
|
||||
====
|
||||
[source,xml]
|
||||
----
|
||||
<property>
|
||||
<name>hbase.mob.compaction.mergeable.threshold</name>
|
||||
<value>10000000000</value>
|
||||
</property>
|
||||
----
|
||||
====
|
||||
|
||||
=== Testing MOB
|
||||
|
||||
The utility `org.apache.hadoop.hbase.IntegrationTestIngestWithMOB` is provided to assist with testing
|
||||
|
@ -133,9 +91,219 @@ $ sudo -u hbase hbase org.apache.hadoop.hbase.IntegrationTestIngestWithMOB \
|
|||
* `*maxMobDataSize*` is the maximum value for the size of MOB data.
|
||||
The default is 5 kB, expressed in bytes.
|
||||
|
||||
=== MOB architecture
|
||||
|
||||
This section is derived from information found in
|
||||
link:https://issues.apache.org/jira/browse/HBASE-11339[HBASE-11339], which covered the initial GA
|
||||
implementation of MOB in HBase and
|
||||
link:https://issues.apache.org/jira/browse/HBASE-22749[HBASE-22749], which improved things by
|
||||
parallelizing MOB maintenance across the RegionServers. For more information see
|
||||
the last version of the design doc created during the initial work,
|
||||
"link:https://github.com/apache/hbase/blob/master/dev-support/design-docs/HBASE-11339%20MOB%20GA%20design.pdf[HBASE-11339 MOB GA design.pdf]",
|
||||
and the design doc for the distributed mob compaction feature,
|
||||
"link:https://github.com/apache/hbase/blob/master/dev-support/design-docs/HBASE-22749%20MOB%20distributed%20compaction.pdf[HBASE-22749 MOB distributed compaction.pdf]".
|
||||
|
||||
|
||||
==== Overview
|
||||
|
||||
The MOB feature reduces the overall IO load for configured column families by storing values that
|
||||
are larger than the configured threshold outside of the normal regions to avoid splits, merges, and
|
||||
most importantly normal compactions.
|
||||
|
||||
When a cell is first written to a region it is stored in the WAL and memstore regardless of value
|
||||
size. When memstores from a column family configured to use MOB are eventually flushed two hfiles
|
||||
are written simultaneously. Cells with a value smaller than the threshold size are written to a
|
||||
normal region hfile. Cells with a value larger than the threshold are written into a special MOB
|
||||
hfile and also have a MOB reference cell written into the normal region HFile. As the Region Server
|
||||
flushes a MOB enabled memstore and closes a given normal region HFile it appends metadata that lists
|
||||
each of the special MOB hfiles referenced by the cells within.
|
||||
|
||||
MOB reference cells have the same key as the cell they are based on. The value of the reference cell
|
||||
is made up of two pieces of metadata: the size of the actual value and the MOB hfile that contains
|
||||
the original cell. In addition to any tags originally written to HBase, the reference cell prepends
|
||||
two additional tags. The first is a marker tag that says the cell is a MOB reference. This can be
|
||||
used later to scan specifically just for reference cells. The second stores the namespace and table
|
||||
at the time the MOB hfile is written out. This tag is used to optimize how the MOB system finds
|
||||
the underlying value in MOB hfiles after a series of HBase snapshot operations (ref HBASE-12332).
|
||||
Note that tags are only available within HBase servers and by default are not sent over RPCs.
|
||||
|
||||
All MOB hfiles for a given table are managed within a logical region that does not directly serve
|
||||
requests. When these MOB hfiles are created from a flush or MOB compaction they are placed in a
|
||||
dedicated mob data area under the hbase root directory specific to the namespace, table, mob
|
||||
logical region, and column family. In general that means a path structured like:
|
||||
|
||||
----
|
||||
%HBase Root Dir%/mobdir/data/%namespace%/%table%/%logical region%/%column family%/
|
||||
----
|
||||
|
||||
With default configs, for an example table named 'some_table' in the
default namespace with a MOB enabled column family named 'foo', this HDFS directory would be
|
||||
|
||||
----
|
||||
/hbase/mobdir/data/default/some_table/372c1b27e3dc0b56c3a031926e5efbe9/foo/
|
||||
----
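
As a small illustration of the layout described above, the following sketch assembles the directory from its parts. It is not an HBase API; the class and method names are invented for this example, and the root directory `/hbase` and the logical region name are simply copied from the example path.

```java
// Hypothetical helper for illustration only; HBase computes these paths internally.
final class MobPathExample {

  /** Joins the pieces of the MOB storage layout into a directory path ending in '/'. */
  static String mobFamilyDir(String rootDir, String namespace, String table,
      String logicalRegion, String family) {
    return String.join("/", rootDir, "mobdir", "data", namespace, table, logicalRegion, family)
        + "/";
  }

  public static void main(String[] args) {
    // Values taken from the example in the text; the region name is not computed here.
    System.out.println(mobFamilyDir("/hbase", "default", "some_table",
        "372c1b27e3dc0b56c3a031926e5efbe9", "foo"));
  }
}
```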
|
||||
|
||||
These MOB hfiles are maintained by special chores in the HBase Master and across the individual
|
||||
Region Servers. Specifically those chores take care of enforcing TTLs and compacting them. Note that
|
||||
this compaction is primarily a matter of controlling the total number of files in HDFS because our
|
||||
operational assumption for MOB data is that it is seldom updated or deleted.
|
||||
|
||||
When a given MOB hfile is no longer needed as a result of our compaction process then a chore in
|
||||
the Master will take care of moving it to the archive just
|
||||
like any normal hfile. Because the table's mob region is independent of all the normal regions it
|
||||
can coexist with them in the regular archive storage area:
|
||||
|
||||
----
|
||||
/hbase/archive/data/default/some_table/372c1b27e3dc0b56c3a031926e5efbe9/foo/
|
||||
----
|
||||
|
||||
The same hfile cleaning chores that take care of eventually deleting unneeded archived files from
|
||||
normal regions thus also will take care of these MOB hfiles. As such, if there is a snapshot of a
|
||||
MOB enabled table then the cleaning system will make sure those MOB files stick around in the
|
||||
archive area as long as they are needed by a snapshot or a clone of a snapshot.
|
||||
|
||||
==== MOB compaction
|
||||
|
||||
Each time the memstore for a MOB enabled column family performs a flush HBase will write values over
|
||||
the MOB threshold into MOB specific hfiles. When normal region compaction occurs the Region Server
|
||||
rewrites the normal data files while maintaining references to these MOB files without rewriting
|
||||
them. Normal client lookups for MOB values transparently will receive the original values because
|
||||
the Region Server internals take care of using the reference data to then pull the value out of a
|
||||
specific MOB file. This indirection means that building up a large number of MOB hfiles doesn't
|
||||
impact the overall time to retrieve any specific MOB cell. Thus, we need not perform compactions of
|
||||
the MOB hfiles nearly as often as normal hfiles. As a result, HBase saves IO by not rewriting MOB
|
||||
hfiles as a part of the periodic compactions a Region Server does on its own.
|
||||
|
||||
However, if deletes and updates of MOB cells are frequent then this indirection will begin to waste
|
||||
space. The only way to stop using the space of a particular MOB hfile is to ensure no cells still
|
||||
hold references to it. To do that we need to ensure we have written the current values into a new
|
||||
MOB hfile. If our backing filesystem has a limitation on the number of files that can be present, as
|
||||
HDFS does, then even if we do not have deletes or updates of MOB cells eventually there will be a
|
||||
sufficient number of MOB hfiles that we will need to coalesce them.
|
||||
|
||||
Periodically a chore in the master coordinates having the region servers
|
||||
perform a special major compaction that also handles rewriting new MOB files. Like all compactions
|
||||
the Region Server will create updated hfiles that hold both the cells that are smaller than the MOB
|
||||
threshold and cells that hold references to the newly rewritten MOB file. Because this rewriting has
|
||||
the advantage of looking across all active cells for the region our several small MOB files should
|
||||
end up as a single MOB file per region. The chore defaults to running weekly and can be
|
||||
configured by setting `hbase.mob.compaction.chore.period` to the desired period in seconds.
|
||||
|
||||
====
|
||||
[source,xml]
|
||||
----
|
||||
<property>
|
||||
<name>hbase.mob.compaction.chore.period</name>
|
||||
<value>2592000</value>
|
||||
<description>Example of changing the chore period from a week to a month.</description>
|
||||
</property>
|
||||
----
|
||||
====
|
||||
|
||||
By default, the periodic MOB compaction coordination chore will attempt to keep every region
|
||||
busy doing compactions in parallel in order to maximize the amount of work done on the cluster.
|
||||
If you need to tune the amount of IO this compaction generates on the underlying filesystem, you
|
||||
can control how many concurrent region-level compaction requests are allowed by setting
|
||||
`hbase.mob.major.compaction.region.batch.size` to an integer number greater than zero. If you set
|
||||
the configuration to 0 then you will get the default behavior of attempting to do all regions in
|
||||
parallel.
|
||||
|
||||
====
|
||||
[source,xml]
|
||||
----
|
||||
<property>
|
||||
<name>hbase.mob.major.compaction.region.batch.size</name>
|
||||
<value>1</value>
|
||||
<description>Example of switching from "as parallel as possible" to "serially"</description>
|
||||
</property>
|
||||
----
|
||||
====
|
||||
|
||||
==== MOB file archiving
|
||||
|
||||
Eventually we will have MOB hfiles that are no longer needed. Either clients will overwrite the
|
||||
value or a MOB-rewriting compaction will store a reference to a newer larger MOB hfile. Because any
|
||||
given MOB cell could have originally been written either in the current region or in a parent region
|
||||
that existed at some prior point in time, individual Region Servers do not decide when it is time
|
||||
to archive MOB hfiles. Instead a periodic chore in the Master evaluates MOB hfiles for archiving.
|
||||
|
||||
A MOB HFile will be subject to archiving under any of the following conditions:
|
||||
|
||||
* Any MOB HFile older than the column family's TTL
|
||||
* Any MOB HFile older than a "too recent" threshold with no references to it from the regular hfiles
|
||||
for all regions in a column family
|
||||
|
||||
To determine if a MOB HFile meets the second criterion the chore extracts metadata from the regular
|
||||
HFiles for each MOB enabled column family for a given table. That metadata enumerates the complete
|
||||
set of MOB HFiles needed to satisfy the references stored in the normal HFile area.
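
The two archiving conditions can be modelled roughly as follows. This is a simplified sketch, not the code of the Master's cleaner chore: the class, the method, and the use of milliseconds for every duration are assumptions made for the example, and the set of referenced files stands in for the metadata gathered from the regular HFiles.

```java
import java.util.Set;

// Illustrative model only; names and units are assumptions, not HBase internals.
final class MobArchiveCheck {

  /**
   * @param mobFileName        name of the MOB hfile being evaluated
   * @param fileAgeMs          age of that file in milliseconds
   * @param ttlMs              column family TTL in milliseconds
   * @param minAgeMs           the "too recent" window in milliseconds
   * @param referencedMobFiles MOB hfile names listed in the metadata of the regular hfiles
   */
  static boolean eligibleForArchive(String mobFileName, long fileAgeMs, long ttlMs,
      long minAgeMs, Set<String> referencedMobFiles) {
    // Condition 1: the file is older than the column family's TTL.
    if (fileAgeMs > ttlMs) {
      return true;
    }
    // Condition 2: past the "too recent" window and no regular hfile still references it.
    return fileAgeMs > minAgeMs && !referencedMobFiles.contains(mobFileName);
  }

  public static void main(String[] args) {
    Set<String> referenced = Set.of("mobfile-a");
    long hour = 3_600_000L;
    long week = 7 * 24 * hour;
    System.out.println(eligibleForArchive("mobfile-b", 2 * hour, week, hour, referenced));
  }
}
```

Note that in this model a file younger than the window is never archived on the second condition, even when nothing references it yet; that mirrors the safety margin the text describes.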
|
||||
|
||||
The period of the cleaner chore can be configured by setting `hbase.master.mob.cleaner.period` to a
|
||||
positive integer number of seconds. It defaults to running daily. You should not need to tune it
|
||||
unless you have a very aggressive TTL or a very high rate of MOB updates with a correspondingly
|
||||
high rate of non-MOB compactions.
|
||||
|
||||
=== MOB Optimization Tasks
|
||||
|
||||
==== Further limiting write amplification
|
||||
|
||||
If your MOB workload has few to no updates or deletes then you can opt-in to MOB compactions that
|
||||
optimize for limiting the amount of write amplification. It achieves this by setting a
|
||||
size threshold to ignore MOB files during the compaction process. When a given region goes
|
||||
through MOB compaction it will evaluate the size of the MOB file that currently holds the actual
|
||||
value and skip rewriting the value if that file is over threshold.
|
||||
|
||||
The bound of write amplification in this mode can be approximated as
|
||||
stem:["Write Amplification" = log_K(M/S)] where *K* is the number of files in compaction
|
||||
selection, *M* is the configurable threshold for MOB files size, and *S* is the minimum size of
|
||||
memstore flushes that create MOB files in the first place. For example given 5 files picked up per
|
||||
compaction, a threshold of 1 GB, and a flush size of 10MB the write amplification will be
|
||||
stem:[log_5((1GB)/(10MB)) = log_5(100) = 2.86].
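
To try other values of *K*, *M*, and *S*, the approximation can be evaluated with a few lines of Java. This is only a sketch; the class and method are hypothetical and not part of HBase, and sizes are given in decimal units (1 GB = 1000 MB) to match the worked figure.

```java
// Hypothetical helper, not an HBase API: evaluates log_K(M/S).
final class MobWriteAmplification {

  /**
   * @param filesPerCompaction K, the number of files in a compaction selection (at least 2)
   * @param maxMobFileBytes    M, the threshold for MOB file size, in bytes
   * @param minFlushBytes      S, the minimum size of flushes that create MOB files, in bytes
   */
  static double bound(int filesPerCompaction, double maxMobFileBytes, double minFlushBytes) {
    if (filesPerCompaction < 2 || maxMobFileBytes <= 0 || minFlushBytes <= 0) {
      throw new IllegalArgumentException("K must be at least 2 and sizes must be positive");
    }
    // Change of base: log_K(x) = ln(x) / ln(K)
    return Math.log(maxMobFileBytes / minFlushBytes) / Math.log(filesPerCompaction);
  }

  public static void main(String[] args) {
    // 5 files per compaction, 1 GB threshold, 10 MB flushes, as in the example.
    System.out.println(bound(5, 1e9, 10e6));
  }
}
```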
|
||||
|
||||
If we are using an underlying filesystem with a limitation on the number of files, such as HDFS,
|
||||
and we know our expected data set size we can choose our maximum file size in order to approach
|
||||
this limit but stay within it in order to minimize write amplification. For example, if we expect to
|
||||
store a petabyte and we have a conservative limitation of a million files in our HDFS instance, then
|
||||
stem:[(1PB)/(1M) = 1GB] gives us a target limitation of a gigabyte per MOB file.
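
The same arithmetic as a small Java sketch, for plugging in your own data set size and file budget. The class and method names are invented for this example, and decimal units are assumed (1 PB = 10^15 bytes).

```java
// Hypothetical helper, not an HBase API.
final class MobFileSizeTarget {

  /** Smallest per-file size in bytes that fits the data set within the file count budget. */
  static long targetBytesPerFile(long expectedDataBytes, long maxFileCount) {
    if (expectedDataBytes <= 0 || maxFileCount <= 0) {
      throw new IllegalArgumentException("both arguments must be positive");
    }
    // Ceiling division, so maxFileCount files of this size always cover the data set.
    return (expectedDataBytes + maxFileCount - 1) / maxFileCount;
  }

  public static void main(String[] args) {
    long onePetabyte = 1_000_000_000_000_000L; // 10^15 bytes
    long fileBudget = 1_000_000L;              // one million files
    System.out.println(targetBytesPerFile(onePetabyte, fileBudget) + " bytes per MOB file");
  }
}
```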
|
||||
|
||||
To opt-in to this compaction mode you must set `hbase.mob.compaction.type` to `optimized`. The
|
||||
default MOB size threshold in this mode is set to 1GB. It can be changed by setting
|
||||
`hbase.mob.compactions.max.file.size` to a positive integer number of bytes.
|
||||
|
||||
|
||||
====
|
||||
[source,xml]
|
||||
----
|
||||
<property>
|
||||
<name>hbase.mob.compaction.type</name>
|
||||
<value>optimized</value>
|
||||
<description>opt-in to write amplification optimized mob compaction.</description>
|
||||
</property>
|
||||
<property>
|
||||
<name>hbase.mob.compactions.max.file.size</name>
|
||||
<value>10737418240</value>
|
||||
<description>Example of tuning the max mob file size to 10GB</description>
|
||||
</property>
|
||||
----
|
||||
====
|
||||
|
||||
Additionally, when operating in this mode the compaction process will seek to avoid writing MOB
|
||||
files that are over the max file threshold. As it is writing out additional MOB values into a MOB
|
||||
hfile it will check to see if the additional data causes the hfile to be over the max file size.
|
||||
When the hfile of MOB values reaches the limit, the MOB hfile is committed to the MOB storage area and
|
||||
a new one is created. The hfile with reference cells will track the complete set of MOB hfiles it
|
||||
needs in its metadata.
|
||||
|
||||
.Be mindful of total time to complete compaction of a region
|
||||
[WARNING]
|
||||
====
|
||||
When using the write amplification optimized compaction mode you need to watch for the maximum time
|
||||
to compact a single region. If it nears an hour you should read through the troubleshooting section
|
||||
below <<mob.troubleshoot.cleaner.toonew>>. Failure to make the adjustments discussed there could
|
||||
lead to data loss.
|
||||
====
|
||||
|
||||
[[mob.cache.configure]]
|
||||
=== Configuring the MOB Cache
|
||||
==== Configuring the MOB Cache
|
||||
|
||||
|
||||
Because there can be a large number of MOB files at any time, as compared to the number of HFiles,
|
||||
|
@ -181,85 +349,61 @@ suit your environment, and restart or rolling restart the RegionServer.
|
|||
----
|
||||
====
|
||||
|
||||
=== MOB Optimization Tasks
|
||||
|
||||
==== Manually Compacting MOB Files
|
||||
|
||||
To manually compact MOB files, rather than waiting for the
|
||||
<<mob.cache.configure,configuration>> to trigger compaction, use the
|
||||
`compact` or `major_compact` HBase shell commands. These commands
|
||||
periodic chore to trigger compaction, use the
|
||||
`major_compact` HBase shell commands. These commands
|
||||
require the first argument to be the table name, and take a column
|
||||
family as the second argument. and take a compaction type as the third argument.
|
||||
family as the second argument. If used with a column family that includes MOB data, then
|
||||
these operator requests will result in the MOB data being compacted.
|
||||
|
||||
----
|
||||
hbase> compact 't1', 'c1', 'MOB'
|
||||
hbase> major_compact 't1', 'c1', 'MOB'
|
||||
hbase> major_compact 't1'
|
||||
hbase> major_compact 't2', 'c1'
|
||||
----
|
||||
|
||||
These commands are also available via `Admin.compact` and
|
||||
`Admin.majorCompact` methods.
|
||||
|
||||
=== MOB architecture
|
||||
|
||||
This section is derived from information found in
|
||||
link:https://issues.apache.org/jira/browse/HBASE-11339[HBASE-11339]. For more information see
|
||||
the attachment on that issue
|
||||
"link:https://issues.apache.org/jira/secure/attachment/12724468/HBase%20MOB%20Design-v5.pdf[HBase MOB Design-v5.pdf]".
|
||||
|
||||
==== Overview
|
||||
The MOB feature reduces the overall IO load for configured column families by storing values that
|
||||
are larger than the configured threshold outside of the normal regions to avoid splits, merges, and
|
||||
most importantly normal compactions.
|
||||
|
||||
When a cell is first written to a region it is stored in the WAL and memstore regardless of value
|
||||
size. When memstores from a column family configured to use MOB are eventually flushed two hfiles
|
||||
are written simultaneously. Cells with a value smaller than the threshold size are written to a
|
||||
normal region hfile. Cells with a value larger than the threshold are written into a special MOB
|
||||
hfile and also have a MOB reference cell written into the normal region HFile.
|
||||
|
||||
MOB reference cells have the same key as the cell they are based on. The value of the reference cell
|
||||
is made up of two pieces of metadata: the size of the actual value and the MOB hfile that contains
|
||||
the original cell. In addition to any tags originally written to HBase, the reference cell prepends
|
||||
two additional tags. The first is a marker tag that says the cell is a MOB reference. This can be
|
||||
used later to scan specifically just for reference cells. The second stores the namespace and table
|
||||
at the time the MOB hfile is written out. This tag is used to optimize how the MOB system finds
|
||||
the underlying value in MOB hfiles after a series of HBase snapshot operations (ref HBASE-12332).
|
||||
Note that tags are only available within HBase servers and by default are not sent over RPCs.
|
||||
|
||||
All MOB hfiles for a given table are managed within a logical region that does not directly serve
|
||||
requests. When these MOB hfiles are created from a flush or MOB compaction they are placed in a
|
||||
dedicated mob data area under the hbase root directory specific to the namespace, table, mob
|
||||
logical region, and column family. In general that means a path structured like:
|
||||
|
||||
----
|
||||
%HBase Root Dir%/mobdir/data/%namespace%/%table%/%logical region%/%column family%/
|
||||
----
|
||||
|
||||
With default configs, an example table named 'some_table' in the
|
||||
default namespace with a MOB enabled column family named 'foo' this HDFS directory would be
|
||||
|
||||
----
|
||||
/hbase/mobdir/data/default/some_table/372c1b27e3dc0b56c3a031926e5efbe9/foo/
|
||||
----
|
||||
|
||||
These MOB hfiles are maintained by special chores in the HBase Master rather than by any individual
|
||||
Region Server. Specifically those chores take care of enforcing TTLs and compacting them. Note that
|
||||
this compaction is primarily a matter of controlling the total number of files in HDFS because our
|
||||
operational assumptions for MOB data is that it will seldom update or delete.
|
||||
|
||||
When a given MOB hfile is no longer needed as a result of our compaction process it is archived just
|
||||
like any normal hfile. Because the table's mob region is independent of all the normal regions it
|
||||
can coexist with them in the regular archive storage area:
|
||||
|
||||
----
|
||||
/hbase/archive/data/default/some_table/372c1b27e3dc0b56c3a031926e5efbe9/foo/
|
||||
----
|
||||
|
||||
The same hfile cleaning chores that take care of eventually deleting unneeded archived files from
|
||||
normal regions thus also will take care of these MOB hfiles.
|
||||
This same request can be made via the `Admin.majorCompact` Java API.
|
||||
|
||||
=== MOB Troubleshooting
|
||||
|
||||
[[mob.troubleshoot.cleaner.toonew]]
|
||||
==== Adjusting the MOB cleaner's tolerance for new hfiles
|
||||
|
||||
The MOB cleaner chore ignores all MOB hfiles that were created more recently than an hour prior to
|
||||
the start of the chore to ensure we don't miss the reference metadata from the corresponding regular
|
||||
hfile. Without this safety check it would be possible for the cleaner chore to see a MOB hfile for
|
||||
an in progress flush or compaction and prematurely archive the MOB data. This default buffer should
|
||||
be sufficient for normal use.
|
||||
|
||||
You will need to adjust the tolerance if you use write amplification optimized MOB compaction and
|
||||
the combination of your underlying filesystem performance and data shape is such that it could take
|
||||
more than an hour to complete major compaction of a single region. For example, if your MOB data is
|
||||
distributed such that your largest region adds 80GB of MOB data between compactions that include
|
||||
rewriting MOB data and your HDFS cluster is only capable of writing 20MB/s for a single file then
|
||||
when performing the optimized compaction the Region Server will take about a minute to write the
|
||||
first 1GB MOB hfile and then another hour and seven minutes to write the remaining seventy-nine 1GB
|
||||
MOB hfiles before finally committing the new reference hfile at the end of the compaction. Given
|
||||
this example, you would need a larger tolerance window.
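
The estimate in this example can be reproduced with a short calculation. The sketch below is not an HBase tool; the class and method names are invented, the write rate is assumed to be constant, and binary units are used (80 GB = 80 * 2^30 bytes, 20 MB/s = 20 * 2^20 bytes per second).

```java
// Hypothetical helper, not an HBase API: estimates how long a region's MOB rewrite takes.
final class MobCompactionWindow {

  /** Seconds needed to write mobBytes at a sustained rate of bytesPerSecond, rounded up. */
  static long estimatedSeconds(long mobBytes, long bytesPerSecond) {
    if (mobBytes < 0 || bytesPerSecond <= 0) {
      throw new IllegalArgumentException("size must be non-negative and rate positive");
    }
    return (mobBytes + bytesPerSecond - 1) / bytesPerSecond;
  }

  /** True when the estimated write time is longer than the cleaner's window in milliseconds. */
  static boolean exceedsWindow(long mobBytes, long bytesPerSecond, long windowMs) {
    return estimatedSeconds(mobBytes, bytesPerSecond) * 1000L > windowMs;
  }

  public static void main(String[] args) {
    long mobBytes = 80L << 30;   // 80 GB of MOB data in the largest region
    long rate = 20L << 20;       // 20 MB/s for a single file
    long oneHourMs = 3_600_000L; // the one hour window described above
    System.out.println(estimatedSeconds(mobBytes, rate) + " seconds");
    System.out.println("exceeds one hour window: " + exceedsWindow(mobBytes, rate, oneHourMs));
  }
}
```

With these inputs the estimate is a little over 68 minutes, which is why the one hour window is too short for this workload.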
|
||||
|
||||
You will also need to adjust the tolerance if Region Server flush operations take longer than an
|
||||
hour for the two HDFS move operations needed to commit both the MOB hfile and the normal hfile that
|
||||
references it. Such a delay should not happen with a normally configured and healthy HDFS and HBase.
|
||||
|
||||
The cleaner's window for "too recent" is controlled by setting `hbase.mob.min.age.archive` to a
|
||||
positive integer number of milliseconds.
|
||||
|
||||
====
|
||||
[source,xml]
|
||||
----
|
||||
<property>
|
||||
<name>hbase.mob.min.age.archive</name>
|
||||
<value>86400000</value>
|
||||
<description>Example of tuning the cleaner to only archive files older than a day.</description>
|
||||
</property>
|
||||
----
|
||||
====
|
||||
|
||||
==== Retrieving MOB metadata through the HBase Shell
|
||||
|
||||
While working on troubleshooting failures in the MOB system you can retrieve some of the internal
|
||||
|
@ -468,3 +612,64 @@ $ hdfs dfs -count /hbase/mobdir/data/default/some_table
|
|||
+
|
||||
This data is spurious and may be reclaimed. You should sideline it, verify your application’s view
|
||||
of the table, and then delete it.
|
||||
|
||||
=== MOB Upgrade Considerations
|
||||
|
||||
Generally, data stored using the MOB feature should transparently continue to work correctly across
|
||||
HBase upgrades.
|
||||
|
||||
==== Upgrading to a version with the "distributed MOB compaction" feature
|
||||
|
||||
Prior to the work in HBASE-22749, "Distributed MOB compactions", HBase had the Master coordinate all
|
||||
compaction maintenance of the MOB hfiles. Centralizing management of the MOB data allowed for space
|
||||
optimizations but safely coordinating that management with Region Servers resulted in edge cases that
|
||||
caused data loss (ref link:https://issues.apache.org/jira/browse/HBASE-22075[HBASE-22075]).
|
||||
|
||||
Users of the MOB feature upgrading to a version of HBase that includes HBASE-22749 should be aware
|
||||
of the following changes:
|
||||
|
||||
* The MOB system no longer allows setting "MOB Compaction Policies"
|
||||
* The MOB system no longer attempts to group MOB values by the date of the original cell's timestamp
|
||||
according to said compaction policies, daily or otherwise
|
||||
* The MOB system no longer needs to track individual cell deletes through the use of special
|
||||
files in the MOB storage area with the suffix `_del`. After upgrading you should sideline these
|
||||
files.
|
||||
* Under default configuration the MOB system should take much less time to perform a compaction of
|
||||
MOB stored values. This is a direct consequence of the fact that HBase will place a much larger
|
||||
load on the underlying filesystem when doing compactions of MOB stored values; the additional load
|
||||
should be a multiple on the order of the number of region servers. I.e. for a cluster
|
||||
with three region servers and two masters the default configuration should have HBase put three
|
||||
times the load on HDFS during major compactions that rewrite MOB data when compared to Master
|
||||
handled MOB compaction; it should also be approximately three times as fast.
|
||||
* When the MOB system detects that a table has hfiles with references to MOB data but the reference
|
||||
hfiles do not yet have the needed file level metadata (i.e. from use of the MOB feature prior to
|
||||
HBASE-22749) then it will refuse to archive _any_ MOB hfiles from that table. The normal course of
|
||||
periodic compactions done by Region Servers will update existing hfiles with MOB references, but
|
||||
until a given table has been through the needed compactions operators should expect to see an
|
||||
increased amount of storage used by the MOB feature.
|
||||
* Performing a compaction with type "MOB" no longer has special handling to compact specifically the
|
||||
MOB hfiles. Instead it will issue a warning and do a compaction of the table. For example using
|
||||
the HBase shell as follows will result in a warning in the Master logs followed by a major
|
||||
compaction of the 'example' table in its entirety or for the 'big' column family respectively.
|
||||
+
|
||||
----
|
||||
hbase> major_compact 'example', nil, 'MOB'
|
||||
hbase> major_compact 'example', 'big', 'MOB'
|
||||
----
|
||||
+
|
||||
The same is true for directly using the Java API for
|
||||
`admin.majorCompact(TableName.valueOf("example"), CompactType.MOB)`.
|
||||
* Similarly, manually performing a major compaction on a table or region will also handle compacting
|
||||
the MOB stored values for that table or region respectively.
|
||||
|
||||
The following configuration setting has been deprecated and replaced:
|
||||
|
||||
* `hbase.master.mob.ttl.cleaner.period` has been replaced with `hbase.master.mob.cleaner.period`
|
||||
|
||||
The following configuration settings are no longer used:
|
||||
|
||||
* `hbase.mob.compaction.mergeable.threshold`
|
||||
* `hbase.mob.delfile.max.count`
|
||||
* `hbase.mob.compaction.batch.size`
|
||||
* `hbase.mob.compactor.class`
|
||||
* `hbase.mob.compaction.threads.max`
|
||||
|
|
|
@ -38,6 +38,7 @@
|
|||
:experimental:
|
||||
:source-language: java
|
||||
:leveloffset: 0
|
||||
:stem:
|
||||
|
||||
// Logo for HTML -- doesn't render in PDF
|
||||
ifdef::backend-html5[]
|
||||
|
|