HBASE-26265 Update ref guide to mention the new store file tracker im… (#3942)
This commit is contained in:
parent
771e552cf4
commit
f16b7b1bfa
|
@ -0,0 +1,145 @@
|
|||
////
|
||||
/**
|
||||
*
|
||||
* Licensed to the Apache Software Foundation (ASF) under one
|
||||
* or more contributor license agreements. See the NOTICE file
|
||||
* distributed with this work for additional information
|
||||
* regarding copyright ownership. The ASF licenses this file
|
||||
* to you under the Apache License, Version 2.0 (the
|
||||
* "License"); you may not use this file except in compliance
|
||||
* with the License. You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
////
|
||||
|
||||
[[storefiletracking]]
|
||||
= Store File Tracking
|
||||
:doctype: book
|
||||
:numbered:
|
||||
:toc: left
|
||||
:icons: font
|
||||
:experimental:
|
||||
|
||||
== Overview
|
||||
|
||||
This feature introduces an abstraction layer to track store files still used/needed by store
|
||||
engines, allowing for plugging different approaches of identifying store
|
||||
files required by the given store.
|
||||
|
||||
Historically, HBase internals have relied on creating hfiles on temporary directories first, renaming
|
||||
those files to the actual store directory at operation commit time. That's a simple and convenient
|
||||
way to separate transient from already finalised files that are ready to serve client reads with data.
|
||||
This approach works well with strong consistent file systems, but with the popularity of less consistent
|
||||
file systems, mainly Object Store which can be used like file systems, dependency on atomic rename operations starts to introduce
|
||||
performance penalties. The Amazon S3 Object Store, in particular, has been the most affected deployment,
|
||||
due to its lack of atomic renames. The HBase community temporarily bypassed this problem by building a distributed locking layer called HBOSS,
|
||||
to guarantee atomicity of operations against S3.
|
||||
|
||||
With *Store File Tracking*, decision on where to originally create new hfiles and how to proceed upon
|
||||
commit is delegated to the specific Store File Tracking implementation.
|
||||
The implementation can be set at the HBase service leve in *hbase-site.xml* or at the
|
||||
Table or Column Family via the TableDescriptor configuration.
|
||||
|
||||
NOTE: When the store file tracking implementation is specified in *hbase_site.xml*, this configuration is also propagated into a tables configuration
|
||||
at table creation time. This is to avoid dangerous configuration mismatches between processes, which
|
||||
could potentially lead to data loss.
|
||||
|
||||
== Available Implementations
|
||||
|
||||
Store File Tracking initial version provides three builtin implementations:
|
||||
|
||||
* DEFAULT
|
||||
* FILE
|
||||
* MIGRATION
|
||||
|
||||
### DEFAULT
|
||||
|
||||
As per the name, this is the Store File Tracking implementation used by default when no explicit
|
||||
configuration has been defined. The DEFAULT tracker implements the standard approach using temporary
|
||||
directories and renames. This is how all previous (implicit) implementation that HBase used to track store files.
|
||||
|
||||
### FILE
|
||||
|
||||
A file tracker implementation that creates new files straight in the store directory, avoiding the
|
||||
need for rename operations. It keeps a list of committed hfiles in memory, backed by meta files, in
|
||||
each store directory. Whenever a new hfile is committed, the list of _tracked files_ in the given
|
||||
store is updated and a new meta file is written with this list contents, discarding the previous
|
||||
meta file now containing an out dated list.
|
||||
|
||||
### MIGRATION
|
||||
|
||||
A special implementation to be used when swapping between Store File Tracking implementations on
|
||||
pre-existing tables that already contain data, and therefore, files being tracked under an specific
|
||||
logic.
|
||||
|
||||
== Usage
|
||||
|
||||
For fresh deployments that don't yet contain any user data, *FILE* implementation can be just set as
|
||||
value for *hbase.store.file-tracker.impl* property in global *hbase-site.xml* configuration, prior
|
||||
to the first hbase start. Omitting this property sets the *DEFAULT* implementation.
|
||||
|
||||
For clusters with data that are upgraded to a version of HBase containing the store file tracking
|
||||
feature, the Store File Tracking implementation can only be changed with the *MIGRATION*
|
||||
implementation, so that the _new tracker_ can safely build its list of tracked files based on the
|
||||
list of the _current tracker_.
|
||||
|
||||
NOTE: MIGRATION tracker should NOT be set at global configuration. To use it, follow below section
|
||||
about setting Store File Tacking at Table or Column Family configuration.
|
||||
|
||||
|
||||
### Configuring for Table or Column Family
|
||||
|
||||
Setting Store File Tracking configuration globally may not always be possible or desired, for example,
|
||||
in the case of upgraded clusters with pre-existing user data.
|
||||
Store File Tracking can be set at Table or Column Family level configuration.
|
||||
For example, to specify *FILE* implementation in the table configuration at table creation time,
|
||||
the following should be applied:
|
||||
|
||||
----
|
||||
create 'my-table', 'f1', 'f2', {CONFIGURATION => {'hbase.store.file-tracker.impl' => 'FILE'}}
|
||||
----
|
||||
|
||||
To define *FILE* for an specific Column Family:
|
||||
|
||||
----
|
||||
create 'my-table', {NAME=> '1', CONFIGURATION => {'hbase.store.file-tracker.impl' => 'FILE'}}
|
||||
----
|
||||
|
||||
### Switching trackers at Table or Column Family
|
||||
|
||||
A very common scenario is to set Store File Tracking on pre-existing HBase deployments that have
|
||||
been upgraded to a version that supports this feature. To apply the FILE tracker, tables effectively
|
||||
need to be migrated from the DEFAULT tracker to the FILE tracker. As explained previously, such
|
||||
process requires the usage of the special MIGRATION tracker implementation, which can only be
|
||||
specified at table or Column Family level.
|
||||
|
||||
For example, to switch _tracker_ from *DEFAULT* to *FILE* in a table configuration:
|
||||
|
||||
----
|
||||
alter 'my-table', CONFIGURATION => {'hbase.store.file-tracker.impl' => 'MIGRATION',
|
||||
'hbase.store.file-tracker.migration.src.impl' => 'DEFAULT',
|
||||
'hbase.store.file-tracker.migration.dst.impl' => 'FILE'}
|
||||
----
|
||||
|
||||
To apply similar switch at column family level configuration:
|
||||
|
||||
----
|
||||
alter 'my-table', {NAME => 'f1', CONFIGURATION => {'hbase.store.file-tracker.impl' => 'MIGRATION',
|
||||
'hbase.store.file-tracker.migration.src.impl' => 'DEFAULT',
|
||||
'hbase.store.file-tracker.migration.dst.impl' => 'FILE'}}
|
||||
----
|
||||
|
||||
Once all table regions have been onlined again, don't forget to disable MIGRATION, by now setting
|
||||
*hbase.store.file-tracker.migration.dst.impl* value as the *hbase.store.file-tracker.impl*. In the above
|
||||
example, that would be as follows:
|
||||
|
||||
----
|
||||
alter 'my-table', CONFIGURATION => {'hbase.store.file-tracker.impl' => 'FILE'}
|
||||
----
|
|
@ -89,6 +89,7 @@ include::_chapters/zookeeper.adoc[]
|
|||
include::_chapters/community.adoc[]
|
||||
include::_chapters/hbtop.adoc[]
|
||||
include::_chapters/tracing.adoc[]
|
||||
include::_chapters/store_file_tracking.adoc[]
|
||||
|
||||
= Appendix
|
||||
|
||||
|
|
Loading…
Reference in New Issue