Depending on which compression codec is used, a short read of the
compressed bytes can cause catastrophic errors that confuse the WAL reader.
This problem can manifest when the reader is actively tailing the WAL for
replication. To avoid these issues when WAL value compression is enabled,
BoundedDelegatingInputStream should assume enough bytes are available to
supply a reader up to its bound. This behavior is valid per the contract
of available(), which provides an _estimate_ of available bytes, and
equivalent to IOUtils.readFully but without requiring an intermediate
buffer.
Added TestReplicationCompressedWAL and TestReplicationValueCompressedWAL.
Without the WALCellCodec change TestReplicationValueCompressedWAL will
fail.
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
We introduced EnvironmentEdgeManager as a way to inject alternate clocks
for unit tests. In order for this to be effective, all callers that would
otherwise use System.currentTimeMillis() must call
EnvironmentEdgeManager.currentTime() instead, except the implementers of
EnvironmentEdge.
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Undo asserts that LZ4 and SNAPPY fails if their native libs are NOT
loaded; as of hadoop 3.3.1, LZ4 and SNAPPY can work w/o native libs.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
WAL storage can be expensive, especially if the cell values
represented in the edits are large, consisting of blobs or
significant lengths of text. Such WALs might need to be kept around
for a fairly long time to satisfy replication constraints on a space
limited (or space-contended) filesystem.
We have a custom dictionary compression scheme for cell metadata that
is engaged when WAL compression is enabled in site configuration.
This is fine for that application, where we can expect the universe
of values and their lengths in the custom dictionaries to be
constrained. For arbitrary cell values it is better to use one of the
available compression codecs, which are suitable for arbitrary albeit
compressible data.
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Remove the deprecated fields, which can be removed in 3.0.0. Marked the
constant OLDEST_TIMESTAMP as InterfaceAudience.Private as it is only use
in classes, which are also marked as InterfaceAudience.Private.
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
This revert of a revert reapplies the PR; the original application was
missing the HBASE JIRA #; thats why it was reverted and then reapplied
w/ the JIRA # added.
This reverts commit 16fe1e95ec.
Adds "hbase.master.executor.merge.dispatch.threads" and defaults to 2.
Also adds additional logging that includes the number of split plans
and merge plans computed for each normalizer run.
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>