Depending on which compression codec is used, a short read of the
compressed bytes can cause catastrophic errors that confuse the WAL reader.
This problem can manifest when the reader is actively tailing the WAL for
replication. To avoid these issues when WAL value compression is enabled,
BoundedDelegatingInputStream should assume enough bytes are available to
supply a reader up to its bound. This behavior is valid per the contract
of available(), which provides an _estimate_ of available bytes, and
equivalent to IOUtils.readFully but without requiring an intermediate
buffer.
Added TestReplicationCompressedWAL and TestReplicationValueCompressedWAL.
Without the WALCellCodec change TestReplicationValueCompressedWAL will
fail.
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
We introduced EnvironmentEdgeManager as a way to inject alternate clocks
for unit tests. In order for this to be effective, all callers that would
otherwise use System.currentTimeMillis() must call
EnvironmentEdgeManager.currentTime() instead, except the implementers of
EnvironmentEdge.
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Undo asserts that LZ4 and SNAPPY fails if their native libs are NOT
loaded; as of hadoop 3.3.1, LZ4 and SNAPPY can work w/o native libs.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
WAL storage can be expensive, especially if the cell values
represented in the edits are large, consisting of blobs or
significant lengths of text. Such WALs might need to be kept around
for a fairly long time to satisfy replication constraints on a space
limited (or space-contended) filesystem.
We have a custom dictionary compression scheme for cell metadata that
is engaged when WAL compression is enabled in site configuration.
This is fine for that application, where we can expect the universe
of values and their lengths in the custom dictionaries to be
constrained. For arbitrary cell values it is better to use one of the
available compression codecs, which are suitable for arbitrary albeit
compressible data.
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Remove the deprecated fields, which can be removed in 3.0.0. Marked the
constant OLDEST_TIMESTAMP as InterfaceAudience.Private as it is only use
in classes, which are also marked as InterfaceAudience.Private.
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
This revert of a revert reapplies the PR; the original application was
missing the HBASE JIRA #; thats why it was reverted and then reapplied
w/ the JIRA # added.
This reverts commit 16fe1e95ec.
Adds "hbase.master.executor.merge.dispatch.threads" and defaults to 2.
Also adds additional logging that includes the number of split plans
and merge plans computed for each normalizer run.
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Revert of the revert -- re-applying HBASE-25449 with a change
of renaming the test hdfs XML configuration file as it was adversely
affecting tests using MiniDFS
This reverts commit c218e576fe.
Co-authored-by: Josh Elser <elserj@apache.org>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
* HBASE-25379 Make retry pause time configurable for regionserver short operation RPC (reportRegionStateTransition/reportProcedureDone)
* HBASE-25379 RemoteProcedureResultReporter also should retry after the configured pause time
* Addressed the review comments
Signed-off-by: Yulin Niu <niuyulin@apache.org>
* Using ContiguousCellFormat as a marker alone
* Commit the new file
* Fix the comparator logic that was an oversight
* Fix the sequenceId check order
* Adding few more static methods that helps in scan flow like query
matcher where we have more cols
* Remove ContiguousCellFormat and ensure compare() can be inlined
* applying negation as per review comment
* Fix checkstyle comments
* fix review comments
* Address review comments
* Fix the checkstyle issues
* Fix javadoc
Signed-off-by: stack <stack@apache.org>
Signed-off-by: AnoopSamJohn <anoopsamjohn@apache.org>
Signed-off-by: huaxiangsun <huaxiangsun@apache.org>
Network identities should be bound late. Remote addresses should be
resolved at the last possible moment, just before connect(). Network
identity mappings can change, so our code should not inappropriately
cache them. Otherwise we might miss a change and fail to operate normally.
Revert "HBASE-14544 Allow HConnectionImpl to not refresh the dns on errors"
Removes hbase.resolve.hostnames.on.failure and related code. We always
resolve hostnames, as late as possible.
Preserve InetSocketAddress caching per RPC connection. Avoids potential
lookups per Call.
Replace InetSocketAddress with Address where used as a map key. If we want
to key by hostname and/or resolved address we should be explicit about it.
Using Address chooses mapping by hostname and port only.
Add metrics for potential nameservice resolution attempts, whenever an
InetSocketAddress is instantiated for connect; and metrics for failed
resolution, whenever InetSocketAddress#isUnresolved on the new instance
is true.
* Use ServerName directly to build a stub key
* Resolve and cache ISA on a RpcChannel as late as possible, at first call
* Remove now invalid unit test TestCIBadHostname
We resolve DNS at the latest possible time, at first call, and do not
resolve hostnames for creating stubs at all, so this unit test cannot
work now.
Reviewed-by: Mingliang Liu <liuml07@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
This PR is a follow-up of HBASE-25181 (#2539), where several issues were
discussed on the PR:
1. Currently we use PBKDF2WithHmacSHA1 key generation algorithm to generate a
secret key for HFile / WalFile encryption, when the user is defining a string
encryption key in the hbase shell. This algorithm is not secure enough and
not allowed in certain environments (e.g. on FIPS compliant clusters). We are
changing it to PBKDF2WithHmacSHA384. It will not break backward-compatibility,
as even the tables created by the shell using the new algorithm will be able
to load (e.g. during bulkload / replication) the HFiles serialized with the
key generated by an old algorithm, as the HFiles themselves already contain
the key necessary for their decryption.
Smaller issues fixed by this commit:
2. Improve the documentation e.g. with the changes introduced by HBASE-25181
and also by some points discussed on the Jira ticket of HBASE-25263.
3. In EncryptionUtil.createEncryptionContext the various encryption config
checks should throw IllegalStateExceptions instead of RuntimeExceptions.
4. Test cases in TestEncryptionTest.java should be broken down into smaller
tests.
5. TestEncryptionDisabled.java should use ExpectedException JUnit rule to
validate exceptions.
closes#2676
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
* HBASE-25050 - We initialize Filesystems more than once.
* Ensuring that calling the FS#get() will only ensure FS init.
* Fix for testfailures. We should pass the entire path and no the scheme
alone
* Cases where we don't have a scheme for the URI
* Address review comments
* Add some comments on why FS#get(URI, conf) is getting used
* Adding the comment as per Sean's review
Signed-off-by: Sean Busbey <busbey@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
* HBASE-25187 Improve SizeCachedKV variants initialization
* HBASE-25187 Improve SizeCachedKV variants initialization
* The BBKeyValue also can be optimized
* Change for SizeCachedKeyValue
* Addressing revew comments
* Fixing checkstyle and spot bugs comments
* Spot bug fix for hashCode
* Minor updates make the rowLen as short and some consturctor formatting
* Change two more places where there was a cast