ZStandard supports initialization of compressors and decompressors with a
precomputed dictionary, which can dramatically improve and speed up compression
of tables with small values. For more details, please see
The Case For Small Data Compression
https://github.com/facebook/zstd#the-case-for-small-data-compression
Signed-off-by: Duo Zhang <zhangduo@apache.org>
This reverts commit bfa45841251e3224e3fbb07599ae5dd6791b2846.
This is not ready yet. There are some code paths remaining where store
configuration (CompoundConfiguration) is not passed into the block decoding
context. Found with additional integration tests.
ZStandard supports initialization of compressors and decompressors with a
precomputed dictionary, which can dramatically improve and speed up compression
of tables with small values. For more details, please see
The Case For Small Data Compression
https://github.com/facebook/zstd#the-case-for-small-data-compression
Signed-off-by: Duo Zhang <zhangduo@apache.org>
This change introduces provided compression codecs to HBase as
new Maven modules. Each module provides compression codec support
that formerly required Hadoop native codecs, which in turn relies
on native code integration, which may or may not be available on
a given hardware platform or in an operational environment. We
now provide codecs in the HBase distribution for users whom for
whatever reason cannot or do not wish to deploy the Hadoop native
codecs.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
We introduced EnvironmentEdgeManager as a way to inject alternate clocks
for unit tests. In order for this to be effective, all callers that would
otherwise use System.currentTimeMillis() must call
EnvironmentEdgeManager.currentTime() instead, except the implementers of
EnvironmentEdge.
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
This revert of a revert reapplies the PR; the original application was
missing the HBASE JIRA #; thats why it was reverted and then reapplied
w/ the JIRA # added.
This reverts commit 16fe1e95ec5ca5528adc8c0aa7dc00f18c3f014f.
Revert of the revert -- re-applying HBASE-25449 with a change
of renaming the test hdfs XML configuration file as it was adversely
affecting tests using MiniDFS
This reverts commit c218e576fe54df208e277365f1ac24f993f2a4b1.
Co-authored-by: Josh Elser <elserj@apache.org>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
* Using ContiguousCellFormat as a marker alone
* Commit the new file
* Fix the comparator logic that was an oversight
* Fix the sequenceId check order
* Adding few more static methods that helps in scan flow like query
matcher where we have more cols
* Remove ContiguousCellFormat and ensure compare() can be inlined
* applying negation as per review comment
* Fix checkstyle comments
* fix review comments
* Address review comments
* Fix the checkstyle issues
* Fix javadoc
Signed-off-by: stack <stack@apache.org>
Signed-off-by: AnoopSamJohn <anoopsamjohn@apache.org>
Signed-off-by: huaxiangsun <huaxiangsun@apache.org>
* HBASE-25050 - We initialize Filesystems more than once.
* Ensuring that calling the FS#get() will only ensure FS init.
* Fix for testfailures. We should pass the entire path and no the scheme
alone
* Cases where we don't have a scheme for the URI
* Address review comments
* Add some comments on why FS#get(URI, conf) is getting used
* Adding the comment as per Sean's review
Signed-off-by: Sean Busbey <busbey@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
Wire up the `ConfigurationObserver` chain for
`RegionNormalizerManager`. The following configuration keys support
hot-reloading:
* hbase.normalizer.throughput.max_bytes_per_sec
* hbase.normalizer.split.enabled
* hbase.normalizer.merge.enabled
* hbase.normalizer.min.region.count
* hbase.normalizer.merge.min_region_age.days
* hbase.normalizer.merge.min_region_size.mb
Note that support for `hbase.normalizer.period` is not provided
here. Support would need to be implemented generally for the `Chore`
subsystem.
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Aman Poonia <aman.poonia.29@gmail.com>
The core change here is to the loop in
`SimpleRegionNormalizer#computeMergeNormalizationPlans`. It's a nested
loop that walks the table's region chain once, looking for contiguous
sequences of regions that meet the criteria for merge. The outer loop
tracks the starting point of the next sequence, the inner loop looks
for the end of that sequence. A single sequence becomes an instance of
`MergeNormalizationPlan`.
Signed-off-by: Huaxiang Sun <huaxiangsun@apache.org>
* Break subclass referencing of MetaCellComparator from superclass CellComparatorImpl
static initializer by moving META_COMPARATOR to subclass MetaCellComparator
Closes#2329
Signed-off-by: Duo Zhang <zhangduo@apache.org>
when neighbor is larger than average size
* add `testMergeEmptyRegions` to explicitly cover different
interleaving of 0-sized regions.
* fix bug where merging a 0-size region is skipped due to large
neighbor.
* remove unused `splitPoint` from `SplitNormalizationPlan`.
* generate `toString`, `hashCode`, and `equals` methods from Apache
Commons Lang3 template on `SplitNormalizationPlan` and
`MergeNormalizationPlan`.
* simplify test to use equality matching over `*NormalizationPlan`
instances as plain pojos.
* test make use of this handy `TableNameTestRule`.
* fix line-length issues in `TestSimpleRegionNormalizer`
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: huaxiangsun <huaxiangsun@apache.org>
Signed-off-by: Aman Poonia <aman.poonia.29@gmail.com>
* refactor how we use connection and async connection to rely on their access methods
* refactor initialization and cleanup of the shared connection
* incompatibly change HCTU's Configuration member variable to be final so it can be safely accessed from multiple threads.
Closes#2180
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
This utility is useful for any module that wants to detect
dynamic config changes. Having it to hbase-common makes it
accessible to all the other modules.
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Move the random free port generate back into hbasecommontestingutility
from hbasetestingutility.
Add a create simple kdc server utility that will start a kdc server and
if a bindexception, create a new one on a new random port in hbase-common.
Add new BoundSocketMaker helpful when trying to manufacture
BindExceptions because of port clash.
Change thrift and http kdc tests to use this new utility (removes
code duplication around kdc server setup).