Avoid the pattern where a Random object is allocated, used once or twice, and
then left for GC. This pattern triggers warnings from some static analysis tools
because this pattern leads to poor effective randomness. In a few cases we were
legitimately suffering from this issue; in others a change is still good to
reduce noise in analysis results.
Use ThreadLocalRandom where there is no requirement to set the seed to gain
good reuse.
Where useful relax use of SecureRandom to simply Random or ThreadLocalRandom,
which are unlikely to block if the system entropy pool is low, if we don't need
crypographically strong randomness for the use case. The exception to this is
normalization of use of Bytes#random to fill byte arrays with randomness.
Because Bytes#random may be used to generate key material it must be backed by
SecureRandom.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
For the `TraceUtil` methods that accept `Callable` and `Runnable` types, make them generic over a
child of `Throwable`. This allows us to consolidate the two method signatures into a single more
flexible definition.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
For batch operations, collect and annotate the associated span with the set of all operations
contained in the batch.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
For batch operations, collect and annotate the associated span with the set of all operations
contained in the batch.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Add support for `db.system`, `db.connection_string`, `db.user`.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Huaxiang Sun <huaxiangsun@apache.org>
Co-authored-by: Josh Elser <josh.elser@gmail.com>
Follows the guidance outlined in https://github.com/open-telemetry/opentelemetry-specification/blob/3e380e2/specification/trace/semantic_conventions/database.dm
* all table data operations are assumed to be of type CLIENT
* populate `db.name` and `db.operation` attributes
* name table data operation spans as `db.operation` `db.name`:`db.hbase.table`
note: this implementation deviates from the recommended `db.name`.`db.sql.table` and instead
uses HBase's native String representation of namespace:tablename.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
ZStandard supports initialization of compressors and decompressors with a
precomputed dictionary, which can dramatically improve and speed up compression
of tables with small values. For more details, please see
The Case For Small Data Compression
https://github.com/facebook/zstd#the-case-for-small-data-compression
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Conflicts:
hbase-compression/hbase-compression-zstd/src/main/java/org/apache/hadoop/hbase/io/compress/zstd/ZstdCodec.java
This reverts commit 8ac0b5ed7f.
This is not ready yet. There are some code paths remaining where store
configuration (CompoundConfiguration) is not passed into the block decoding
context. Found with additional integration tests.
ZStandard supports initialization of compressors and decompressors with a
precomputed dictionary, which can dramatically improve and speed up compression
of tables with small values. For more details, please see
The Case For Small Data Compression
https://github.com/facebook/zstd#the-case-for-small-data-compression
Signed-off-by: Duo Zhang <zhangduo@apache.org>
We get and retain Compressor instances in HFileBlockDefaultEncodingContext,
and could in theory call Compressor#reinit when setting up the context,
to update compression parameters like level and buffer size, but we do
not plumb through the CompoundConfiguration from the Store into the
encoding context. As a consequence we can only update codec parameters
globally in system site conf files.
Fine grained configurability is important for algorithms like ZStandard
(ZSTD), which offers more than 20 compression levels, where at level 1
it is almost as fast as LZ4, and where at higher levels it utilizes
computationally expensive techniques to rival LZMA at compression ratio
but trades off significantly for reduced compresson throughput. The ZSTD
level that should be set for a given column family or table will vary by
use case.
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Conflicts:
hbase-compression/hbase-compression-zstd/src/main/java/org/apache/hadoop/hbase/io/compress/zstd/ZstdDecompressor.java
hbase-server/src/test/java/org/apache/hadoop/hbase/io/compress/HFileTestBase.java
hbase-server/src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
hbase-server/src/test/java/org/apache/hadoop/hbase/io/encoding/TestSeekToBlockWithEncoders.java
This change introduces provided compression codecs to HBase as
new Maven modules. Each module provides compression codec support
that formerly required Hadoop native codecs, which in turn relies
on native code integration, which may or may not be available on
a given hardware platform or in an operational environment. We
now provide codecs in the HBase distribution for users whom for
whatever reason cannot or do not wish to deploy the Hadoop native
codecs.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Conflicts:
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileEncryption.java
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
hbase-server/src/test/java/org/apache/hadoop/hbase/wal/TestCompressedWAL.java
Introduce `hfile.onheap.block.cache.fixed.size`
and default to disable. when using ClientSideRegionScanner
it will be enabled with a fixed size for caching
INDEX/LEAF_INDEX block when a client, e.g.
snapshot scanner, scans the entire HFile
and does not need to seek/reseek to index
block multiple times.
Signed-off-by: Josh Elser <elserj@apache.org>
17/17 commits of HBASE-22120, original commit f36e153964
Co-authored-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
10/17 commits of HBASE-22120, original commit f6ff519dd0
Co-authored-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
8/17 commits of HBASE-22120, original commit 2be2c63f0d
Co-authored-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
6/17 commits of HBASE-22120, original commit ae2c62ffaa
Co-authored-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
In some situations, a caller may know that it is properly managing the
Kerberos ticket to talk to HBase. In these situations, it's possible
that AuthUtil still tries to do renewals, but just fails repeatedly to
do so. Give a configuration flag for such clients to be able to tell
AuthUtil to simply stop trying.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Depending on which compression codec is used, a short read of the
compressed bytes can cause catastrophic errors that confuse the WAL reader.
This problem can manifest when the reader is actively tailing the WAL for
replication. To avoid these issues when WAL value compression is enabled,
BoundedDelegatingInputStream should assume enough bytes are available to
supply a reader up to its bound. This behavior is valid per the contract
of available(), which provides an _estimate_ of available bytes, and
equivalent to IOUtils.readFully but without requiring an intermediate
buffer.
Added TestReplicationCompressedWAL and TestReplicationValueCompressedWAL.
Without the WALCellCodec change TestReplicationValueCompressedWAL will
fail.
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Undo asserts that LZ4 and SNAPPY fails if their native libs are NOT
loaded; as of hadoop 3.3.1, LZ4 and SNAPPY can work w/o native libs.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
WAL storage can be expensive, especially if the cell values
represented in the edits are large, consisting of blobs or
significant lengths of text. Such WALs might need to be kept around
for a fairly long time to satisfy replication constraints on a space
limited (or space-contended) filesystem.
We have a custom dictionary compression scheme for cell metadata that
is engaged when WAL compression is enabled in site configuration.
This is fine for that application, where we can expect the universe
of values and their lengths in the custom dictionaries to be
constrained. For arbitrary cell values it is better to use one of the
available compression codecs, which are suitable for arbitrary albeit
compressible data.
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
First extract an hbase-coprocessor module used by hbase-client, hbase-server.
This is prerequisite to extracting an hbase-wal module.
M hbase-common/src/main/java/org/apache/hadoop/hbase/Abortable.java
M hbase-common/src/main/java/org/apache/hadoop/hbase/DoNotRetryIOException.java
M hbase-common/src/main/java/org/apache/hadoop/hbase/util/SortedList.java
Move to hbase-common. Its a generic Interface. Need by
M hbase-coprocessor/src/main/java/org/apache/hadoop/hbase/Coprocessor.java
M hbase-coprocessor/src/main/java/org/apache/hadoop/hbase/CoprocessorEnvironment.java
M hbase-coprocessor/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEnvironment.java
M hbase-coprocessor/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
M hbase-coprocessor/src/main/java/org/apache/hadoop/hbase/coprocessor/CoreCoprocessor.java
M hbase-coprocessor/src/main/java/org/apache/hadoop/hbase/coprocessor/ObserverContext.java
M hbase-coprocessor/src/main/java/org/apache/hadoop/hbase/coprocessor/ObserverContextImpl.java
M hbase-coprocessor/src/main/java/org/apache/hadoop/hbase/coprocessor/ReadOnlyConfiguration.java
Move to hbase-coprocessor.
M hbase-endpoint/src/main/java/org/apache/hadoop/hbase/client/coprocessor/BigDecimalColumnInterpreter.java
M hbase-client/src/main/java/org/apache/hadoop/hbase/client/coprocessor/DoubleColumnInterpreter.java
M hbase-endpoint/src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
M hbase-endpoint/src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java
Moved to hbase-endpoint where they are used.
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java
Include region name when toString'd.
M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCoprocessorHost.java
Include WAL name when toString'd.
M hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java
Add utility used in testing here from CoprocessorHost.
Added new interface in TableDescriptor which allows user to define RSGroup name while creating or modifying a table.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Pankaj Kumar<pankajkumar@apache.org>
Adds "hbase.master.executor.merge.dispatch.threads" and defaults to 2.
Also adds additional logging that includes the number of split plans
and merge plans computed for each normalizer run.
(cherry picked from commit 36b4698cea)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Revert of the revert -- re-applying HBASE-25449 with a change
of renaming the test hdfs XML configuration file as it was adversely
affecting tests using MiniDFS
This reverts commit c218e576fe.
Co-authored-by: Josh Elser <elserj@apache.org>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
* HBASE-25379 Make retry pause time configurable for regionserver short operation RPC (reportRegionStateTransition/reportProcedureDone)
* HBASE-25379 RemoteProcedureResultReporter also should retry after the configured pause time
* Addressed the review comments
Signed-off-by: Yulin Niu <niuyulin@apache.org>
(cherry picked from commit c96fbf0407)
* Using ContiguousCellFormat as a marker alone
* Commit the new file
* Fix the comparator logic that was an oversight
* Fix the sequenceId check order
* Adding few more static methods that helps in scan flow like query
matcher where we have more cols
* Remove ContiguousCellFormat and ensure compare() can be inlined
* applying negation as per review comment
* Fix checkstyle comments
* fix review comments
* Address review comments
Signed-off-by: stack <stack@apache.org>
Signed-off-by: AnoopSamJohn <anoopsamjohn@apache.org>
Signed-off-by: huaxiangsun <huaxiangsun@apache.org>