Wire up the `ConfigurationObserver` chain for
`RegionNormalizerManager`. The following configuration keys support
hot-reloading:
* hbase.normalizer.throughput.max_bytes_per_sec
* hbase.normalizer.split.enabled
* hbase.normalizer.merge.enabled
* hbase.normalizer.min.region.count
* hbase.normalizer.merge.min_region_age.days
* hbase.normalizer.merge.min_region_size.mb
Note that support for `hbase.normalizer.period` is not provided
here. Support would need to be implemented generally for the `Chore`
subsystem.
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Aman Poonia <aman.poonia.29@gmail.com>
The core change here is to the loop in
`SimpleRegionNormalizer#computeMergeNormalizationPlans`. It's a nested
loop that walks the table's region chain once, looking for contiguous
sequences of regions that meet the criteria for merge. The outer loop
tracks the starting point of the next sequence, the inner loop looks
for the end of that sequence. A single sequence becomes an instance of
`MergeNormalizationPlan`.
Signed-off-by: Huaxiang Sun <huaxiangsun@apache.org>
Remove the RegionStates.include method as its name is ambiguous.
Add more comments to describe the logic on why we filter region like
this.
Signed-off-by: Toshihiro Suzuki <brfrn169@gmail.com>
Remove the SingletonCoprocessorService interface targeted for removal in
3.0.0.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Remove the deprecated RpcSchedulerFactory#create(Configuration,
PriorityFunction) method from the interface and in all implementing
classes.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
* HBASE-25065 WAL archival can be batched/throttled and also done by a separate thread
* Fix checkstyle issues
* Address review comments
* checkstyle comments
* Addressing final review comments
Signed-off-by: Michael Stack <stack@apache.org>
Make it so WALPlayer can replay recovered.edits files.
hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/WALInputFormat.java
Allow for WAL files that do NOT have a startime in their name.
Use the 'generic' WAL-filename parser instead of the one that
used be local here. Implement support for 'startTime' filter.
Previous was just not implemented.
hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/WALPlayer.java
Checkstyle.
hbase-server/src/main/java/org/apache/hadoop/hbase/wal/AbstractFSWALProvider.java
Use the new general WAL name timestamp parser.
hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WAL.java
Utility for parsing timestamp from WAL filename.
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRecoveredEdits.java
Export attributes about the local recovered.edits test file
so other tests can play with it.
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Implement a rate limiter for the normalizer. Implemented in terms of
MB/sec of affacted region size (the same metrics used to make
normalization decisions). Uses Guava `RateLimiter` to perform the
resource accounting. `RateLimiter` works by blocking (uninterruptible
😖) the calling thread. Thus, the whole construction of the normalizer
subsystem needed refactoring. See the provided `package-info.java` for
an overview of this new structure.
Introduces a new configuration,
`hbase.normalizer.throughput.max_bytes_per_sec`, for specifying a
limit on the throughput of actions executed by the normalizer. Note
that while this configuration value is in bytes, the minimum honored
valued `1_000_000`. Supports values configured using the
human-readable suffixes honored by `Configuration.getLongBytes`
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Huaxiang Sun <huaxiangsun@apache.com>
Signed-off-by: Michael Stack <stack@apache.org>
Change the test to wait for evidence that the active master has seen
that the backup master killed by the test has gone away. This is done
before proceeding to validate that the dead backup is correctly
omitted from the ClusterStatus report.
Also, minor fixup to several assertions, using `assertEquals` instead
of `assertTrue(...equals(...))` and correcting expected vs. actual
ordering of assertion arguments.
Signed-off-by: Michael Stack <stack@apache.org>
Give the comparator a more descriptive name, a better location,
and make it work even when passed hbase:meta WAL files.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Guanghao Zhang <zghao@apache.org>
* HBASE-24967 The table.jsp cost long time to load if the table include closed regions
* fix it by another way
* fix review issue
* fix checkstyle warnings
* fix checkstyle warning
Pass WALFactory to Replication instead of WALProvider. WALFactory has all
WALProviders in it, not just the user-space WALProvider. Do this so
ReplicationService has access to all WALProviders in the Server (To be
exploited by the follow-on patch in HBASE-25055)
Editing logging around region replicas: shortening and adding context.
Checkstyle fixes in edited files while I was in there.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Closes#2422
Untangle RegionInfo, RegionInfoBuilder, and MutableRegionInfo static
initializations some. Move MutableRegionInfo from inner-class of
RegionInfoBuilder to be (package private) standalone. Undo static
initializing references from RI to RIB.
Co-authored-by: Nick Dimiduk <ndimiduk@apache.org>
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
* Admin API getLogEntries() for ring buffer use-cases: so far, provides balancerDecision and slowLogResponse
* Refactor RPC call for similar use-cases
* Single RPC API getLogEntries() for both Master.proto and Admin.proto
Closes#2261
Signed-off-by: Andrew Purtell <apurtell@apache.org>
* Break subclass referencing of MetaCellComparator from superclass CellComparatorImpl
static initializer by moving META_COMPARATOR to subclass MetaCellComparator
Closes#2329
Signed-off-by: Duo Zhang <zhangduo@apache.org>
This patch adds the ability to discover newly added masters
dynamically on the master registry side. The trigger for the
re-fetch is either periodic (5 mins) or any registry RPC failure.
Master server information is cached in masters to avoid repeated
ZK lookups.
Updates the client side connection metrics to maintain a counter
per RPC type so that clients have visibility into counts grouped
by RPC method name.
I didn't add the method to ZK registry interface since there
is a design discussion going on in splittable meta doc. We can
add it later if needed.
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Adds region state check on hbck2 assigns/unassigns. Returns pid of -1
if in inappropriate state with logging explaination which suggests
passing override if operator wants to assign/unassign anyways. Here
is an example of what happens now if hbck2 tries an unassign and
Region already unassigned:
2020-08-19 11:22:06,926 INFO [RpcServer.default.FPBQ.Fifo.handler=1,queue=0,port=50086] assignment.AssignmentManager(820): Failed {ENCODED => d1112e553991e938b6852f87774c91ee, NAME => 'TestHbck,zzzzz,1597861310769.d1112e553991e938b6852f87774c91ee.', STARTKEY => 'zzzzz', ENDKEY => ''} unassign, override=false; set override to by-pass state checks.
org.apache.hadoop.hbase.client.DoNotRetryRegionException: Unexpected state for state=CLOSED, location=null, table=TestHbck, region=d1112e553991e938b6852f87774c91ee
at org.apache.hadoop.hbase.master.assignment.AssignmentManager.preTransitCheck(AssignmentManager.java:583)
at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createOneUnassignProcedure(AssignmentManager.java:812)
at org.apache.hadoop.hbase.master.MasterRpcServices.unassigns(MasterRpcServices.java:2616)
at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$HbckService$2.callBlockingMethod(MasterProtos.java)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:397)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
Previous it would just create the unassign anyways. Now must pass override
to queue the procedure regardless. Safer.
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
javadoc on assigns/unassigns. Minor refactor in assigns/unassigns to cater to
case where procedure may come back null (if override not set and fails state checks).
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
checkstyle cleanups.
Clarifying javadoc on how there is no state checking when bulk assigns creating/enabling
tables.
createOneAssignProcedure and createOneUnassignProcedure now handle exceptions which now
can be thrown if no override and region state is not appropriate.
Aggregation of createAssignProcedure and createUnassignProcedure instances adding in
region state check invoked if override is NOT set.
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateNode.java
Change to setProcedure so it returns passed proc as result instead of void
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Guanghao Zhang <zghao@apache.org>
Signed-off-by: Anoop Sam John <anoopsamjohn@apache.org>
Introduce an additional method to our Admin interface that allow an
operator to selectivly run the normalizer. The IPC protocol supports
general table name select via compound filter.
Signed-off-by: Sean Busbey <busbey@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
when neighbor is larger than average size
* add `testMergeEmptyRegions` to explicitly cover different
interleaving of 0-sized regions.
* fix bug where merging a 0-size region is skipped due to large
neighbor.
* remove unused `splitPoint` from `SplitNormalizationPlan`.
* generate `toString`, `hashCode`, and `equals` methods from Apache
Commons Lang3 template on `SplitNormalizationPlan` and
`MergeNormalizationPlan`.
* simplify test to use equality matching over `*NormalizationPlan`
instances as plain pojos.
* test make use of this handy `TableNameTestRule`.
* fix line-length issues in `TestSimpleRegionNormalizer`
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: huaxiangsun <huaxiangsun@apache.org>
Signed-off-by: Aman Poonia <aman.poonia.29@gmail.com>
Looped through the test 100 times and it passes. Without the patch it fails
every ~10 runs or so.
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
Allow specifying base WALEntry filter on construction of
ReplicationSource. Add means of being able to filter WALs by name.
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
Add constructor that allows passing a predicate for filtering *in* WALs
and a list of filters for filtering *out* WALEntries. The latter was
hardcoded to filter out system-table WALEntries. The former did not
exist but we'll need it if Replication takes in more than just the
default Provider.
Signed-off-by Anoop Sam John <anoopsamjohn@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Looped through the test 100 times and it passes. Without the patch it fails
every ~10 runs or so.
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Michael Stack <stack@apache.org>
Signed-off-by: Anoop Sam John <anoopsamjohn@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Sean Busbey <busbey@apache.org>
* RegionMover to ignore move failures for split/merged regions with ack mode
* Refactor MoveWithAck and MoveWithoutAck as high level classes
* UT for RegionMover gracefully handling split/merged regions while loading regions and throwing failure while loading offline regions
Closes#2172
Signed-off-by: Sean Busbey <busbey@apache.org>
Signed-off-by: Ted Yu <tyu@apache.org>
* null check for writer if not initialized yet during syncrunner run
* Revert "null check for writer if not initialized yet during syncrunner run"
This reverts commit 72932ad0df.
* Writer check while trying to attain safe point
Signed-off-by: Anoop Sam John <anoopsamjohn@apache.com>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Ramkrishna <ramkrishna@apache.org>
* refactor how we use connection and async connection to rely on their access methods
* refactor initialization and cleanup of the shared connection
* incompatibly change HCTU's Configuration member variable to be final so it can be safely accessed from multiple threads.
Closes#2180
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
We observed this delete call to be a bottleneck for table with lots of
regions. Patch attempts to parallelize them.
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Also fix three bugs:
* We were trying to delete non-empty directory; weren't doing
accounting for meta WALs where meta had moved off the server
(successfully)
* We were deleting split WALs rather than archiving them.
* We were not handling corrupt files.
Deprecations and removal of tests of old system.
Signed-off-by: Anoop Sam John <anoopsamjohn@apache.org>
- make sure to always clean up excess meta replicas not just when their
number get decreased
- make sure NotServingRegionException is handled properly even when
wrapped
- add test
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
* HBASE-24382 Flush partial stores of region filtered by seqId when archive wal due to too many wals
* fix checkstyle and javadoc issue
* fix javadoc issues
* move the geting of stores to HRegion, since it should not be part of FlushPolicy, and comment fix
* fix checkstyle issue
* add some comment
* remove the forceFlushAllStores since we can use families to determine how to select stores to flush
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: stack <stack@duboce.net>
Writing a test for this is tricky. There is enough coverage for
functional tests. Only concern is performance, but there is enough
logging for it to detect timed out/badly performing sync calls.
Additionally, this patch decouples the ZK event processing into it's
own thread rather than doing it in the EventThread's context. That
avoids deadlocks and stalls of the event thread.
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Purge query Master for table descriptors; make do w/ generic options.
Logging cleanup.
hbase-server/src/main/java/org/apache/hadoop/hbase/wal/BoundedRecoveredHFilesOutputSink.java
Undo fetching Table Descriptor. Not reliably available at recovery time.
Signed-off-by: Duo Zhang <zhangduo@apache.org>
hbase-server/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
Don't register as a chore on construction if no coordination state
manager instance (there is no instance when procv2 WAL splitter).
hbase-server/src/main/java/org/apache/hadoop/hbase/master/SplitWALManager.java
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
Edit logs.
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionRemoteProcedureBase.java
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/SplitWALProcedure.java
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/SplitWALRemoteProcedure.java
Add proc name rather than rely on default behavior. Add detail to the
toString.
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
Factoring
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
Print the maxLogs... we don't do it any where.
hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WAL.java
Utility method to strip prefix from wal path.
Signed-off-by: Guanghao Zhang <zghao@apache.org>
Signed-off-by: tianjingyun <tianjy@apache.org>
* HBASE-24205 - Avoid metric collection for flushes and compactions
* Use the existing matcher.isUserScan() to decide the scan type
* Added a comment saying when we track the metrics
* HBASE-24205 Create metric to know the number of reads that happens from memstore
* Fix checkstyles and whitespaces
* Checkstyl, whitespace and javadoc
* Fixed review comments
* Fix unused imports
* Rebase with latest commit
* Adding the table vs store metric by consolidating
* Combine get and scan metrics and make all relevant changes
* Track for full row and then increment either memstore or file read
metric
* TestMetricsStore test fix
* Only increment the memstore metric if all cells are from memstore, if
not treat as mixed reads
* Remove metricsstore and aggregate at region level
* Addresses review comments-metric name updated everywhere
* Metric name change
* Review comment changes
Co-authored-by: Ramkrishna <ramkrishna@apache.org>
Signed-off by:Anoop Sam John<anoopsamjohn@gmail.com>
Signed-off by:Viraj Jasani<virajjasani@apache.org>