hbase

Commit Graph

Author	SHA1	Message	Date
Tak Lon (Stephen) Wu	1637ebbc00	HBASE-27589 Rename TestConnectionImplementation in hbase-it to fix javadoc failure (#4990) Signed-off-by: Andrew Purtell <apurtell@apache.org>	2023-01-24 18:58:31 -08:00
Mallikarjun	fd11b9bfcf	HBASE-27238 Backport backup restore to 2.x (#4770) Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org>	2023-01-23 19:51:17 -05:00
Nick Dimiduk	4f8e34f436	HBASE-27568 ChaosMonkey add support for JournalNodes (#4963) Signed-off-by: Reid Chan <reidchan@apache.org>	2023-01-17 18:36:38 +01:00
Nick Dimiduk	6d82dc1e0b	HBASE-27567 Introduce ChaosMonkey Action to print HDFS Cluster status Signed-off-by: Reid Chan <reidchan@apache.org> Signed-off-by: Duo Zhang <zhangduo@apache.org>	2023-01-16 15:49:12 +01:00
Nick Dimiduk	4855e71468	HBASE-27563 ChaosMonkey sometimes generates invalid boundaries for random item selection Signed-off-by: Duo Zhang <zhangduo@apache.org>	2023-01-12 17:55:25 +01:00
Vaibhav Joshi	2e94f6fb50	HBASE-27527 Port HBASE-27498 to branch-2 (#4919) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>	2022-12-12 11:01:09 -08:00
Duo Zhang	738ed028ad	HBASE-27434 Use ${revision} as placeholder for maven version to make it easier to control the version from command line (#4836) Signed-off-by: GeorryHuang <huangzhuoyue@apache.org> (cherry picked from commit `2fc879e863`)	2022-10-24 12:05:01 +08:00
Duo Zhang	5cab4be075	HBASE-27401 Clean up current broken 'n's in our javadoc (#4812 ) Signed-off-by: Andrew Purtell <apurtell@apache.org> (cherry picked from commit `63cdd026f0`) Conflicts: hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupManager.java hbase-client/src/main/java/org/apache/hadoop/hbase/HRegionLocation.java hbase-client/src/main/java/org/apache/hadoop/hbase/client/Append.java hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java hbase-client/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueExcludeFilter.java hbase-client/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.java hbase-client/src/main/java/org/apache/hadoop/hbase/security/visibility/VisibilityClient.java hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/RequestConverter.java hbase-common/src/main/java/org/apache/hadoop/hbase/CellUtil.java hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestReplication.java hbase-rest/src/test/java/org/apache/hadoop/hbase/rest/client/TestRemoteTable.java hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManager.java hbase-server/src/main/java/org/apache/hadoop/hbase/HBaseServerBase.java hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java hbase-server/src/main/java/org/apache/hadoop/hbase/mob/MobUtils.java hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/flush/RegionServerFlushTableProcedureManager.java hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Region.java hbase-server/src/main/java/org/apache/hadoop/hbase/replication/HBaseReplicationEndpoint.java hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSource.java hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtil.java hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 
hbase-server/src/test/java/org/apache/hadoop/hbase/tool/TestBulkLoadHFilesSplitRecovery.java hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift/ThriftUtilities.java	2022-10-06 22:13:14 +08:00
Duo Zhang	54089b4d9c	HBASE-27409 Fix the javadoc for WARCRecord (#4814) Signed-off-by: Andrew Purtell <apurtell@apache.org> (cherry picked from commit `ced1d642ae`)	2022-10-06 18:19:01 +08:00
Andrew Purtell	e5f551ebf7	HBASE-27252 Clean up error-prone findings in hbase-it Close #4662 Co-authored-by: Duo Zhang <zhangduo@apache.org> Signed-off-by: Duo Zhang <zhangduo@apache.org> Signed-off-by: Viraj Jasani <vjasani@apache.org> (cherry picked from commit `1004876bad`) Conflicts: hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestBackupRestore.java hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestIngest.java hbase-it/src/test/java/org/apache/hadoop/hbase/trace/IntegrationTestSendTraceRequests.java	2022-08-20 23:39:06 +08:00
Duo Zhang	b820188c68	HBASE-27278 Improve TestTlsIPC to reuse existing IPC test code (#4682) Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org> (cherry picked from commit `3309108ca7`)	2022-08-12 13:21:11 +08:00
Duo Zhang	84e7cdb2c0	HBASE-27222 Purge FutureReturnValueIgnored warnings from error prone (#4634) Signed-off-by: Andrew Purtell <apurtell@apache.org> (cherry picked from commit `8b091c4061`)	2022-07-26 23:45:15 +08:00
Duo Zhang	99f2ab5aa8	HBASE-27220 Apply the spotless format change in HBASE-27208 to our code base Signed-off-by: Andrew Purtell <apurtell@apache.org>	2022-07-19 10:00:31 +08:00
Andrew Purtell	4e19892a53	HBASE-27088 IntegrationLoadTestCommonCrawl async load improvements (#4488 ) * HBASE-27088 IntegrationLoadTestCommonCrawl async load improvements - Use an async client and work stealing executor for parallelism during loads. - Remove the verification read retries, these are not that effective during replication lag anyway. - Increase max task attempts because S3 might throttle. - Implement a side task that exercises Increments by extracting urls from content and updating a cf that tracks referrer counts. These are not validated at this time. It could be possible to log the increments, sum them with a reducer, and then verify the total, but this is left as a future exercise. Signed-off-by: Viraj Jasani <vjasani@apache.org> * Sum RPC time for writes (loader) and reads (verifier) and mutation bytes submitted. Expose as job counters. * Fix an issue with completion chaining * Pause loading if too many operations are in flight	2022-07-13 09:02:00 -07:00
BukrosSzabolcs	0727015bf5	HBASE-22749 Distributed MOB compactions (#4581 ) * HBASE-22749 Distributed MOB compactions - MOB compaction is now handled in-line with per-region compaction on region servers - regions with mob data store per-hfile metadata about which mob hfiles are referenced - admin requested major compaction will also rewrite MOB files; periodic RS initiated major compaction will not - periodically a chore in the master will initiate a major compaction that will rewrite MOB values to ensure it happens. controlled by 'hbase.mob.compaction.chore.period'. default is weekly - control how many RS the chore requests major compaction on in parallel with 'hbase.mob.major.compaction.region.batch.size'. default is as parallel as possible. - periodic chore in master will scan backing hfiles from regions to get the set of referenced mob hfiles and archive those that are no longer referenced. control period with 'hbase.master.mob.cleaner.period' - Optionally, RS that are compacting mob files can limit write amplification by not rewriting values from mob hfiles over a certain size limit. opt-in by setting 'hbase.mob.compaction.type' to 'optimized'. control threshold by 'hbase.mob.compactions.max.file.size'. default is 1GiB - Should smoothly integrate with existing MOB users via rolling upgrade. will delay old MOB file cleanup until per-region compaction has managed to compact each region at least once so that used mob hfile metadata can be gathered. * HBASE-22749 Distributed MOB compactions fix RestrictedApi Co-authored-by: Vladimir Rodionov <vrodionov@apache.org> Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>	2022-06-30 20:44:45 +01:00
Duo Zhang	40c2743a0c	HBASE-27023 Fix license issues after running spotless:apply (#4458) Signed-off-by: Peter Somogyi <psomogyi@apache.org> Signed-off-by: Xiaolin Ha <haxiaolin@apache.org> (cherry picked from commit `e555ac4a99`)	2022-06-02 20:20:43 +08:00
huaxiangsun	8703cacba5	HBASE-26984 Chaos Monkey thread dies in ITBLL Chaos GracefulRollingRestartRsAction (#4383) (#4408) There are two cases here: 1. The Chaos Monkey thread died and there is no chaos after that. 2. Sometimes regions are moved back so quickly that the region server has not finished its initialization yet. Wait some time to make sure that the region server finishes its initialization. Signed-off-by: Wellington Chevreuil <wellington.chevreuil@gmail.com>	2022-05-05 13:43:15 -05:00
Duo Zhang	1a5b1b266c	HBASE-26899 Run spotless:apply	2022-05-01 22:41:49 +08:00
Duo Zhang	e7eb628025	HBASE-26922 Fix LineLength warnings as much as possible if it can not be fixed by spotless (#4324) Signed-off-by: Yulin Niu <niuyulin@apache.org> (cherry picked from commit `3ae0d9012c`)	2022-04-09 23:13:49 +08:00
Nick Dimiduk	321c35a6ef	HBASE-26834 Adapt ConnectionRule for both sync and async connections Signed-off-by: Duo Zhang <zhangduo@apache.org>	2022-03-21 12:42:18 +01:00
Duo Zhang	ba14796289	Revert "HBASE-26813 Remove javax.ws.rs-api dependency (#4191)" MiniYARNCluster needs it This reverts commit `abde344767`.	2022-03-19 19:45:29 +08:00
Nick Dimiduk	abde344767	HBASE-26813 Remove javax.ws.rs-api dependency (#4191) This is no longer needed since we've transitioned to the shaded Jersey shipped in hbase-thirdparty. Also drop supplemental models entry. Signed-off-by: Duo Zhang <zhangduo@apache.org> Signed-off-by: Andrew Purtell <apurtell@apache.org>	2022-03-16 17:24:20 +01:00
Duo Zhang	340cc6c6f1	HBASE-26802 Backport the log4j2 changes to branch-2 (#4166) Signed-off-by: Andrew Purtell <apurtell@apache.org>	2022-03-11 11:17:43 -08:00
Andrew Purtell	300f9b9576	HBASE-26582 Prune use of Random and SecureRandom objects (#4118) Avoid the pattern where a Random object is allocated, used once or twice, and then left for GC. This pattern triggers warnings from some static analysis tools because this pattern leads to poor effective randomness. In a few cases we were legitimately suffering from this issue; in others a change is still good to reduce noise in analysis results. Use ThreadLocalRandom where there is no requirement to set the seed to gain good reuse. Where useful relax use of SecureRandom to simply Random or ThreadLocalRandom, which are unlikely to block if the system entropy pool is low, if we don't need cryptographically strong randomness for the use case. The exception to this is normalization of use of Bytes#random to fill byte arrays with randomness. Because Bytes#random may be used to generate key material it must be backed by SecureRandom. Signed-off-by: Duo Zhang <zhangduo@apache.org>	2022-03-08 15:22:00 -08:00
Duo Zhang	4644efb9f3	HBASE-26691 Replacing log4j with reload4j for branch-2.x (#4050) Signed-off-by: Andrew Purtell <apurtell@apache.org>	2022-03-04 12:06:34 -08:00
BukrosSzabolcs	77bb153a2e	HBASE-26707: Reduce number of renames during bulkload (#4066) (#4122) Signed-off-by: Wellington Ramos Chevreuil <wchevreuil@apache.org>	2022-02-25 20:11:41 +00:00
Nick Dimiduk	d8085b43fc	HBASE-26614 Refactor code related to "dump"ing ZK nodes (#3969) The code starting at `ZKUtil.dump(ZKWatcher)` is a small mess – it has cyclic dependencies woven through itself, `ZKWatcher` and `RecoverableZooKeeper`. It also initializes a static variable in `ZKUtil` through the factory for `RecoverableZooKeeper` instances. Let's decouple and clean it up. Signed-off-by: Duo Zhang <zhangduo@apache.org> Signed-off-by: Josh Elser <elserj@apache.org>	2022-01-25 09:08:35 -08:00
Duo Zhang	e53712ae99	HBASE-26523 Upgrade hbase-thirdparty dependency to 4.0.1 (#3988) Signed-off-by: GeorryHuang <huangzhuoyue@apache.org>	2021-12-31 12:10:08 +08:00
Wellington Ramos Chevreuil	5defd8c35f	HBASE-26556 IT and Chaos Monkey improvements (#3932) Signed-off-by: Josh Elser <elserj@apache.org> Reviewed-by: Tak Lon (Stephen) Wu <taklwu@apache.org> (cherry picked from commit `a36d41af73`)	2021-12-14 21:26:53 +00:00
Andrew Purtell	b1bc5f3a5c	Renumber to 2.6.0-SNAPSHOT after branching branch-2.5 Signed-off-by: Andrew Purtell <apurtell@apache.org>	2021-12-08 16:54:32 -08:00
Andrew Purtell	42ff3ac22e	HBASE-26349 Improve recent change to IntegrationTestLoadCommonCrawl (#3744) Use a hybrid logical clock for timestamping entries. Using BufferedMutator without HLC was not good because we assign client timestamps, and the store loop is fast enough that on rare occasion two temporally adjacent URLs in the set of WARCs are equivalent and the timestamp does not advance, leading later to a rare false positive CORRUPT finding. While making changes, support direct S3N paths as input paths on the command line. Signed-off-by: Viraj Jasani <vjasani@apache.org>	2021-10-19 14:20:08 -07:00
Andrew Purtell	c3c7d36578	HBASE-26335 Minor improvements to IntegrationTestLoadCommonCrawl (#3731) - Use BufferedMutator instead of Table. - Improve row key generator. - Improve retries and log levels. Signed-off-by: Viraj Jasani <vjasani@apache.org>	2021-10-08 10:01:58 -07:00
Tak Lon (Stephen) Wu	d0a53e3f29	HBASE-26133 Backport HBASE-25591 "Upgrade opentelemetry to 0.17.1" to branch-2 (#3608) 10/17 commits of HBASE-22120, original commit `f6ff519dd0` Co-authored-by: Duo Zhang <zhangduo@apache.org> Signed-off-by: Duo Zhang <zhangduo@apache.org> Signed-off-by: Peter Somogyi <psomogyi@apache.org>	2021-09-01 15:29:09 -07:00
Tak Lon (Stephen) Wu	665305cc3b	HBASE-26124 Backport HBASE-25373 "Remove HTrace completely in code base and try to make use of OpenTelemetry" to branch-2 (#3529) 1/17 commits of HBASE-22120 Signed-off-by: Peter Somogyi <psomogyi@apache.org>	2021-09-01 15:29:09 -07:00
Tak Lon (Stephen) Wu	c11a3e1b39	Revert "HBASE-26124 Backport HBASE-25373 "Remove HTrace completely in code base and try to make use of OpenTelemetry" to branch-2 (#3529)" This reverts commit `f049301606`.	2021-08-04 15:55:13 -07:00
Tak Lon (Stephen) Wu	f049301606	HBASE-26124 Backport HBASE-25373 "Remove HTrace completely in code base and try to make use of OpenTelemetry" to branch-2 (#3529) 1/17 commits of HBASE-22120 Signed-off-by: Peter Somogyi <psomogyi@apache.org>	2021-07-29 09:15:10 -07:00
Andrew Purtell	a4e8ee183e	HBASE-25911 Replace calls to System.currentTimeMillis with EnvironmentEdgeManager.currentTime (#3302 ) We introduced EnvironmentEdgeManager as a way to inject alternate clocks for unit tests. In order for this to be effective, all callers that would otherwise use System.currentTimeMillis() must call EnvironmentEdgeManager.currentTime() instead, except the implementers of EnvironmentEdge. Signed-off-by: Bharath Vissapragada <bharathv@apache.org> Signed-off-by: Duo Zhang <zhangduo@apache.org> Signed-off-by: Viraj Jasani <vjasani@apache.org> Conflicts: hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupManager.java hbase-backup/src/test/java/org/apache/hadoop/hbase/backup/TestBackupBase.java hbase-backup/src/test/java/org/apache/hadoop/hbase/backup/TestBackupManager.java hbase-backup/src/test/java/org/apache/hadoop/hbase/backup/TestBackupSystemTable.java hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncConnectionTracing.java hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncRegionLocatorTracing.java hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestBackupRestore.java hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestManyRegions.java hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/MoveRegionsOfTableAction.java hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/mapreduce/TestWALRecordReader.java hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/replication/TestVerifyReplicationCrossDiffHdfs.java hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/snapshot/TestExportSnapshotV1NoCluster.java hbase-rsgroup/src/test/java/org/apache/hadoop/hbase/rsgroup/TestEnableRSGroups.java hbase-rsgroup/src/test/java/org/apache/hadoop/hbase/rsgroup/TestRSGroupsAdmin2.java hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/CallRunner.java hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SimpleRpcServer.java 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateStore.java hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/SimpleLoadBalancer.java hbase-server/src/main/java/org/apache/hadoop/hbase/wal/AbstractWALRoller.java hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java hbase-server/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java hbase-server/src/test/java/org/apache/hadoop/hbase/TestMetaTableAccessor.java hbase-server/src/test/java/org/apache/hadoop/hbase/TestMetaUpdatesGoToPriorityQueue.java hbase-server/src/test/java/org/apache/hadoop/hbase/TestSerialization.java hbase-server/src/test/java/org/apache/hadoop/hbase/backup/TestHFileArchiving.java hbase-server/src/test/java/org/apache/hadoop/hbase/client/RestoreSnapshotFromClientSimpleTestBase.java hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin2.java hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestConnection.java hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide3.java hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestMobCloneSnapshotFromClientCloneLinksAfterDelete.java hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestMutationGetCellBuilder.java hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestSnapshotMetadata.java hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterMetricsWrapper.java hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMetaAssignmentWithStopMaster.java hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestRegionStateStore.java hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestLogsCleaner.java 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestReplicationHFileCleaner.java hbase-server/src/test/java/org/apache/hadoop/hbase/master/janitor/TestCatalogJanitor.java hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureSchedulerPerformanceEvaluation.java hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestCloneSnapshotProcedure.java hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestRestoreSnapshotProcedure.java hbase-server/src/test/java/org/apache/hadoop/hbase/procedure2/store/region/RegionProcedureStorePerformanceEvaluation.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionServerBulkLoad.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHStoreFile.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMajorCompaction.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionOpen.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSimpleTimeRangeTracker.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestCloseChecker.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/AbstractTestProtobufLog.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/AbstractTestWALReplay.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestCombinedAsyncWriter.java hbase-server/src/test/java/org/apache/hadoop/hbase/replication/master/TestRecoverStandbyProcedure.java hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestFlushSnapshotFromClient.java 
hbase-server/src/test/java/org/apache/hadoop/hbase/wal/TestSyncReplicationWALProvider.java hbase-thrift/src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServer.java hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java	2021-06-01 12:41:15 -07:00
Michael Stack	61d9b46aab	HBASE-25867 Extra doc around ITBLL (#3242 ) * HBASE-25867 Extra doc around ITBLL Minor edits to a few log messages. Explain how the '-c' option works when passed to ChaosMonkeyRunner. Some added notes on ITBLL. Fix whacky 'R' and 'Not r' thing in Master (shows when you run ITBLL). In HRS, report hostname and port when it checks in (was debugging issue where Master and HRS had different notions of its hostname). Spare a dirty FNFException on startup if base dir not yet in place. * Address Review by Sean Signed-off-by: Sean Busbey <busbey@apache.org>	2021-05-11 19:24:33 +01:00
Andrew Purtell	bf43006b9d	HBASE-25824 IntegrationTestLoadCommonCrawl (#3208 ) This integration test loads successful resource retrieval records from the Common Crawl (https://commoncrawl.org/) public dataset into an HBase table and writes records that can be used to later verify the presence and integrity of those records. Run like: ./bin/hbase org.apache.hadoop.hbase.test.IntegrationTestLoadCommonCrawl \ -Dfs.s3n.awsAccessKeyId=<AWS access key> \ -Dfs.s3n.awsSecretAccessKey=<AWS secret key> \ /path/to/test-CC-MAIN-2021-10-warc.paths.gz \ /path/to/tmp/warc-loader-output Access to the Common Crawl dataset in S3 is made available to anyone by Amazon AWS, but Hadoop's S3N filesystem still requires valid access credentials to initialize. The input path can either specify a directory or a file. The file may optionally be compressed with gzip. If a directory, the loader expects the directory to contain one or more WARC files from the Common Crawl dataset. If a file, the loader expects a list of Hadoop S3N URIs which point to S3 locations for one or more WARC files from the Common Crawl dataset, one URI per line. Lines should be terminated with the UNIX line terminator. Included in hbase-it/src/test/resources/CC-MAIN-2021-10-warc.paths.gz is a list of all WARC files comprising the Q1 2021 crawl archive. There are 64,000 WARC files in this data set, each containing ~1GB of gzipped data. The WARC files contain several record types, such as metadata, request, and response, but we only load the response record types. If the HBase table schema does not specify compression (by default) there is roughly a 10x expansion. Loading the full crawl archive results in a table approximately 640 TB in size. The hadoop-aws jar will be needed at runtime to instantiate the S3N filesystem. Use the -files ToolRunner argument to add it. 
You can also split the Loader and Verify stages: Load with: ./bin/hbase 'org.apache.hadoop.hbase.test.IntegrationTestLoadCommonCrawl$Loader' \ -files /path/to/hadoop-aws.jar \ -Dfs.s3n.awsAccessKeyId=<AWS access key> \ -Dfs.s3n.awsSecretAccessKey=<AWS secret key> \ /path/to/test-CC-MAIN-2021-10-warc.paths.gz \ /path/to/tmp/warc-loader-output Verify with: ./bin/hbase 'org.apache.hadoop.hbase.test.IntegrationTestLoadCommonCrawl$Verify' \ /path/to/tmp/warc-loader-output Signed-off-by: Michael Stack <stack@apache.org> Conflicts: pom.xml	2021-05-03 18:01:43 -07:00
niuyulin	9cf8a48d20	HBASE-25777 Fix wrong initialization value in StressAssignmentManagerMonkeyFactory (#3164) Signed-off-by: meiyi <myimeiyi@gmail.com>	2021-04-19 17:52:53 +08:00
Pankaj	9a170e2c8b	HBASE-25502 IntegrationTestMTTR fails with TableNotFoundException (#2879)	2021-01-13 11:02:41 -08:00
Viraj Jasani	0788547fea	HBASE-25474 : Bump HBase version on branch-2 (#2871) Signed-off-by: stack <stack@apache.org>	2021-01-12 10:20:22 +05:30
Mate Szalay-Beko	95dc87be23	HBASE-25318 Config option for IntegrationTestImportTsv where to generate HFiles to bulkload (#2777 ) IntegrationTestImportTsv is generating HFiles under the working directory of the current hdfs user executing the tool, before bulkloading it into HBase. Assuming you encrypt the HBase root directory within HDFS (using HDFS Transparent Encryption), you can bulkload HFiles only if they sit in the same encryption zone in HDFS as the HBase root directory itself. When IntegrationTestImportTsv is executed against a real distributed cluster and the working directory of the current user (e.g. /user/hbase) is not in the same encryption zone as the HBase root directory (e.g. /hbase/data) then you will get an exception: ``` ERROR org.apache.hadoop.hbase.regionserver.HRegion: There was a partial failure due to IO when attempting to load d : hdfs://mycluster/user/hbase/test-data/22d8460d-04cc-e032-88ca-2cc20a7dd01c/ IntegrationTestImportTsv/hfiles/d/74655e3f8da142cb94bc31b64f0475cc org.apache.hadoop.ipc.RemoteException(java.io.IOException): /user/hbase/test-data/22d8460d-04cc-e032-88ca-2cc20a7dd01c/ IntegrationTestImportTsv/hfiles/d/74655e3f8da142cb94bc31b64f0475cc can't be moved into an encryption zone. ``` In this commit I make it configurable where the IntegrationTestImportTsv generates the HFiles. Co-authored-by: Mate Szalay-Beko <symat@apache.com> Signed-off-by: Peter Somogyi <psomogyi@apache.org>	2021-01-05 10:27:38 +01:00
Lokesh Khurana	6eee9b1049	HBASE-24620 : Add a ClusterManager which submits command to ZooKeeper and its Agent which picks and execute those Commands (#2299) Signed-off-by: Aman Poonia <apoonia@salesforce.com> Signed-off-by: Viraj Jasani <vjasani@apache.org>	2020-12-21 15:36:02 +05:30
Duo Zhang	37c2ffdc2b	HBASE-25164 Make ModifyTableProcedure support changing meta replica count (#2513) Signed-off-by: Michael Stack <stack@apache.org>	2020-10-13 10:13:48 +08:00
Duo Zhang	7a3bb8aefe	HBASE-25037 Lots of thread pool are changed to non daemon after HBASE-24750 which causes trouble when shutting down (#2407) Signed-off-by: Viraj Jasani <vjasani@apache.org>	2020-09-16 22:03:42 +08:00
Joseph295	4acd6735fd	HBASE-24992 log after Generator success when running ITBLL (#2358) Signed-off-by: Guanghao Zhang <zghao@apache.org>	2020-09-09 11:08:26 +08:00
Duo Zhang	4455856e9c	HBASE-23834 HBase fails to run on Hadoop 3.3.0/3.2.2/3.1.4 due to jetty version mismatch (#2222) Signed-off-by: Viraj Jasani <vjasani@apache.org> Signed-off-by: Josh Elser <elserj@apache.org> Signed-off-by: Peter Somogyi <psomogyi@apache.org>	2020-08-25 15:02:55 +08:00
Nick Dimiduk	c0d7bfb6f7	HBASE-24662 Update DumpClusterStatusAction to notice changes in region server count Sometimes running chaos monkey, I've found that we lose accounting of region servers. I've taken to a manual process of checking the reported list against a known reference. It occurs to me that ChaosMonkey has a known reference, and it can do this accounting for me. Signed-off-by: Viraj Jasani <vjasani@apache.org>	2020-07-21 15:56:40 -07:00
Nick Dimiduk	89cf76c2cd	HBASE-24658 Update PolicyBasedChaosMonkey to handle uncaught exceptions Running `ServerKillingChaosMonkey` via `RESTApiClusterManager` for any duration of time slowly leaks region servers. I see failures on the RESTApi side go unreported on the ChaosMonkey side. It seems like `RuntimeException`s are being thrown and lost. `PolicyBasedChaosMonkey` uses a primitive means of thread management anyway. Update to use a thread pool, thread groups, and an uncaughtExceptionHandler. Signed-off-by: Bharath Vissapragada <bharathv@apache.org> Signed-off-by: Viraj Jasani <vjasani@apache.org>	2020-07-20 17:00:03 -07:00


569 Commits