480 Commits

Author SHA1 Message Date
Andrew Purtell
6ad5b9e569
HBASE-25824 IntegrationTestLoadCommonCrawl (#3208)
* HBASE-25824 IntegrationTestLoadCommonCrawl

This integration test loads successful resource retrieval records from
the Common Crawl (https://commoncrawl.org/) public dataset into an HBase
table and writes records that can be used to later verify the presence
and integrity of those records.

Run like:

  ./bin/hbase org.apache.hadoop.hbase.test.IntegrationTestLoadCommonCrawl \
    -Dfs.s3n.awsAccessKeyId=<AWS access key> \
    -Dfs.s3n.awsSecretAccessKey=<AWS secret key> \
    /path/to/test-CC-MAIN-2021-10-warc.paths.gz \
    /path/to/tmp/warc-loader-output

Access to the Common Crawl dataset in S3 is made available to anyone by
Amazon AWS, but Hadoop's S3N filesystem still requires valid access
credentials to initialize.

The input path can either specify a directory or a file. The file may
optionally be compressed with gzip. If a directory, the loader expects
the directory to contain one or more WARC files from the Common Crawl
dataset. If a file, the loader expects a list of Hadoop S3N URIs which
point to S3 locations for one or more WARC files from the Common Crawl
dataset, one URI per line. Lines should be terminated with the UNIX line
terminator.

Included in hbase-it/src/test/resources/CC-MAIN-2021-10-warc.paths.gz
is a list of all WARC files comprising the Q1 2021 crawl archive. There
are 64,000 WARC files in this data set, each containing ~1GB of gzipped
data. The WARC files contain several record types, such as metadata,
request, and response, but we only load the response record types. If
the HBase table schema does not specify compression (by default) there
is roughly a 10x expansion. Loading the full crawl archive results in a
table approximately 640 TB in size.

The hadoop-aws jar will be needed at runtime to instantiate the S3N
filesystem. Use the -files ToolRunner argument to add it.

You can also split the Loader and Verify stages:

Load with:

  ./bin/hbase 'org.apache.hadoop.hbase.test.IntegrationTestLoadCommonCrawl$Loader' \
    -files /path/to/hadoop-aws.jar \
    -Dfs.s3n.awsAccessKeyId=<AWS access key> \
    -Dfs.s3n.awsSecretAccessKey=<AWS secret key> \
    /path/to/test-CC-MAIN-2021-10-warc.paths.gz \
    /path/to/tmp/warc-loader-output

Verify with:

  ./bin/hbase 'org.apache.hadoop.hbase.test.IntegrationTestLoadCommonCrawl$Verify' \
    /path/to/tmp/warc-loader-output

Signed-off-by: Michael Stack <stack@apache.org>
2021-05-03 17:59:00 -07:00
Duo Zhang
f6ff519dd0 HBASE-25591 Upgrade opentelemetry to 0.17.1 (#2971)
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2021-04-25 09:23:23 +08:00
Duo Zhang
302d9ea8b8 HBASE-25373 Remove HTrace completely in code base and try to make use of OpenTelemetry
Signed-off-by: stack <stack@apache.org>
2021-04-25 09:23:23 +08:00
niuyulin
e8ac1fbe97
HBASE-25777 Fix wrong initialization value in StressAssignmentManagerMonkeyFactory (#3164)
Signed-off-by: meiyi <myimeiyi@gmail.com>
2021-04-19 17:46:57 +08:00
Pankaj
48d9d196dc
HBASE-25502 IntegrationTestMTTR fails with TableNotFoundException (#2879) 2021-01-13 11:01:26 -08:00
Mate Szalay-Beko
481662ab39
HBASE-25318 Config option for IntegrationTestImportTsv where to generate HFiles to bulkload (#2777)
IntegrationTestImportTsv is generating HFiles under the working directory of the
current hdfs user executing the tool, before bulkloading it into HBase.

Assuming you encrypt the HBase root directory within HDFS (using HDFS
Transparent Encryption), you can bulkload HFiles only if they sit in the same
encryption zone in HDFS as the HBase root directory itself.

When IntegrationTestImportTsv is executed against a real distributed cluster
and the working directory of the current user (e.g. /user/hbase) is not in the
same encryption zone as the HBase root directory (e.g. /hbase/data) then you
will get an exception:

```
ERROR org.apache.hadoop.hbase.regionserver.HRegion: There was a partial failure
due to IO when attempting to load d :
hdfs://mycluster/user/hbase/test-data/22d8460d-04cc-e032-88ca-2cc20a7dd01c/
IntegrationTestImportTsv/hfiles/d/74655e3f8da142cb94bc31b64f0475cc

org.apache.hadoop.ipc.RemoteException(java.io.IOException):
/user/hbase/test-data/22d8460d-04cc-e032-88ca-2cc20a7dd01c/
IntegrationTestImportTsv/hfiles/d/74655e3f8da142cb94bc31b64f0475cc
can't be moved into an encryption zone.
```

In this commit I make it configurable where the IntegrationTestImportTsv
generates the HFiles.

Co-authored-by: Mate Szalay-Beko <symat@apache.com>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
2021-01-05 09:24:24 +01:00
Lokesh Khurana
f8bd22827a
HBASE-24620 : Add a ClusterManager which submits command to ZooKeeper and its Agent which picks and execute those Commands (#2299)
Signed-off-by: Aman Poonia <apoonia@salesforce.com>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2020-12-21 15:33:36 +05:30
Duo Zhang
92c3bcd9fb
HBASE-25164 Make ModifyTableProcedure support changing meta replica count (#2513)
Signed-off-by: Michael Stack <stack@apache.org>
2020-10-13 09:43:56 +08:00
Duo Zhang
1bb19e0cdd
HBASE-25037 Lots of thread pool are changed to non daemon after HBASE-24750 which causes trouble when shutting down (#2407)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2020-09-16 21:11:47 +08:00
Joseph295
a589e55a7b
HBASE-24992 log after Generator success when running ITBLL (#2358)
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2020-09-08 11:46:59 +08:00
Duo Zhang
57e49b3959
HBASE-23834 HBase fails to run on Hadoop 3.3.0/3.2.2/3.1.4 due to jetty version mismatch (#2222)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Josh Elser <elserj@apache.org>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
2020-08-25 12:05:52 +08:00
Viraj Jasani
ea130249ae
HBASE-24750 : Adding default UncaughtExceptionHandler for Thread factories (ADDENDUM)
Closes #2231
2020-08-11 17:18:47 +05:30
Viraj Jasani
0b604d921a
HBASE-24750 : All ExecutorService should use guava ThreadFactoryBuilder
Closes #2196

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Signed-off-by: Ted Yu <tyu@apache.org>
2020-08-07 20:24:36 +05:30
Duo Zhang
d2f5a5f27b
HBAE-24507 Remove HTableDescriptor and HColumnDescriptor (#2186)
Signed-off-by: stack <stack@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: tedyu <yuzhihong@gmail.com>
2020-08-04 10:31:42 +08:00
Nick Dimiduk
a6e3db5ba5 HBASE-24662 Update DumpClusterStatusAction to notice changes in region server count
Sometimes running chaos monkey, I've found that we lose accounting of
region servers. I've taken to a manual process of checking the
reported list against a known reference. It occurs to me that
ChaosMonkey has a known reference, and it can do this accounting for
me.

Signed-off-by: Viraj Jasani <vjasani@apache.org>
2020-07-21 15:56:22 -07:00
Nick Dimiduk
7ebc617026 HBASE-24658 Update PolicyBasedChaosMonkey to handle uncaught exceptions
Running `ServerKillingChaosMonkey` via `RESTApiClusterManager` for any
duration of time slowly leaks region servers. I see failures on the
RESTApi side go unreported on the ChaosMonkey side. It seems like
`RuntimeException`s are being thrown and lost.

`PolicyBasedChaosMonkey` uses a primitive means of thread management
anyway. Update to use a thread pool, thread groups, and an
uncaughtExceptionHandler.

Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2020-07-20 16:57:11 -07:00
Duo Zhang
16a25b74db
HBASE-24635 Split TestMetaWithReplicas (#1980)
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2020-06-27 10:36:07 +08:00
Sandeep Pal
f22c7a6583
HBASE-23126: Removing the un-used integration test class - IntegrationTestRSGroup
Closes #1936

Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2020-06-20 22:44:03 +05:30
Duo Zhang
c91829bb41
HBASE-24491 Remove HRegionInfo (#1830)
Signed-off-by: Guanghao Zhang <zghao@apache.org>
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
2020-06-05 22:19:01 +08:00
Nick Dimiduk
5fb9e518ef HBASE-24361 Make RESTApiClusterManager more resilient (#1701)
* sometimes API calls return with null/empty response bodies. thus,
  wrap all API calls in a retry loop.
* calls that submit work in the form of "commands" now retrieve the
  commandId from successful command submission, and track completion
  of that command before returning control to calling context.
* model CM's process state and use that model to guide state
  transitions more intelligently. this guards against, for example,
  the start command failing with an error message like "Role must be
  stopped".
* improvements to logging levels, avoid spamming logs with the
  side-effects of retries at this and higher contexts.
* include references to API documentation, such as it is.

Signed-off-by: stack <stack@apache.org>
2020-05-19 09:53:22 -07:00
Nick Dimiduk
0dae377f53 HBASE-24360 RollingBatchRestartRsAction loses track of dead servers
`RollingBatchRestartRsAction` doesn't handle failure cases when
tracking its list of dead servers. The original author believed that a
failure to restart would result in a retry. However, by removing the
dead server from the failed list, that state is lost, and retry never
occurs. Because this action doesn't ever look back to the current
state of the cluster, relying only on its local state for the current
action invocation, it never realizes the abandoned server is still
dead. Instead, be more careful to only remove the dead server from the
list when the `startRs` invocation claims to have been successful.

Signed-off-by: stack <stack@apache.org>
2020-05-18 13:01:11 -07:00
meiyi
a73132c62b
HBASE-24364 [Chaos Monkey] Invalid data block encoding in ChangeEncodingAction (#1707)
Signed-off-by: Jan Hentschel <janh@apache.org>
2020-05-15 11:30:44 +08:00
Nick Dimiduk
0e81ab08b9 HBASE-24295 [Chaos Monkey] abstract logging through the class hierarchy ; ADDENDUM
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
2020-05-07 13:20:15 -07:00
Nick Dimiduk
204a1fad92 HBASE-24295 [Chaos Monkey] abstract logging through the class hierarchy
Adds `protected abstract Logger getLogger()` to `Action` so that
implementation's names are logged when actions are performed.

Signed-off-by: stack <stack@apache.org>
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
2020-05-04 11:53:24 -07:00
Nick Dimiduk
e37aafcfc2 HBASE-24260 Add a ClusterManager that issues commands via coprocessor
Implements `ClusterManager` that relies on the new
`ShellExecEndpointCoprocessor` for remote shell command execution.

Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
2020-05-04 10:53:02 -07:00
Nick Dimiduk
a97395d4c0 HBASE-24274 RESTApiClusterManager attempts to deserialize response using serialization API
Use the correct GSON API for deserializing service responses. Add
simple unit test covering a very limited selection of the overall API
surface area, just enough to ensure deserialization works.

Signed-off-by: stack <stack@apache.org>
2020-04-29 13:13:03 -07:00
Duo Zhang
9f52e6b725
HBASE-24249 Move code in FSHDFSUtils to FSUtils and mark related clas… (#1586)
Signed-off-by: stack <stack@apache.org>
2020-04-29 10:44:34 +08:00
Jan Hentschel
75c717d4c2
HBASE-23848 Removed deprecated setStopRow from Scan (#1184)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2020-04-22 15:15:17 +08:00
Jan Hentschel
fded2b9ddc
HBASE-23846 Removed deprecated setMaxVersions(int) from Scan
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2020-03-31 10:18:27 +02:00
Duo Zhang
eface74407
HBASE-23799 Make our core coprocessors use shaded protobuf (#1280)
Signed-off-by: stack <stack@apache.org>
2020-03-23 08:47:02 +08:00
Jan Hentschel
1b163d98b9
HBASE-23847 Removed deprecated setStartRow from Scan (#1220)
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2020-03-13 08:43:17 +08:00
BukrosSzabolcs
4cb60327be
HBASE-23891: Add an option to Actions to filter out meta RS (#1217)
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
2020-03-06 10:44:39 +00:00
Viraj Jasani
c0301e3fdf
HBASE-23868 : Replace usages of HColumnDescriptor(byte [] familyName)… (#1222)
Signed-off-by: Jan Hentschel <janh@apache.org>
2020-03-02 17:20:03 +05:30
Duo Zhang
7f2d823164 HBASE-23818 Cleanup the remaining RSGroupInfo.getTables call in the code base (#1152)
Signed-off-by: stack <stack@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2020-03-02 15:43:40 +08:00
Duo Zhang
7386369fec HBASE-23276 Add admin methods to get tables within a group (#1118)
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2020-03-02 15:43:40 +08:00
Duo Zhang
c8d892cdef HBASE-23253 Rewrite rsgroup related UTs with the new methods introduced in HBASE-22932 (#813)
Signed-off-by: Guanghao Zhang <zghao@apache.org>
2020-03-02 15:43:40 +08:00
linkaline
72cbb129a4 HBASE-22932 Add rs group management methods in Admin and AsyncAdmin (#657)
Signed-off-by: Duo Zhang <zhangduo@apache.org>
2020-03-02 15:43:40 +08:00
Jan Hentschel
9f223c2236
HBASE-23781 Removed deprecated createTableDescriptor(String) from HBaseTestingUtility
Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Signed-off-by: stack <stack@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
2020-02-25 21:10:09 +01:00
Vladimir Rodionov
b8194b4902 HBASE-22749 Distributed MOB compactions
- MOB compaction is now handled in-line with per-region compaction on region
  servers
- regions with mob data store per-hfile metadata about which mob hfiles are
  referenced
- admin requested major compaction will also rewrite MOB files; periodic RS
  initiated major compaction will not
- periodically a chore in the master will initiate a major compaction that
  will rewrite MOB values to ensure it happens. controlled by
  'hbase.mob.compaction.chore.period'. default is weekly
- control how many RS the chore requests major compaction on in parallel
  with 'hbase.mob.major.compaction.region.batch.size'. default is as
  parallel as possible.
- periodic chore in master will scan backing hfiles from regions to get the
  set of referenced mob hfiles and archive those that are no longer
  referenced. control period with 'hbase.master.mob.cleaner.period'
- Optionally, RS that are compacting mob files can limit write
  amplification by not rewriting values from mob hfiles over a certain size
  limit. opt-in by setting 'hbase.mob.compaction.type' to 'optimized'.
  control threshold by 'hbase.mob.compactions.max.file.size'.
  default is 1GiB
- Should smoothly integrate with existing MOB users via rolling upgrade.
  will delay old MOB file cleanup until per-region compaction has managed
  to compact each region at least once so that used mob hfile metadata can
  be gathered.
2020-02-19 16:06:38 -06:00
Viraj Jasani
2e0edacf72
HBASE-23662 : Replace HColumnDescriptor(String cf) with ColumnFamilyDescriptor
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
Signed-off-by: Jan Hentschel <janh@apache.org>
2020-01-10 20:42:21 -08:00
BukrosSzabolcs
413d4b2d0f HBASE-23566: Fix package/packet terminology problem in chaos monkeys (#933)
s/package/packet/g

Signed-off-by: Sean Busbey <busbey@apache.org>
2019-12-12 16:32:26 -06:00
Nick Dimiduk
a580b1d2e9 HBASE-23552 Format Javadocs on ITBLL
We have this nice description in the java doc on ITBLL but it's
unformatted and thus illegible. Add some formatting so that it can be
read by humans.

Signed-off-by: Jan Hentschel <janh@apache.org>
Signed-off-by: Josh Elser <elserj@apache.org>
2019-12-10 10:37:28 -08:00
BukrosSzabolcs
d69ecf6092 HBASE-23352: Allow chaos monkeys to access cmd line params, and improve FillDiskCommandAction (#885)
Instead of using the default properties when checking for monkey
properties, now we use the ones already extended with command line
params.
Change FillDiskCommandAction to try to stop the remote process if the
command failed with an exception.

Signed-off-by: stack <stack@apache.org>
2019-12-02 10:29:06 +08:00
Peter Somogyi
b1df7df0e0 HBASE-23085 Network and Data related Actions; ADDENDUM (#871)
Fix percentage in String.format

Signed-off-by: Sean Busbey <busbey@apache.org>
2019-11-22 22:08:47 -06:00
ravowlga123
08aae42156 HBASE-18439 Subclasses of o.a.h.h.chaos.actions.Action all use the same logger
Signed-off-by: Jan Hentschel <jan.hentschel@ultratendency.com>
Signed-off-by: Guangxu Cheng <gxcheng@apache.org>
2019-11-08 18:56:21 +01:00
BukrosSzabolcs
d2142a8ebb HBASE-23085 Network and Data related Actions
Add monkey actions:
- manipulate network packages with tc (reorder, loose,...)
- add CPU load
- fill the disk
- corrupt or delete regionserver data files

Extend HBaseClusterManager to allow sudo calls.

Signed-off-by: Josh Elser <elserj@apache.org>
Signed-off-by: Balazs Meszaros <meszibalu@apache.org>
2019-11-06 14:43:00 +01:00
meiyi
946f1e9e25
HBASE-23170 Admin#getRegionServers use ClusterMetrics.Option.SERVERS_NAME (#721) 2019-10-18 09:35:35 +08:00
BukrosSzabolcs
f0dddd1cc2 HBASE-22982: region server suspend/resume and graceful rolling restart actions (#592)
* Add chaos monkey action for suspend/resume region servers
* Add chaos monkey action for graceful rolling restart
* Add these to relevant chaos monkeys

Signed-off-by: Balazs Meszaros <meszibalu@apache.org>
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
2019-09-26 10:07:38 +02:00
linkaline
3e2cfc1140 HBASE-22883 Duplacate codes of method Threads.newDaemonThreadFactory() and class DaemonThreadFactory (#537)
Signed-off-by: stack <stack@apache.org>
2019-08-27 15:27:22 -07:00
Guanghao
64e732dc8b
HBASE-22624 Should sanity check table configuration when clone snapshot to a new table 2019-07-03 11:54:25 +08:00