Apache HBase [1] is an open-source, distributed, versioned, column-oriented
store modeled after Google's Bigtable, described in "Bigtable: A Distributed
Storage System for Structured Data" by Chang et al. [2]. Just as Bigtable
leverages the distributed data storage provided by the Google File System,
HBase provides Bigtable-like capabilities on top of Apache Hadoop [3].

To get started using HBase, see the full documentation for this release under
the docs/ directory that accompanies this README. Using a browser, open
docs/index.html to view the project home page (or browse to [1]).
The HBase 'book' at http://hbase.apache.org/book.html has a 'quick start'
section and is where you should begin your exploration of the HBase project.

The latest HBase can be downloaded from an Apache Mirror [4].

The source code can be found at [5].

The HBase issue tracker is at [6].

Apache HBase is made available under the Apache License, version 2.0 [7].

The HBase mailing lists and archives are listed here [8].

The HBase distribution includes cryptographic software. See the export control
notice here [9].

1. http://hbase.apache.org
2. http://research.google.com/archive/bigtable.html
3. http://hadoop.apache.org
4. http://www.apache.org/dyn/closer.lua/hbase/
5. https://hbase.apache.org/source-repository.html
6. https://hbase.apache.org/issue-tracking.html
7. http://hbase.apache.org/license.html
8. http://hbase.apache.org/mail-lists.html
9. https://hbase.apache.org/export_control.html