# Kinesis adaptive memory management (#15360)
### Description

Our Kinesis consumer works by using the [GetRecords API](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html) in some number of `fetchThreads`, each fetching some number of records (`recordsPerFetch`) and each inserting into a shared buffer that can hold a `recordBufferSize` number of records. The logic is described in our documentation at: https://druid.apache.org/docs/27.0.0/development/extensions-core/kinesis-ingestion/#determine-fetch-settings 
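
For context, here is a minimal sketch of that fetch model. The class, field, and method names are illustrative placeholders, not Druid's actual implementation:

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch only: names and structure are assumptions, not the real Druid classes.
class KinesisFetchSketch
{
  // Shared buffer sized in records, as with the old recordBufferSize setting.
  private final BlockingQueue<String> recordBuffer;
  private final ExecutorService fetchExecutor;

  KinesisFetchSketch(int fetchThreads, int recordBufferSize)
  {
    this.recordBuffer = new ArrayBlockingQueue<>(recordBufferSize);
    this.fetchExecutor = Executors.newFixedThreadPool(fetchThreads);
  }

  void startFetching(List<String> shardIterators, int recordsPerFetch)
  {
    for (String shardIterator : shardIterators) {
      fetchExecutor.submit(() -> {
        // Each fetch thread calls GetRecords and pushes the results into the shared buffer.
        List<String> records = getRecords(shardIterator, recordsPerFetch);
        for (String record : records) {
          recordBuffer.offer(record); // overflow handling omitted in this sketch
        }
      });
    }
  }

  private List<String> getRecords(String shardIterator, int limit)
  {
    return List.of(); // placeholder; the real consumer calls the AWS GetRecords API here
  }
}
```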

There is a problem with the logic that this PR fixes: the memory limits rely on a hard-coded “estimated record size” that is `10 KB` if `deaggregate: false` and `1 MB` if `deaggregate: true`. There have been cases where a supervisor had `deaggregate: true` set even though it wasn’t needed, leading to under-utilization of memory and poor ingestion performance.

Users don’t always know whether their records are aggregated. Also, even if they could figure it out, it’s better not to have to. So we’d like to eliminate the `deaggregate` parameter, which means we need to do memory management more adaptively based on the actual record sizes.

We take advantage of the fact that GetRecords doesn’t return more than 10MB per call (https://docs.aws.amazon.com/streams/latest/dev/service-sizes-and-limits.html).

This PR:

- Eliminates `recordsPerFetch`; we always use the maximum limit of 10,000 records (the default when the limit is not set).
- Eliminates `deaggregate`; deaggregation is now always enabled.
- Caps `fetchThreads` to ensure that, if each fetch returns the maximum (`10MB`), we don't exceed our budget (`100MB` or `5% of heap`). In practice this means `fetchThreads` will never be more than `10`. Tasks usually don't have that many processors available to them anyway, so in practice I don't think this will change the number of threads for many deployments. (See the sketch after this list for the arithmetic.)
- Adds `recordBufferSizeBytes` as a bytes-based limit, rather than a records-based limit, for the shared queue. We do know the byte size of Kinesis records by this point. The default is `100MB` or `10% of heap`, whichever is smaller.
- Adds `maxBytesPerPoll` as a bytes-based limit on how much data we poll from the shared buffer at a time. The default is `1000000` bytes.
- Deprecates `recordBufferSize`; use `recordBufferSizeBytes` instead. A warning is logged if `recordBufferSize` is specified.
- Deprecates `maxRecordsPerPoll`; use `maxBytesPerPoll` instead. A warning is logged if `maxRecordsPerPoll` is specified.
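
As a rough sketch of the arithmetic behind those limits (the constants mirror the numbers above; the method names are illustrative, and I'm reading "or" as whichever value is smaller):

```java
// Illustrative arithmetic for the new defaults; these are not Druid's actual method names.
class KinesisMemoryDefaultsSketch
{
  private static final long MAX_BYTES_PER_FETCH = 10_000_000L; // GetRecords returns at most ~10MB per call

  // Cap fetchThreads so that fetchThreads * 10MB stays within min(100MB, 5% of heap).
  static int capFetchThreads(int requestedFetchThreads, long heapBytes)
  {
    long fetchBudget = Math.min(100_000_000L, heapBytes / 20); // 100MB or 5% of heap
    int cap = (int) Math.max(1, fetchBudget / MAX_BYTES_PER_FETCH);
    return Math.min(requestedFetchThreads, cap);               // never more than 10 in practice
  }

  // Default recordBufferSizeBytes: 100MB or 10% of heap, whichever is smaller.
  static long defaultRecordBufferSizeBytes(long heapBytes)
  {
    return Math.min(100_000_000L, heapBytes / 10);
  }

  // Default maxBytesPerPoll.
  static long defaultMaxBytesPerPoll()
  {
    return 1_000_000L;
  }
}
```

Under those assumptions, a task with a 1GB heap gets a 50MB fetch budget and therefore at most 5 fetch threads, while a task with a 2GB heap or more hits the 100MB cap and tops out at 10.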

Fixed an issue where, when the record buffer is full, the fetchRecords logic throws away the rest of the GetRecords result after `recordBufferOfferTimeout` and starts a new shard iterator. This seems excessively churny. Instead, we now wait an unbounded amount of time for the queue to stop being full. If the queue remains full, we’d end up right back waiting for it after the restarted fetch anyway.
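
A minimal sketch of that change in enqueue behavior, using `java.util.concurrent.BlockingQueue` as a stand-in for the actual buffer type:

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

class EnqueueSketch
{
  // Old behavior (roughly): give up after a timeout and discard the rest of the GetRecords result.
  static <T> boolean offerWithTimeout(BlockingQueue<T> buffer, T record, long timeoutMillis)
      throws InterruptedException
  {
    return buffer.offer(record, timeoutMillis, TimeUnit.MILLISECONDS);
  }

  // New behavior (roughly): block until space frees up, so nothing from the fetched batch is thrown away.
  static <T> void putBlocking(BlockingQueue<T> buffer, List<T> records) throws InterruptedException
  {
    for (T record : records) {
      buffer.put(record);
    }
  }
}
```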

There was also a call to `newQ::offer` in `filterBufferAndResetBackgroundFetch` whose return value was not checked, which looked like it could cause data loss. We now check the return value and fail if it is false.
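
Sketched in the same spirit (the exception type is a placeholder for whatever error the task actually raises):

```java
import java.util.concurrent.BlockingQueue;

class CheckedOfferSketch
{
  // Illustrative only: fail loudly instead of silently dropping a record when the queue rejects it.
  static <T> void offerOrFail(BlockingQueue<T> newQ, T record)
  {
    if (!newQ.offer(record)) {
      throw new IllegalStateException("Unable to add record to buffer");
    }
  }
}
```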

### Release Note

Kinesis ingestion memory tuning configuration has been greatly simplified, and a more adaptive approach is now taken. Here is a summary of the changes:

- `recordsPerFetch` is eliminated; the maximum limit of 10,000 records (the default when the limit is not set) is always used.
- `deaggregate` is eliminated; deaggregation is now always enabled.
- `fetchThreads` is capped to ensure that, if each fetch returns the maximum (`10MB`), the budget (`100MB` or `5% of heap`) is not exceeded. In practice this means `fetchThreads` will never be more than `10`. Tasks usually don't have that many processors available to them anyway, so this is unlikely to change the number of threads for most deployments.
- `recordBufferSizeBytes` is added as a bytes-based limit, rather than a records-based limit, for the shared queue. The default is `100MB` or `10% of heap`, whichever is smaller.
- `maxBytesPerPoll` is added as a bytes-based limit on how much data is polled from the shared buffer at a time. The default is `1000000` bytes.
- `recordBufferSize` is deprecated; use `recordBufferSizeBytes` instead. A warning is logged if `recordBufferSize` is specified.
- `maxRecordsPerPoll` is deprecated; use `maxBytesPerPoll` instead. A warning is logged if `maxRecordsPerPoll` is specified.


# Druid doc builder

This website was created with Docusaurus.

To view the documentation, run:

```
npm install
```

Then run:

```
npm start
```

The current version of the website appears in your browser. Edit pages with your favorite editor. Refresh the web page after each edit to review your changes.

## Dependencies

- Node.js. Use the version Docusaurus specifies, not a newer one. (For example, if 12.x is requested, don't install 16.x.) Docusaurus may require a version newer than the one available in your Linux package repository, but older than the latest version. See this page to find the version required by Docusaurus.
- The Yarn dependency from Docusaurus is optional. (This Yarn is not the Hadoop resource manager; it is a package manager for Node.js.)
- Docusaurus, installed automatically as part of the above npm commands.

## Variables

Documentation pages can refer to a number of special variables using the {{var}} syntax:

- `DRUIDVERSION`: the version of Druid in which the page appears. Allows creating links to files of the same version on GitHub.

The variables are not replaced when running the web site locally using the start command above. They're replaced as part of the process that copies the docs to apache/druid-website-src/.

## Spellcheck

Please run a spellcheck before issuing a pull request to avoid a build failure due to spelling issues. Run:

```
npm run link-lint
npm run spellcheck
```

If you introduce new (correctly spelled) project names or technical terms, add them to the dictionary in the .spelling file in this directory. Also, terms enclosed in backticks are not spell checked. Example: `symbolName`