Commit Graph

64 Commits

Author SHA1 Message Date
Karan Kumar c4990f56d6
Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
zachjsh 9d4e8053a4
Kinesis adaptive memory management (#15360)
### Description

Our Kinesis consumer works by using the [GetRecords API](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html) in some number of `fetchThreads`, each fetching some number of records (`recordsPerFetch`) and each inserting into a shared buffer that can hold a `recordBufferSize` number of records. The logic is described in our documentation at: https://druid.apache.org/docs/27.0.0/development/extensions-core/kinesis-ingestion/#determine-fetch-settings 

There is a problem with the logic that this pr fixes: the memory limits rely on a hard-coded “estimated record size” that is `10 KB` if `deaggregate: false` and `1 MB` if `deaggregate: true`. There have been cases where a supervisor had `deaggregate: true` set even though it wasn’t needed, leading to under-utilization of memory and poor ingestion performance.

Users don’t always know if their records are aggregated or not. Also, even if they could figure it out, it’s better to not have to. So we’d like to eliminate the `deaggregate` parameter, which means we need to do memory management more adaptively based on the actual record sizes.

We take advantage of the fact that GetRecords doesn’t return more than 10MB (https://docs.aws.amazon.com/streams/latest/dev/service-sizes-and-limits.html ):

This pr: 

eliminates `recordsPerFetch`, always use the max limit of 10000 records (the default limit if not set)

eliminate `deaggregate`, always have it true

cap `fetchThreads` to ensure that if each fetch returns the max (`10MB`) then we don't exceed our budget (`100MB` or `5% of heap`). In practice this means `fetchThreads` will never be more than `10`. Tasks usually don't have that many processors available to them anyway, so in practice I don't think this will change the number of threads for too many deployments

add `recordBufferSizeBytes` as a bytes-based limit rather than records-based limit for the shared queue. We do know the byte size of kinesis records by at this point. Default should be `100MB` or `10% of heap`, whichever is smaller.

add `maxBytesPerPoll` as a bytes-based limit for how much data we poll from shared buffer at a time. Default is `1000000` bytes.

deprecate `recordBufferSize`, use `recordBufferSizeBytes` instead. Warning is logged if `recordBufferSize` is specified

deprecate `maxRecordsPerPoll`, use `maxBytesPerPoll` instead. Warning is logged if maxRecordsPerPoll` is specified

Fixed issue that when the record buffer is full, the fetchRecords logic throws away the rest of the GetRecords result after `recordBufferOfferTimeout` and starts a new shard iterator. This seems excessively churny. Instead,  wait an unbounded amount of time for queue to stop being full. If the queue remains full, we’ll end up right back waiting for it after the restarted fetch.

There was also a call to `newQ::offer` without check in `filterBufferAndResetBackgroundFetch`, which seemed like it could cause data loss. Now checking return value here, and failing if false.

### Release Note

Kinesis ingestion memory tuning config has been greatly simplified, and a more adaptive approach is now taken for the configuration. Here is a summary of the changes made:

eliminates `recordsPerFetch`, always use the max limit of 10000 records (the default limit if not set)

eliminate `deaggregate`, always have it true

cap `fetchThreads` to ensure that if each fetch returns the max (`10MB`) then we don't exceed our budget (`100MB` or `5% of heap`). In practice this means `fetchThreads` will never be more than `10`. Tasks usually don't have that many processors available to them anyway, so in practice I don't think this will change the number of threads for too many deployments

add `recordBufferSizeBytes` as a bytes-based limit rather than records-based limit for the shared queue. We do know the byte size of kinesis records by at this point. Default should be `100MB` or `10% of heap`, whichever is smaller.

add `maxBytesPerPoll` as a bytes-based limit for how much data we poll from shared buffer at a time. Default is `1000000` bytes.

deprecate `recordBufferSize`, use `recordBufferSizeBytes` instead. Warning is logged if `recordBufferSize` is specified

deprecate `maxRecordsPerPoll`, use `maxBytesPerPoll` instead. Warning is logged if maxRecordsPerPoll` is specified
2024-01-19 14:30:21 -05:00
Tom 18d42cae3f
Kafka emitter wasn't given the correct number of threads. It should be 1 thread per scheduled task. (#15719)
This change intelligently provisions the correct number of threads per scheduled task. 1 for each event type, and 1 for logging the lost events.
This is a change to make this work. But in the future it would be worthwhile to make each task not be greedy and share threads so there isn't a need of a thread per task.
2024-01-18 15:27:40 -06:00
Tom 901ebbb744
Allow for kafka emitter producer secrets to be masked in logs (#15485)
* Allow for kafka emitter producer secrets to be masked in logs instead of being visible

This change will allow for kafka producer config values that should be secrets to not show up in the logs.
This will enhance the security of the people who use the kafka emitter to use this if they want to.
This is opt in and will not affect prior configs for this emitter

* fix checkstyle issue

* change property name
2023-12-15 12:21:21 -05:00
sensor c9be1cb4e8
Clean useless InterruptedException warn in ingestion task log (#15519)
* Clean useless InterruptedException warn in ingestion task log

* test coverage for the code change, manually close the scheduler thread to trigger Interrupt signal

---------

Co-authored-by: Qiong Chen <qiong.chen@shopee.com>
2023-12-15 11:18:53 +08:00
Tom 386cdb95e7
Fixes minor typo in kafka emitter (#15416) 2023-11-23 14:19:39 +05:30
Laksh Singla 5f86072456
Prepare master for Druid 29 (#15121)
Prepare master for Druid 29
2023-10-11 10:33:45 +05:30
Tejaswini Bandlamudi 48b6d2abf9
skip org.owasp:dependency-check on extensions-contrib modules and suppress false-positive gRPC CVEs (#15026) 2023-09-25 12:14:42 +05:30
Kashif Faraz 7f26b80e21
Simplify ServiceMetricEvent.Builder (#14933)
Changes:
- Make ServiceMetricEvent.Builder extend ServiceEventBuilder<ServiceMetricEvent>
and thus convert it to a plain builder rather than a builder of builder.
- Add methods setCreatedTime , setMetricAndValue to the builder
2023-09-01 11:30:45 +05:30
Tejaswini Bandlamudi 550a66d71e
Upgrade jackson-databind to 2.12.7 (#14770)
The current version of jackson-databind is flagged for vulnerabilities CVE-2020-28491 (Although cbor format is not used in druid), CVE-2020-36518 (Seems genuine as deeply nested json in can cause resource exhaustion). Updating the dependency to the latest version 2.12.7 to fix these vulnerabilities.
2023-08-09 12:22:16 +05:30
AmatyaAvadhanula 0412f40d36
Prepare master branch for next release, 28.0.0 (#14595)
* Prepare master branch for next release, 28.0.0
2023-07-18 09:22:30 +05:30
Harini Rajendran 4ff6026d30
Adding SegmentMetadataEvent and publishing them via KafkaEmitter (#14281)
In this PR, we are enhancing KafkaEmitter, to emit metadata about published segments (SegmentMetadataEvent) into a Kafka topic. This segment metadata information that gets published into Kafka, can be used by any other downstream services to query Druid intelligently based on the segments published. The segment metadata gets published into kafka topic in json string format similar to other events.
2023-06-02 21:28:26 +05:30
Clint Wylie 1aef72aa7e
Bump up the version in pom to 27.0.0 in preparation of release (#14051) 2023-04-10 14:56:59 +05:30
Clint Wylie 08b5951cc5
merge druid-core, extendedset, and druid-hll into druid-processing to simplify everything (#13698)
* merge druid-core, extendedset, and druid-hll into druid-processing to simplify everything
* fix poms and license stuff
* mockito is evil
* allow reset of JvmUtils RuntimeInfo if tests used static injection to override
2023-02-17 14:27:41 -08:00
Kashif Faraz 7cf761cee4
Prepare master branch for next release, 26.0.0 (#13401)
* Prepare master branch for next release, 26.0.0

* Use docker image for druid 24.0.1

* Fix version in druid-it-cases pom.xml
2022-11-22 15:31:01 +05:30
Didip Kerabat 66545a0f3d
Fix compiler error: The project was not built since its build path is incomplete. Cannot find the class file for org.slf4j.Logger. Fix the build path then try building this project (#13029)
Co-authored-by: Didip Kerabat <didip@apple.com>
2022-09-06 20:49:41 +05:30
Abhishek Agarwal 618757352b
Bump up the version to 25.0.0 (#12975)
* Bump up the version to 25.0.0

* Fix the version in console
2022-08-29 11:27:38 +05:30
Bartosz Mikulski 0bc9f9f303
#12912 Fix KafkaEmitter not emitting queryType for a native query (#12915)
Fixes KafkaEmitter not emitting queryType for a native query. The Event to JSON serialization was extracted to the external class: EventToJsonSerializer. This was done to simplify the testing logic for the serialization as well as extract the responsibility of serialization to the separate class.

The logic builds ObjectNode incrementally based on the event .toMap method. Parsing each entry individually ensures that the Jackson polymorphic annotations are respected. Not respecting these annotation caused the missing of the queryType from output event.
2022-08-24 14:07:00 +05:30
Abhishek Agarwal 2fe053c5cb
Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
Xavier Léauté 4c61878f9c
Reduce use of mocking and simplify some tests (#12283)
* remove use of mocks for ServiceMetricEvent
* simplify KafkaEmitterTests by moving to Mockito
* speed up KafkaEmitterTest by adjusting reporting frequency in tests
* remove unnecessary easymock and JUnitParams dependencies
2022-02-26 17:23:09 -08:00
Clint Wylie fe1d8c206a
bump version to 0.23.0-SNAPSHOT (#11670) 2021-09-08 15:56:04 -07:00
Parag Jain b35486fa81
request logs through kafka emitter (#11036)
* request logs through kafka emitter

* travis fixes

* review comments

* kafka emitter unit test

* new line

* travis checks

* checkstyle fix

* count request lost when request topic is null
2021-04-01 11:31:32 +05:30
Jihoon Son 95065bdf1a
Bump dev version to 0.22.0-SNAPSHOT (#10759) 2021-01-15 13:16:23 -08:00
Jonathan Wei 65c0d64676
Update version to 0.21.0-SNAPSHOT (#10450)
* [maven-release-plugin] prepare release druid-0.21.0

* [maven-release-plugin] prepare for next development iteration

* Update web-console versions
2020-10-03 16:08:34 -07:00
Clint Wylie c86e7ce30b
bump version to 0.20.0-SNAPSHOT (#10124) 2020-07-06 15:08:32 -07:00
Aleksey Plekhanov 2c384b61ff
IntelliJ inspection and checkstyle rule for "Collection.EMPTY_* field accesses replaceable with Collections.empty*()" (#9690)
* IntelliJ inspection and checkstyle rule for "Collection.EMPTY_* field accesses replaceable with Collections.empty*()"

* Reverted checkstyle rule

* Added tests to pass CI

* Codestyle
2020-06-18 09:47:07 -07:00
Xavier Léauté 65280a6953
update kafka client version to 2.5.0 (#9902)
- remove dependency on deprecated internal Kafka classes
- keep LZ4 version in line with the version shipped with Kafka
2020-05-27 13:20:32 -07:00
yuanli 8ccc0b241a
Fix some flaws of KafkaEmitter (#9573)
* fix flaws of KafkaEmitter

* fix flaws of KafkaEmitter

* fix flaws of KafkaEmitter

* Update extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java

Co-Authored-By: Himanshu <g.himanshu@gmail.com>

* Update extensions-contrib/kafka-emitter/src/main/java/org/apache/druid/emitter/kafka/KafkaEmitter.java

Co-Authored-By: Himanshu <g.himanshu@gmail.com>

Co-authored-by: Himanshu <g.himanshu@gmail.com>
2020-04-09 23:31:32 -07:00
Jihoon Son 0da8ffc3ff
Bump up development version to 0.19.0-SNAPSHOT (#9586) 2020-03-30 16:24:04 -07:00
Jonathan Wei 4e8368a5d9 Set version to 0.18.0-SNAPSHOT (#9109) 2020-01-02 17:55:10 -05:00
Jonathan Wei 8af41d7cd0 Update version to 0.18.0-incubating-SNAPSHOT (#9009) 2019-12-11 14:04:03 -08:00
jon-wei dfbc066163 Revert "[maven-release-plugin] prepare release druid-0.16.1-incubating-rc1"
This reverts commit a0f21d9b07.
2019-11-27 23:22:43 -08:00
jon-wei 0402ff85b8 Revert "[maven-release-plugin] prepare for next development iteration"
This reverts commit 8ffa71e7e6.
2019-11-27 23:22:32 -08:00
jon-wei 8ffa71e7e6 [maven-release-plugin] prepare for next development iteration 2019-11-27 23:18:48 -08:00
jon-wei a0f21d9b07 [maven-release-plugin] prepare release druid-0.16.1-incubating-rc1 2019-11-27 23:18:37 -08:00
Chi Cao Minh 5f61374cb3 Fix dependency analyze warnings (#8230)
* Fix dependency analyze warnings

Update the maven dependency plugin to the latest version and fix all
warnings for unused declared and used undeclared dependencies in the
compile scope. Added new travis job to add the check to CI. Also fixed
some source code files to use the correct packages for their imports and
updated druid-forbidden-apis to prevent regressions.

* Address review comments

* Adjust scope for org.glassfish.jaxb:jaxb-runtime

* Fix dependencies for hdfs-storage

* Consolidate netty4 versions
2019-09-09 14:37:21 -07:00
Clint Wylie c73a489335
bump master version to 0.17.0-incubating-SNAPSHOT (#8421) 2019-08-28 01:58:36 -07:00
Benedict Jin 14a4238381 Bump JUnitParams from 1.0.4 to 1.1.1 (#8017) 2019-08-20 16:15:12 -07:00
Chi Cao Minh ab71a2e1e4 Revert "Fix dependency analyze warnings (#8128)" (#8189)
This reverts commit 5dd0d8e873.
2019-07-29 11:42:16 -07:00
Chi Cao Minh 5dd0d8e873 Fix dependency analyze warnings (#8128)
* Fix dependency analyze warnings

Update the maven dependency plugin to the latest version and fix all
warnings for unused declared and used undeclared dependencies in the
compile scope. Added new travis job to add the check to CI. Also fixed
some source code files to use the correct packages for their imports.

* Fix licenses and dependencies

* Fix licenses and dependencies again

* Fix integration test dependency

* Address review comments

* Fix unit test dependencies

* Fix integration test dependency

* Fix integration test dependency again

* Fix integration test dependency third time

* Fix integration test dependency fourth time

* Fix compile error

* Fix assert package
2019-07-26 10:49:03 -07:00
Jihoon Son 7abfbb066a Bump up snapshot version to 0.16.0 (#7802) 2019-05-30 17:17:33 -07:00
Jonathan Wei fafbc4a80e
Set version to 0.15.0-incubating-SNAPSHOT (#7014) 2019-02-07 14:02:52 -08:00
Jonathan Wei 8bc5eaa908
Set version to 0.14.0-incubating-SNAPSHOT (#7003) 2019-02-04 19:36:20 -08:00
Jonathan Wei b18d681551 Use kafka_2.12-0.10.2.2 (#6846) 2019-01-10 20:52:55 -08:00
David Lim afb239b17a add missing license headers, in particular to MD files; clean up RAT … (#6563)
* add missing license headers, in particular to MD files; clean up RAT exclusions

* revert inadvertent doc changes

* docs

* cr changes

* fix modified druid-production.svg
2018-11-13 09:38:37 -08:00
Clint Wylie 84598fba3b combine druid-api, druid-common, java-util into druid-core (#6443)
* combine druid-api, druid-common, java-util

* spacing
2018-10-14 20:37:37 -07:00
David Lim 20ab213ba6 change project versions to 0.13.0-incubating-SNAPSHOT (#6453) 2018-10-11 19:28:01 -07:00
QiuMM 0b8085aff7 Prohibit jackson ObjectMapper#reader methods which are deprecated (#6386)
* Prohibit jackson ObjectMapper#reader methods which are deprecated

* address comments
2018-10-03 17:55:20 -03:00
Roman Leventov 3ae563263a
Renamed 'Generic Column' -> 'Numeric Column'; Fixed a few resource leaks in processing; misc refinements (#5957)
This PR accumulates many refactorings and small improvements that I did while preparing the next change set of https://github.com/druid-io/druid/projects/2. I finally decided to make them a separate PR to minimize the volume of the main PR.

Some of the changes:
 - Renamed confusing "Generic Column" term to "Numeric Column" (what it actually implies) in many class names.
 - Generified `ComplexMetricExtractor`
2018-10-02 14:50:22 -03:00
Gian Merlino 431d3d8497
Rename io.druid to org.apache.druid. (#6266)
* Rename io.druid to org.apache.druid.

* Fix META-INF files and remove some benchmark results.

* MonitorsConfig update for metrics package migration.

* Reorder some dimensions in inner queries for some reason.

* Fix protobuf tests.
2018-08-30 09:56:26 -07:00