druid/integration-tests/docker
Vishesh Garg 197c54f673
Auto-Compaction using Multi-Stage Query Engine (#16291)
Description:
Compaction operations issued by the Coordinator currently run using the native query engine.
As the majority of the advancements we are making in batch ingestion are in MSQ, it is imperative
that we support compaction on MSQ to make compaction more robust and possibly faster.
For instance, we have seen OOM errors in native compaction that MSQ could have handled through its
auto-calculation of tuning parameters.

This commit enables compaction on MSQ to remove the dependency on the native engine.

Main changes:
* `DataSourceCompactionConfig` now has an additional field `engine` that can be one of
`[native, msq]`, with `native` being the default.
* If `engine` is `msq`, the `CompactSegments` duty assigns all available compaction task slots to the
launched `CompactionTask` to ensure full capacity is available to MSQ. This avoids the stalling that
could happen if only a fraction of the task slots were allotted and they fell short of the number
of tasks required by the MSQ engine to run the compaction.
* `ClientCompactionTaskQuery` has a new field `compactionRunner` with just one `engine` field.
* `CompactionTask` now has a `CompactionRunner` interface instance, with its implementations
`NativeCompactionRunner` and `MSQCompactionRunner` in the `druid-multi-stage-query` extension.
The `ObjectMapper` deserializes `ClientCompactionRunnerInfo` in `ClientCompactionTaskQuery` to the
`CompactionRunner` instance that is mapped to the specified type [`native`, `msq`].
* `CompactionTask` uses the `CompactionRunner` instance it receives to create the indexing tasks.
* The `CompactionTask` to `MSQControllerTask` conversion logic checks whether metrics are present in
the segment schema. If present, the task is created with a native group-by query; if not, the task is
issued with a scan query. The `storeCompactionState` flag is set in the context.
* Each created `MSQControllerTask` is launched in-place and its `TaskStatus` is tracked to determine the
final status of the `CompactionTask`. The id of each of these tasks is the same as that of the `CompactionTask`,
since otherwise the workers would be unable to determine the controller task's location for communication
(as they haven't been launched via the Overlord).
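
The decisions listed above can be sketched in a few lines of plain Java. This is an illustration only, not the code from this commit: the class `CompactionEngineSketch`, its `Engine` enum, and the methods `fromConfig`, `slotsForTask`, and `queryTypeFor` are hypothetical names, and the slot count used for the native engine (a caller-supplied request capped by availability) is an assumption made to give the comparison a baseline.

```java
import java.util.List;
import java.util.Locale;

public class CompactionEngineSketch {

    /** Hypothetical stand-in for the `engine` field; not Druid's actual type. */
    enum Engine {
        NATIVE, MSQ;

        /** Parses the config value; a missing value falls back to `native`, the stated default. */
        static Engine fromConfig(String value) {
            if (value == null) {
                return NATIVE;
            }
            return valueOf(value.toUpperCase(Locale.ROOT));
        }
    }

    /**
     * MSQ receives every available slot so it cannot stall for lack of workers.
     * The native branch (requested slots capped by availability) is an assumption.
     */
    static int slotsForTask(Engine engine, int requestedSlots, int availableSlots) {
        if (engine == Engine.MSQ) {
            return availableSlots;
        }
        return Math.min(requestedSlots, availableSlots);
    }

    /** Metrics in the segment schema lead to a group-by query; otherwise a scan query. */
    static String queryTypeFor(List<String> metrics) {
        return metrics.isEmpty() ? "scan" : "groupBy";
    }

    public static void main(String[] args) {
        Engine engine = Engine.fromConfig("msq");
        System.out.println("engine=" + engine);
        System.out.println("slots=" + slotsForTask(engine, 2, 8));
        System.out.println("query=" + queryTypeFor(List.of("count", "sum_added")));
    }
}
```

The point of the sketch is the asymmetry in `slotsForTask`: a native task can make progress with fewer slots than requested, whereas an MSQ task is handed the whole pool up front.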
2024-07-12 16:40:20 +05:30
environment-configs Auto-Compaction using Multi-Stage Query Engine (#16291) 2024-07-12 16:40:20 +05:30
ldap-configs Add support for authorizing query context params (#12396) 2022-04-21 14:21:16 +05:30
schema-registry add avro + kafka + schema registry integration test (#10929) 2021-03-08 08:12:12 -08:00
service-supervisords Move service specific JVM parameters to the right in tests (#15325) 2023-11-06 15:45:59 +05:30
test-data Followup changes to 15817 (Segment schema publishing and polling) (#16368) 2024-05-03 19:13:52 +05:30
tls Upgrade RSA Key from 1024 bit to 4096 to eliminate warnings (#11743) 2022-01-11 13:24:09 +08:00
Dockerfile upgrade mysql:mysql-connector-java to 8.2.0 (#16024) 2024-05-06 21:58:37 +08:00
base-setup.sh remove ZooKeeper 3.4 support + pass tests with Java 15 (#11073) 2021-05-25 12:49:49 -07:00
docker-compose.base.yml Use Druid's extension loading for integration test instead of maven (#12095) 2022-01-05 23:33:04 -08:00
docker-compose.cds-coordinator-metadata-query-disabled.yml Followup changes to 15817 (Segment schema publishing and polling) (#16368) 2024-05-03 19:13:52 +05:30
docker-compose.cds-task-schema-publish-disabled.yml Fix cds-coordinator-metadata-query-disabled (#16488) 2024-05-22 20:42:11 +02:00
docker-compose.centralized-datasource-schema.yml Fix empty datasource schema on the Broker when metadata query is disabled (#16645) 2024-06-28 11:06:56 +05:30
docker-compose.cli-indexer.yml integration test for coordinator and overlord leadership client (#10680) 2020-12-17 22:50:12 -08:00
docker-compose.druid-hadoop.yml Integration Tests. (#9854) 2020-06-02 09:38:53 -07:00
docker-compose.high-availability.yml Use Druid's extension loading for integration test instead of maven (#12095) 2022-01-05 23:33:04 -08:00
docker-compose.ldap-security.yml Use Druid's extension loading for integration test instead of maven (#12095) 2022-01-05 23:33:04 -08:00
docker-compose.query-error-test.yml Migrate ITs from Travis to GHA (#13681) 2023-02-01 03:31:29 -08:00
docker-compose.query-retry-test.yml Migrate ITs from Travis to GHA (#13681) 2023-02-01 03:31:29 -08:00
docker-compose.schema-registry-indexer.yml add avro + kafka + schema registry integration test (#10929) 2021-03-08 08:12:12 -08:00
docker-compose.schema-registry.yml add avro + kafka + schema registry integration test (#10929) 2021-03-08 08:12:12 -08:00
docker-compose.security.yml integration test for coordinator and overlord leadership client (#10680) 2020-12-17 22:50:12 -08:00
docker-compose.yml Use Druid's extension loading for integration test instead of maven (#12095) 2022-01-05 23:33:04 -08:00
druid.sh Followup changes to 15817 (Segment schema publishing and polling) (#16368) 2024-05-03 19:13:52 +05:30
run-mysql.sh add missing license headers, in particular to MD files; clean up RAT … (#6563) 2018-11-13 09:38:37 -08:00
supervisord.conf Integration tests for JDK 11 (#9249) 2020-02-12 16:36:31 -08:00
wiki-simple-lookup.json refactor lookups to be more chill to router (#7222) 2019-04-05 14:49:41 -07:00