druid/integration-tests/docker/environment-configs
Vishesh Garg 197c54f673
Auto-Compaction using Multi-Stage Query Engine (#16291)
Description:
Compaction operations issued by the Coordinator currently run using the native query engine.
Since most of the advancements we are making in batch ingestion are in MSQ, it is imperative
that we support compaction on MSQ to make compaction more robust and possibly faster.
For instance, we have seen OOM errors in native compaction that MSQ could have handled through its
auto-calculation of tuning parameters.

This commit enables compaction on MSQ to remove the dependency on native engine. 

Main changes:
* `DataSourceCompactionConfig` now has an additional field `engine` that can be one of 
`[native, msq]` with `native` being the default.
* If the engine is MSQ, the `CompactSegments` duty assigns all available compaction task slots to the
launched `CompactionTask` to ensure full capacity is available to MSQ. This avoids the stall that
could occur if only a fraction of the slots were allotted and they fell short of the number
of tasks the MSQ engine requires to run the compaction.
* `ClientCompactionTaskQuery` has a new field `compactionRunner` with just one `engine` field.
* `CompactionTask` now has a `CompactionRunner` interface instance, with its implementations
`NativeCompactionRunner` and `MSQCompactionRunner` in the `druid-multi-stage-query` extension.
The `ObjectMapper` deserializes `ClientCompactionRunnerInfo` in `ClientCompactionTaskQuery` to the
`CompactionRunner` instance that is mapped to the specified type [`native`, `msq`].
* `CompactionTask` uses the `CompactionRunner` instance it receives to create the indexing tasks.
* The `CompactionTask`-to-`MSQControllerTask` conversion logic checks whether metrics are present in
the segment schema. If they are, the task is created with a native group-by query; if not, the task is
issued with a scan query. The `storeCompactionState` flag is set in the context.
* Each created `MSQControllerTask` is launched in-place and its `TaskStatus` is tracked to determine the
final status of the `CompactionTask`. The id of each of these tasks is the same as that of the `CompactionTask`
since otherwise the workers would be unable to determine the controller task's location for communication
(as the controller tasks haven't been launched via the overlord).
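
The slot-assignment and query-type decisions described in the list above can be sketched as follows. This is a minimal illustration, not the code from the commit: the class `CompactionEngineSketch`, its enums, and both method names are hypothetical, and the native-engine branch (capping at a requested slot count) is an assumption made only to contrast with the MSQ branch.

```java
import java.util.List;

// Hypothetical sketch; none of these names come from the Druid codebase.
public class CompactionEngineSketch {
    public enum Engine { NATIVE, MSQ }
    public enum QueryType { GROUP_BY, SCAN }

    // MSQ gets every available compaction slot so it cannot stall waiting for
    // worker tasks. The native branch is an assumption for illustration only.
    public static int slotsToAssign(Engine engine, int availableSlots, int nativeSlotsRequested) {
        if (engine == Engine.MSQ) {
            return availableSlots;
        }
        return Math.min(availableSlots, nativeSlotsRequested);
    }

    // Metrics in the segment schema imply a group-by query; otherwise a scan query.
    public static QueryType queryTypeFor(List<String> metricNames) {
        boolean hasMetrics = metricNames != null && !metricNames.isEmpty();
        return hasMetrics ? QueryType.GROUP_BY : QueryType.SCAN;
    }

    public static void main(String[] args) {
        System.out.println(slotsToAssign(Engine.MSQ, 8, 2));
        System.out.println(slotsToAssign(Engine.NATIVE, 8, 2));
        System.out.println(queryTypeFor(List.of("added_sum")));
        System.out.println(queryTypeFor(List.of()));
    }
}
```

With eight free slots, the MSQ branch returns all eight, while the assumed native branch returns only the two requested.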
2024-07-12 16:40:20 +05:30
override-examples Removes support for Hadoop 2 (#14763) 2023-08-09 17:47:52 +05:30
test-groups Fix cds-coordinator-metadata-query-disabled (#16488) 2024-05-22 20:42:11 +02:00
broker integration test for coordinator and overlord leadership client (#10680) 2020-12-17 22:50:12 -08:00
common Auto-Compaction using Multi-Stage Query Engine (#16291) 2024-07-12 16:40:20 +05:30
common-ldap upgrade mysql:mysql-connector-java to 8.2.0 (#16024) 2024-05-06 21:58:37 +08:00
coordinator update heap size of coordinator, overlord services in docker IT environment (#14214) 2023-05-12 23:19:48 +05:30
empty-config Use Druid's extension loading for integration test instead of maven (#12095) 2022-01-05 23:33:04 -08:00
historical Increase historical heap for standard IT (#15337) 2023-11-08 15:21:30 +05:30
historical-for-query-error-test More unit tests for JsonParserIterator; Integration tests for query errors (#11091) 2021-04-12 15:08:50 -07:00
indexer Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead (#10740) 2021-01-27 00:34:56 -08:00
middlemanager increase middlemanager heap server size in tests (#14345) 2023-05-29 10:45:34 +05:30
overlord update heap size of coordinator, overlord services in docker IT environment (#14214) 2023-05-12 23:19:48 +05:30
router Increase heap size for router (#14699) 2023-08-01 08:58:48 +05:30
router-custom-check-tls Fix unit tests and GC settings for Java 15 (#11074) 2021-04-08 10:33:37 -07:00
router-no-client-auth-tls Fix unit tests and GC settings for Java 15 (#11074) 2021-04-08 10:33:37 -07:00
router-permissive-tls Fix unit tests and GC settings for Java 15 (#11074) 2021-04-08 10:33:37 -07:00