Commit Graph

632 Commits

Author SHA1 Message Date
Abhishek Radhakrishnan 9a16d4e219
Move SqlTaskStatus and SqlTaskStausTest from msq module to sql module. (#17380)
- This is a non-functional change that moves SqlTaskStatus and its unit test SqlTaskStatusTest from the msq module to the sql module to help class reuse in other places.
- This refactor is extracted from this PR to facilitate easier review.
- Fix a minor spacing issue in the TaskStartTimeoutFault error message.
2024-10-18 14:39:01 -07:00
Vishesh Garg f576e299db
Allow MSQ engine only for compaction supervisors (#17033)
#16768 added the functionality to run compaction as a supervisor on the overlord.
This patch builds on top of that to restrict MSQ engine to compaction in the supervisor-mode only.
With these changes, users can no longer add MSQ engine as part of datasource compaction config,
or as the default cluster-level compaction engine, on the Coordinator. 

The patch also adds an Overlord runtime property `druid.supervisor.compaction.engine=<msq/native>`
to specify the default engine for compaction supervisors.

Since these updates require major changes to existing MSQ compaction integration tests,
this patch disables MSQ-specific compaction integration tests -- they will be taken up in a follow-up PR.

Key changed/added classes in this patch:
* CompactionSupervisor
* CompactionSupervisorSpec
* CoordinatorCompactionConfigsResource
* OverlordCompactionScheduler
2024-09-24 17:19:16 +05:30
Rishabh Singh dd8c7de144
Reduce the number of ITs in CDS groups (#17059)
* Reduce the number of ITs in CDS groups

The two CDS IT groups cds-task-schema-publish-disabled & cds-coordinator-metadata-query-disabled
runs a lot of ITs serially, leading to increased CI runtime.

Reducing the number of ITs running in the two groups to,
ITAppendBatchIndexTest (append-ingestion)
ITIndexerTest (batch-index)
ITOverwriteBatchIndexTest (batch-index)
ITCompactionSparseColumnIndexTest (compaction)
ITCompactionTaskTest (compaction)
ITKafkaIndexingServiceDataFormatTest (kafka-data-format)

Both these groups will be removed once CDS is enabled by default.
2024-09-17 17:28:03 -07:00
Gian Merlino d789723331
Remove some unnecessary JoinableFactoryWrappers. (#17051)
* Remove some unnecessary JoinableFactoryWrappers.

* Remove unused import.
2024-09-16 11:32:12 +05:30
Abhishek Agarwal 78775ad398
Prepare master for 32.0.0 release (#17022) 2024-09-10 11:01:20 +05:30
Vishesh Garg 37d4174245
Compute `range` partitionsSpec using effective `maxRowsPerSegment` (#16987)
In the compaction config, a range type partitionsSpec supports setting one of maxRowsPerSegment and targetRowsPerSegment. When compaction is run with the native engine, while maxRowsPerSegment = x results in segments of size x, targetRowsPerSegment = y results in segments of size 1.5 * y.

MSQ only supports rowsPerSegment = x as part of its tuning config, the resulting segment size being approx. x -- which is in line with maxRowsPerSegment behaviour in native compaction.

This PR makes the following changes:

use effective maxRowsPerSegment to pass as rowsPerSegment parameter for MSQ
persist rowsPerSegment as maxRowsPerSegment in lastCompactionState for MSQ
Use effective maxRowsPerSegment-based range spec in CompactionStatus check for both Native and MSQ.
2024-09-09 10:53:58 +05:30
Rishabh Singh 40f38f0191
Remove migrated deep storage standard ITs (#16933) 2024-09-05 16:07:33 +05:30
AmatyaAvadhanula bfbd21bce0
Revert "Add integration tests for concurrent append and replace (#16755)" (#17000)
This reverts commit 70bad948e3.
2024-09-04 23:36:49 +05:30
Vishesh Garg e28424ea25
Enable rollup on multi-value dimensions for compaction with MSQ engine (#16937)
Currently compaction with MSQ engine doesn't work for rollup on multi-value dimensions (MVDs), the reason being the default behaviour of grouping on MVD dimensions to unnest the dimension values; for instance grouping on `[s1,s2]` with aggregate `a` will result in two rows: `<s1,a>` and `<s2,a>`. 

This change enables rollup on MVDs (without unnest) by converting MVDs to Arrays before rollup using virtual columns, and then converting them back to MVDs using post aggregators. If segment schema is available to the compaction task (when it ends up downloading segments to get existing dimensions/metrics/granularity), it selectively does the MVD-Array conversion only for known multi-valued columns; else it conservatively performs this conversion for all `string` columns.
2024-09-04 16:28:04 +05:30
AmatyaAvadhanula 70bad948e3
Add integration tests for concurrent append and replace (#16755)
IT for streaming tasks with concurrent compaction
2024-09-03 14:58:15 +05:30
Vishesh Garg e37fe93f09
Add support for a custom `DimensionSchema` in `DataSourceMSQDestination` (#16864)
This PR adds support for passing in a custom DimensionSchema map to MSQ query destination of type DataSourceMSQDestination
2024-08-16 15:24:49 +05:30
Gian Merlino efe0044f9e
Use fuzzy matchers for compaction bytes asserts. (#16870)
* Use fuzzy matchers for compaction bytes asserts.

This still enables us to test that the bytes are zero and nonzero
when they're supposed to be, without having to ge them exactly
right. The need to get bytes exactly right makes it difficult to
ensure ITs pass when making changes to default segment metadata.

* Additional fuzziness.
2024-08-12 10:00:33 +08:00
Adithya Chakilam a7dd436a32
Check if supervisor could be idle on startup (#16844)
Fixes #13936 

In cases where a supervisor is idle and the overlord is restarted for some reason, the supervisor would
start spinning tasks again. In clusters where there are many low throughput streams, this would spike
the task count unnecessarily.

This commit compares the latest stream offset with the ones in metadata during the startup of supervisor
and sets it to idle state if they match.
2024-08-09 14:42:48 +05:30
Vishesh Garg 593c3b2150
Do not support non-idempotent aggregator in MSQ compaction (#16846)
This PR adds checks for verification of DataSourceCompactionConfig and CompactionTask with msq engine to ensure:

each aggregator in metricsSpec is idempotent
metricsSpec is non-null when rollup is set to true
Unit tests and existing compaction ITs have been updated accordingly.
2024-08-06 20:58:08 +05:30
Kashif Faraz 954aaafe0c
Refactor: Clean up compaction config classes (#16810)
Changes:
- Rename `CoordinatorCompactionConfig` to `DruidCompactionConfig`
- Rename `CompactionConfigUpdateRequest` to `ClusterCompactionConfig`
- Refactor methods in `DruidCompactionConfig`
- Clean up `DataSourceCompactionConfigHistory` and its tests
- Clean up tests and add new tests
- Change API path `/druid/coordinator/v1/config/global` to `/druid/coordinator/v1/config/cluster`
2024-07-30 12:17:25 +05:30
AmatyaAvadhanula 92a40d8169
Add API to fetch conflicting task locks (#16799)
* Add API to fetch conflicting active locks
2024-07-30 11:40:48 +05:30
Vishesh Garg e9ea243d97
Enable compaction ITs on MSQ engine (#16778)
Follow-up to #16291, this commit enables a subset of existing native compaction ITs on the MSQ engine.

In the process, the following changes have been introduced in the MSQ compaction flow:
- Populate `metricsSpec` in `CompactionState` from `querySpec` in `MSQControllerTask` instead of `dataSchema`
- Add check for pre-rolled-up segments having `AggregatorFactory` with different input and output column names
- Fix passing missing cluster-by clause in scan queries
- Add annotation of `CompactionState` to tombstone segments
2024-07-30 09:34:46 +05:30
Clint Wylie a34a06e192
remove Firehose and FirehoseFactory (#16758)
changes:
* removed `Firehose` and `FirehoseFactory` and remaining implementations which were mostly no longer used after #16602
* Moved `IngestSegmentFirehose` which was still used internally by Hadoop ingestion to `DatasourceRecordReader.SegmentReader`
* Rename `SQLFirehoseFactoryDatabaseConnector` to `SQLInputSourceDatabaseConnector` and similar renames for sub-classes
* Moved anything remaining in a 'firehose' package somewhere else
* Clean up docs on firehose stuff
2024-07-19 14:37:21 -07:00
Vishesh Garg 197c54f673
Auto-Compaction using Multi-Stage Query Engine (#16291)
Description:
Compaction operations issued by the Coordinator currently run using the native query engine.
As majority of the advancements that we are making in batch ingestion are in MSQ, it is imperative
that we support compaction on MSQ to make Compaction more robust and possibly faster. 
For instance, we have seen OOM errors in native compaction that MSQ could have handled by its
auto-calculation of tuning parameters. 

This commit enables compaction on MSQ to remove the dependency on native engine. 

Main changes:
* `DataSourceCompactionConfig` now has an additional field `engine` that can be one of 
`[native, msq]` with `native` being the default.
*  if engine is MSQ, `CompactSegments` duty assigns all available compaction task slots to the
launched `CompactionTask` to ensure full capacity is available to MSQ. This is to avoid stalling which
could happen in case a fraction of the tasks were allotted and they eventually fell short of the number
of tasks required by the MSQ engine to run the compaction.
* `ClientCompactionTaskQuery` has a new field `compactionRunner` with just one `engine` field.
* `CompactionTask` now has `CompactionRunner` interface instance with its implementations
`NativeCompactinRunner` and `MSQCompactionRunner` in the `druid-multi-stage-query` extension.
The objectmapper deserializes `ClientCompactionRunnerInfo` in `ClientCompactionTaskQuery` to the
`CompactionRunner` instance that is mapped to the specified type [`native`, `msq`]. 
* `CompactTask` uses the `CompactionRunner` instance it receives to create the indexing tasks.
* `CompactionTask` to `MSQControllerTask` conversion logic checks whether metrics are present in 
the segment schema. If present, the task is created with a native group-by query; if not, the task is
issued with a scan query. The `storeCompactionState` flag is set in the context.
* Each created `MSQControllerTask` is launched in-place and its `TaskStatus` tracked to determine the
final status of the `CompactionTask`. The id of each of these tasks is the same as that of `CompactionTask`
since otherwise, the workers will be unable to determine the controller task's location for communication
(as they haven't been launched via the overlord).
2024-07-12 16:40:20 +05:30
Rishabh Singh b9c7664ac3
Fix empty datasource schema on the Broker when metadata query is disabled (#16645)
* Fix build

* Fix empty datasource schema on the broker

* review comment

* Remove unused import
2024-06-28 11:06:56 +05:30
Clint Wylie 37a50e6803
Remove index_realtime and index_realtime_appenderator tasks (#16602)
index_realtime tasks were removed from the documentation in #13107. Even
at that time, they weren't really documented per se— just mentioned. They
existed solely to support Tranquility, which is an obsolete ingestion
method that predates migration of Druid to ASF and is no longer being
maintained. Tranquility docs were also de-linked from the sidebars and
the other doc pages in #11134. Only a stub remains, so people with
links to the page can see that it's no longer recommended.

index_realtime_appenderator tasks existed in the code base, but were
never documented, nor as far as I am aware were they used for any purpose.

This patch removes both task types completely, as well as removes all
supporting code that was otherwise unused. It also updates the stub
doc for Tranquility to be firmer that it is not compatible. (Previously,
the stub doc said it wasn't recommended, and pointed out that it is
built against an ancient 0.9.2 version of Druid.)

ITUnionQueryTest has been migrated to the new integration tests framework and updated to use Kafka ingestion.

Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
2024-06-24 20:13:33 -07:00
Rishabh Singh 4eced9b3c9
Fix CentralizedDatasourceSchema group IT failure (#16636)
* Fix build

* Update datasource name in ITSystemTableBatchIndexTaskTest
2024-06-21 15:40:12 -07:00
George Shiqi Wu f7013e012c
Add new test for handoff API (#16492)
* Add new test for handoff API

* Add new method

* fix test

* Update test
2024-05-28 12:57:51 -07:00
Zoltan Haindrich 44ea4e1c51
Fix cds-coordinator-metadata-query-disabled (#16488)
fixes the issue with the newly enabled `cds-coordiantor-metadata-query-disabled` [split](https://github.com/apache/druid/pull/16468)
* configures to use `prepopulated-data` environment things to configure `S3` for access 
* this is needed because these tests use a [dataset which is loaded from s3](https://github.com/apache/druid/blob/master/integration-tests/docker/test-data/cds-coordinator-metadata-query-disabled-sample-data.sql)
* also undoes the previous [fix](https://github.com/apache/druid/pull/16469) of setting the aws region explicitly as this is a more complete solution - and configuring `prepopulated-data` also sets the region; so that's not needed anymore
2024-05-22 20:42:11 +02:00
Zoltan Haindrich c948201507
Fix cds-task-schema-publish-disabled (#16469)
set AWS_REGION=us-west-2 to avoid retries
2024-05-21 12:18:30 +05:30
zachjsh dd5dc500ce
Catalog integration tests (#16424)
* * add new catalog IT with failure to ensure that it is run in CI

* * actually add failing test referred to and fix checkstyle

* * add some tests

* * fix checkstyle

* * add test descriptions

* * add more tests
2024-05-17 11:49:09 -04:00
Akshat Jain bacdb4c48d
Update integration tests related documentation for better clarity (#16313) 2024-05-13 11:27:21 +05:30
Alberic Liu 92fb0ff718
upgrade mysql:mysql-connector-java to 8.2.0 (#16024)
* upgrade mysql:mysql-connector-java to 8.2.0

* fix the check errors

* remove unused comment
2024-05-06 21:58:37 +08:00
Rishabh Singh c61c3785a0
Followup changes to 15817 (Segment schema publishing and polling) (#16368)
* Fix build

* Nit changes in KillUnreferencedSegmentSchema

* Replace reference to the abbreviation SMQ with Metadata Query, rename inTransit maps in schema cache

* nitpicks

* Remove reference to smq abbreviation from integration-tests

* Remove reference to smq abbreviation from integration-tests

* minor change

* Update index.md

* Add delimiter while computing schema fingerprint hash
2024-05-03 19:13:52 +05:30
Kashif Faraz e5b40b0b8c
Miscellaneous cleanup of load queue references (#16367)
Changes:
- Rename `DataSegmentChangeRequestAndStatus` to `DataSegmentChangeResponse`
- Rename `SegmentLoadDropHandler.Status` to `SegmentChangeStatus`
- Remove method `CoordinatorRunStats.getSnapshotAndReset()` as it was used only in
load queue peon implementations. Using an atomic reference is much simpler.
- Remove `ServerTestHelper.MAPPER`. Use existing `TestHelper.makeJsonMapper()` instead.
2024-05-02 15:59:50 +05:30
Gian Merlino 5d1950d451
MSQ controller: Support in-memory shuffles; towards JVM reuse. (#16168)
* MSQ controller: Support in-memory shuffles; towards JVM reuse.

This patch contains two controller changes that make progress towards a
lower-latency MSQ.

First, support for in-memory shuffles. The main feature of in-memory shuffles,
as far as the controller is concerned, is that they are not fully buffered. That
means that whenever a producer stage uses in-memory output, its consumer must run
concurrently. The controller determines which stages run concurrently, and when
they start and stop.

"Leapfrogging" allows any chain of sort-based stages to use in-memory shuffles
even if we can only run two stages at once. For example, in a linear chain of
stages 0 -> 1 -> 2 where all do sort-based shuffles, we can use in-memory shuffling
for each one while only running two at once. (When stage 1 is done reading input
and about to start writing its output, we can stop 0 and start 2.)

1) New OutputChannelMode enum attached to WorkOrders that tells workers
   whether stage output should be in memory (MEMORY), or use local or durable
   storage.

2) New logic in the ControllerQueryKernel to determine which stages can use
   in-memory shuffling (ControllerUtils#computeStageGroups) and to launch them
   at the appropriate time (ControllerQueryKernel#createNewKernels).

3) New "doneReadingInput" method on Controller (passed down to the stage kernels)
   which allows stages to transition to POST_READING even if they are not
   gathering statistics. This is important because it enables "leapfrogging"
   for HASH_LOCAL_SORT shuffles, and for GLOBAL_SORT shuffles with 1 partition.

4) Moved result-reading from ControllerContext#writeReports to new QueryListener
   interface, which ControllerImpl feeds results to row-by-row while the query
   is still running. Important so we can read query results from the final
   stage using an in-memory channel.

5) New class ControllerQueryKernelConfig holds configs that control kernel
   behavior (such as whether to pipeline, maximum number of concurrent stages,
   etc). Generated by the ControllerContext.

Second, a refactor towards running workers in persistent JVMs that are able to
cache data across queries. This is helpful because I believe we'll want to reuse
JVMs and cached data for latency reasons.

1) Move creation of WorkerManager and TableInputSpecSlicer to the
   ControllerContext, rather than ControllerImpl. This allows managing workers and
   work assignment differently when JVMs are reusable.

2) Lift the Controller Jersey resource out from ControllerChatHandler to a
   reusable resource.

3) Move memory introspection to a MemoryIntrospector interface, and introduce
   ControllerMemoryParameters that uses it. This makes it easier to run MSQ in
   process types other than Indexer and Peon.

Both of these areas will have follow-ups that make similar changes on the
worker side.

* Address static checks.

* Address static checks.

* Fixes.

* Report writer tests.

* Adjustments.

* Fix reports.

* Review updates.

* Adjust name.

* Small changes.
2024-04-30 21:30:27 -07:00
Adarsh Sanjeev 9a2d7c28bc
Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
Rishabh Singh e30790e013
Introduce Segment Schema Publishing and Polling for Efficient Datasource Schema Building (#15817)
Issue: #14989

The initial step in optimizing segment metadata was to centralize the construction of datasource schema in the Coordinator (#14985). Thereafter, we addressed the problem of publishing schema for realtime segments (#15475). Subsequently, our goal is to eliminate the requirement for regularly executing queries to obtain segment schema information.

This is the final change which involves publishing segment schema for finalized segments from task and periodically polling them in the Coordinator.
2024-04-24 22:22:53 +05:30
Laksh Singla b9bbde5c0a
Fix deadlock that can occur while merging group by results (#15420)
This PR prevents such a deadlock from happening by acquiring the merge buffers in a single place and passing it down to the runner that might need it.
2024-04-22 14:10:44 +05:30
Kashif Faraz 81d7b6ebe1
Fix OverlordClient to read reports as a concrete `ReportMap` (#16226)
Follow up to #16217 

Changes:
- Update `OverlordClient.getReportAsMap()` to return `TaskReport.ReportMap`
- Move the following classes to `org.apache.druid.indexer.report` in the `druid-processing` module
  - `TaskReport`
  - `KillTaskReport`
  - `IngestionStatsAndErrorsTaskReport`
  - `TaskContextReport`
  - `TaskReportFileWriter`
  - `SingleFileTaskReportFileWriter`
  - `TaskReportSerdeTest`
- Remove `MsqOverlordResourceTestClient` as it had only one method
which is already present in `OverlordResourceTestClient` itself
2024-04-15 08:00:59 +05:30
YongGang da9feb4430
Introduce TaskContextReport for reporting task context (#16041)
Changes:
- Add `TaskContextEnricher` interface to improve task management and monitoring
- Invoke `enrichContext` in `TaskQueue.add()` whenever a new task is submitted to the Overlord
- Add `TaskContextReport` to write out task context information in reports
2024-04-12 08:57:49 +05:30
Zoltan Haindrich 1df41db46d
Migrate to use docker compose v2 (#16232)
https://github.com/actions/runner-images/issues/9557
2024-04-03 12:32:55 +02:00
Kashif Faraz 4df4896674
Refactor: Add common method in AbstractBatchIndexTask to create ingestion stats report (#16202)
Changes
-  No functional changes
- Add method `AbstractBatchIndexTask.buildIngestionStatsReport()` used in several batch tasks
- Add utility method `AbstractBatchIndexTask.addBuildSegmentStatsToReport()`
- Use boolean argument to represent a full report instead of the String `full` 
in internal methods. (REST API remains unchanged.)
- Rename `IngestionStatsAndErrorsTaskReportData` to `IngestionStatsAndErrors`
- Clean up some of the methods
2024-03-28 23:07:00 +05:30
Adarsh Sanjeev 86a24012a6
Add security ITs for sending tasks to overlord (#16131)
* Add security ITs for sending tasks to overlord

* Add security ITs for sending tasks to overlord

* Resolve test flakiness
2024-03-18 09:33:40 +05:30
Kashif Faraz 1682d4570d
Increase delay to allow propagation of credentials (#16143) 2024-03-17 14:47:42 +05:30
Adithya Chakilam 564c44ed85
Add stats segmentsRead and segmentsPublished to compaction task reports (#15947)
Changes:
- Add visibility into number of segments read/published by each parallel compaction
- Add new fields `segmentsRead`, `segmentsPublished` to `IngestionStatsAndErrorsTaskReportData`
- Update `ParallelIndexSupervisorTask` to populate the new stats
2024-03-07 09:37:23 +05:30
Adithya Chakilam ec52f686c0
Fix compaction tasks reports getting overwritten (#15981)
* Fix compaction tasks reports geting overwrittened

* only skip for compactiont task

* address comments

* fix boolean

* move boolean flag to task rather than spec

* rename variable

* add docs, fix missing case

* Update docs/ingestion/tasks.md

* rename var

* add task report decode check in IT

* change assert
2024-03-04 10:10:17 -05:00
Sensor e0bce0ef90
Add pre-check for heavy debug logs (#15706)
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
Co-authored-by: Benedict Jin <asdf2014@apache.org>
2024-02-29 12:58:14 +05:30
Abhishek Agarwal ddfc31d7ed
Reduce the size of distribution docker image (#15968)
This PR creates symlinks when there are duplicate jars present in the extension. Docker image includes contrib extensions, too, and the size of the image has bloated up quite a lot of late. This change also fixes "ITNestedQueryPushDownTest integration test"
2024-02-26 21:18:55 +05:30
Adarsh Sanjeev 9eaaeb5c16
Add security ITs to the revised integration tests (#15885)
* Add IT for security

* Add admin client

* Clean up code

* Clean up code

* Address review comments
2024-02-20 11:32:08 +05:30
Abhishek Radhakrishnan a7918be268
Temporarily bump up the delay in auth IT from 5s to 10s. (#15765)
A more ideal/permanent fix would be to have status checks exposed by the services,
but that'll require more code changes. So temporarily bump it to unblock CI now.
2024-01-26 11:52:27 -05:00
Karan Kumar c4990f56d6
Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
Kashif Faraz f0c552b2f9
Fix basic auth integration test (#15679)
* Add some retries

* Add a delay to allow creds to propagate

* Checkstyle and stuff
2024-01-14 08:59:15 -08:00
Rishabh Singh 71f5307277
Eliminate Periodic Realtime Segment Metadata Queries: Task Now Publish Schema for Seamless Coordinator Updates (#15475)
The initial step in optimizing segment metadata was to centralize the construction of datasource schema in the Coordinator (#14985). Subsequently, our goal is to eliminate the requirement for regularly executing queries to obtain segment schema information. This task encompasses addressing both realtime and finalized segments.

This modification specifically addresses the issue with realtime segments. Tasks will now routinely communicate the schema for realtime segments during the segment announcement process. The Coordinator will identify the schema alongside the segment announcement and subsequently update the schema for realtime segments in the metadata cache.
2024-01-10 08:55:56 +05:30
Kashif Faraz cce539495d
[Flaky test] Fix basic auth integration test (#15561)
Database slowness while doing audits seems to be causing flakiness in auth ITs.

The failing test is almost always
`ITBasicAuthConfigurationTest.test_avaticaQuery_datasourceAndContextParamsUser`
but in some rare cases, other tests fail too. Alternately, this failing test has been seen to pass too.

It is most likely because the auth changes are not able to propagate in time from
the coordinator to other services.

Fix: Just log the audits rather than persisting them to database.
Most audits have been newly added and it is okay to not have them persisted.
Moreover, logging audits can also be more beneficial while debugging an IT.
2023-12-23 12:11:12 +05:30