druid

Commit Graph

Author	SHA1	Message	Date
Gian Merlino	6a9c050095	DruidOverlord: Move becomeLeader/stopBeingLeader earlier. (#17415 ) * DruidOverlord: Move becomeLeader/stopBeingLeader earlier. On becoming leader, it is helpful for the TaskRunner and TaskQueue to be available when the SupervisorManager starts up, to aid the supervisors in discovering their tasks. On stopping leadership, it is helpful for the TaskRunner and TaskQueue to be available until the SupervisorManager has finished shutting down. They are only available when the TaskMaster is in "leader" mode, so to achieve the above, this patch moves it earlier in the sequence. * Adjust leadership into two phases. * Update test. * Adjustments for coverage. * Stop mirrors start better.	2024-10-28 20:43:13 -07:00
Gian Merlino	c4b513e599	SeekableStreamSupervisor: Don't await task futures in workerExec. (#17403 ) Following #17394, workerExec can get deadlocked with itself, because it waits for task futures and is also used as the connectExec for the task client. To fix this, we need to never await task futures in the workerExec. There are two specific changes: in "verifyAndMergeCheckpoints" and "checkpointTaskGroup", two "coalesceAndAwait" calls that formerly occurred in workerExec are replaced with Futures.transform (using a callback in workerExec). Because this adjustment removes a source of blocking, it may also improve supervisor responsiveness for high task counts. This is not the primary goal, however. The primary goal is to fix the bug introduced by #17394.	2024-10-24 12:07:18 -07:00
Gian Merlino	60daddedf8	SeekableStreamSupervisor: Use workerExec as the client connectExec. (#17394 ) * SeekableStreamSupervisor: Use workerExec as the client connectExec. This patch uses the already-existing per-supervisor workerExec as the connectExec for task clients, rather than using the process-wide default ServiceClientFactory pool. This helps prevent callbacks from backlogging on the process-wide pool. It's especially useful for retries, where callbacks may need to establish new TCP connections or perform TLS handshakes. * Fix compilation, tests. * Fix style.	2024-10-22 20:21:21 -07:00
Vishesh Garg	5da9949992	Fail MSQ compaction if multi-valued partition dimensions are found (#17344 ) MSQ currently supports only single-valued string dimensions as partition keys. This patch adds a check to ensure that partition keys are single-valued in case this info is available by virtue of segment download for schema inference. During compaction, if MSQ finds multi-valued dimensions (MVDs) declared as part of `range` partitionsSpec, it switches partitioning type to dynamic, ending up in repeated compactions of the same interval. To avoid this scenario, the segment download logic is also updated to always download segments if info on multi-valued dimensions is required.	2024-10-19 13:33:33 +05:30
Adithya Chakilam	e834e49290	supervisor/autoscaler: Fix clearing of collected lags on skipped scale actions (#17356 ) * supervisor/autoscaler: Fix clearing of collected lags on skipped scale actions * comments * supervisor/autoscaler: Skip scaling when partitions are less than minTaskCount (#17335) * Fix pip installation after ubuntu upgrade (#17358) * fix tests --------- Co-authored-by: Pranav <pranavbhole@gmail.com>	2024-10-17 11:05:16 -07:00
Adithya Chakilam	c57bd3b438	supervisor/autoscaler: Skip scaling when partitions are less than minTaskCount (#17335 )	2024-10-15 14:12:53 -07:00
Kashif Faraz	3f797c52d0	Fix duplicate compaction task launched by OverlordCompactionScheduler (#17287 ) Description ----------- The `OverlordCompactionScheduler` may sometimes launch a duplicate compaction task for an interval that has just been compacted. This may happen as follows: - Scheduler launches a compaction task for an uncompacted interval. - While the compaction task is running, the `CompactionStatusTracker` does not consider this interval as compactible and returns the `CompactionStatus` as `SKIPPED` for it. - As soon as the compaction task finishes, the `CompactionStatusTracker` starts considering the interval eligible for compaction again. - This interval remains eligible for compaction until the newly published segments are polled from the database. - Once the new segments have been polled, the `CompactionStatus` of the interval changes to `COMPLETE`. Change -------- - Keep track of the `snapshotTime` in `DataSourcesSnapshot`. This time represents the start of the poll. - Use the `snapshotTime` to determine if a poll has happened after a compaction task completed. - If not, then skip the interval to avoid launching duplicate tasks. - For tests, use a future `snapshotTime` to ensure that compaction is always triggered.	2024-10-10 08:44:09 +05:30
AmatyaAvadhanula	f42ecc9f25	Fail concurrent replace tasks with finer segment granularity than append (#17265 )	2024-10-08 07:35:13 +05:30
George Shiqi Wu	5d7c7a87ec	Add maximumCapacity to taskRunner (#17107 ) * Add maximumCapacity to taskRunner * fix tests * pr comments	2024-10-07 15:03:51 -04:00
AmatyaAvadhanula	ff97c67945	Fix batch segment allocation failure with replicas (#17262 ) Fixes #16587 Streaming ingestion tasks operate by allocating segments before ingesting rows. These allocations happen across replicas which may send different requests but must get the same segment id for a given (datasource, interval, version, sequenceName) across replicas. This patch fixes the bug by ignoring the previousSegmentId when skipLineageCheck is true.	2024-10-07 19:52:38 +05:30
Vishesh Garg	7e35e50052	Fix issues with MSQ Compaction (#17250 ) The patch makes the following changes: 1. Fixes a bug causing compaction to fail on array, complex, and other non-primitive-type columns 2. Updates compaction status check to be conscious of partition dimensions when comparing dimension ordering. 3. Ensures only string columns are specified as partition dimensions 4. Ensures `rollup` is true if and only if metricsSpec is non-empty 5. Ensures disjoint intervals aren't submitted for compaction 6. Adds `compactionReason` to compaction task context.	2024-10-06 21:48:26 +05:30
Clint Wylie	0bd13bcd51	Projections prototype (#17214 )	2024-10-05 04:38:57 -07:00
Arun Ramani	e5d027ee1c	Skip generating task context reports for sub tasks (#17219 ) * Skip task context for sub tasks * DRY a little + skip context for live report	2024-10-02 09:32:50 -04:00
Hardik Bajaj	3d56fa6f56	Improve logging to include taskId in segment handoff notifier thread (#17185 )	2024-10-01 15:34:39 +05:30
Shivam Garg	ab361747a8	Migrated commons-lang usages to commons-lang3 (#17156 )	2024-09-28 10:28:11 +02:00
Clint Wylie	d77637344d	log.warn anytime a column is relying on ArrayIngestMode.MVD (#17164 ) * log.warn anytime a column is relying on ArrayIngestMode.MVD	2024-09-26 13:44:37 +05:30
Abhishek Radhakrishnan	9132a65a48	Add `StreamSupervisor` interface (#17151 ) Follow up to #17137. Instead of moving the streaming-only methods to the SeekableStreamSupervisor abstract class, this patch moves them to a separate StreamSupervisor interface. The reason is that the SeekableStreamSupervisor abstract class also has many other abstract methods. The StreamSupervisor interface on the other hand provides a minimal set of functions offering a good middle ground for any custom concrete implementation that doesn't require all the goodies from SeekableStreamSupervisor.	2024-09-25 14:52:39 +05:30
Abhishek Radhakrishnan	83299e9882	Miscellaneous cleanup in the supervisor API flow. (#17144 ) Extracting a few miscellaneous non-functional changes from the batch supervisor branch: - Replace anonymous inner classes with lambda expressions in the SQL supervisor manager layer - Add explicit @Nullable annotations in DynamicConfigProviderUtils to make IDE happy - Small variable renames (copy-paste error perhaps) and fix typos - Add table name for this exception message: Delete the supervisor from the table[%s] in the database... - Prefer CollectionUtils.isEmptyOrNull() over list == null || list.size() > 0. We can change the Precondition checks to throwing DruidException separately for a batch of APIs at a time.	2024-09-24 13:06:23 -07:00
Abhishek Radhakrishnan	5c862f6ed9	Refactor: Move streaming supervisor methods to `SeekableStreamSupervisor` (#17137 ) The current Supervisor interface is primarily focused on streaming use cases. However, as we introduce supervisors for non-streaming use cases, such as the recently added CompactionSupervisor (and the upcoming BatchSupervisor), certain operations like resetting offsets, checkpointing, task group handoff, etc., are not really applicable to non-streaming use cases. So the methods are split between: 1. Supervisor: common methods that are applicable to both streaming and non-streaming use cases 2. SeekableStreamSupervisor: Supervisor + streaming-only operations. The existing streaming-only overrides exist along with the new abstract method public abstract LagStats computeLagStats(), for which custom implementations already exist in the concrete types This PR is primarily a refactoring change with minimal functional adjustments (e.g., throwing an exception in a few places in SupervisorManager when the supervisor isn't the expected SeekableStreamSupervisor type).	2024-09-24 10:46:37 -07:00
Kashif Faraz	9670305669	Cleanup Coordinator logs, add duty status API (#16959 ) Description ----------- Coordinator logs are fairly noisy and don't give much useful information (see example below). Even when the Coordinator misbehaves, these logs are not very useful. Main changes ------------ - Add API `GET /druid/coordinator/v1/duties` that returns a status list of all duty groups currently running on the Coordinator - Emit metrics `segment/poll/time`, `segment/pollWithSchema/time`, `segment/buildSnapshot/time` - Remove redundant logs that indicate normal operation of well-tested aspects of the Coordinator Refactors --------- - Move some logic from `DutiesRunnable` to `CoordinatorDutyGroup` - Move stats collection from `CollectSegmentAndServerStats` to `PrepareBalancerAndLoadQueues` - Minor cleanup of class `DruidCoordinator` - Clean up class `DruidCoordinatorRuntimeParams` - Remove field `coordinatorStartTime`. Maintain start time in `MarkOvershadowedSegmentsAsUnused` instead. - Remove field `MetadataRuleManager`. Pass supplier to constructor of applicable duties instead. - Make `usedSegmentsNewestFirst` and `datasourcesSnapshot` as non-nullable as they are always required.	2024-09-24 19:46:22 +05:30
Vishesh Garg	f576e299db	Allow MSQ engine only for compaction supervisors (#17033 ) #16768 added the functionality to run compaction as a supervisor on the overlord. This patch builds on top of that to restrict MSQ engine to compaction in the supervisor-mode only. With these changes, users can no longer add MSQ engine as part of datasource compaction config, or as the default cluster-level compaction engine, on the Coordinator. The patch also adds an Overlord runtime property `druid.supervisor.compaction.engine=<msq/native>` to specify the default engine for compaction supervisors. Since these updates require major changes to existing MSQ compaction integration tests, this patch disables MSQ-specific compaction integration tests -- they will be taken up in a follow-up PR. Key changed/added classes in this patch: * CompactionSupervisor * CompactionSupervisorSpec * CoordinatorCompactionConfigsResource * OverlordCompactionScheduler	2024-09-24 17:19:16 +05:30
PANKAJ KUMAR	36dfff4b1a	Adding extra debug logs for the checkpoint logic (#16321 ) Logging to understand checkpointing better in streaming ingestion	2024-09-24 09:38:46 +05:30
Abhishek Radhakrishnan	635e418131	Support to parse numbers in text-based input formats (#17082 ) Text-based input formats like csv and tsv currently parse inputs only as strings, following the RFC4180Parser spec. To work around this, the web-console and other tools need to further inspect the sample data returned by the Druid sampler API to parse them as numbers. This patch introduces a new optional config, tryParseNumbers, for the csv and tsv input formats. If enabled, any numbers present in the input will be parsed in the following manner -- long data type for integer types and double for floating-point numbers, and if parsing fails for whatever reason, the input is treated as a string. By default, this configuration is set to false, so numeric strings will be treated as strings.	2024-09-19 13:21:18 -07:00
Clint Wylie	4f137d2700	hard-code compaction tasks to use ARRAY for multi-value handling to preserve order (#17110 )	2024-09-19 11:56:12 -07:00
Misha	6aad9b08dd	Fix low sonatype findings (#17017 ) Fixed vulnerabilities CVE-2021-26291 : Apache Maven is vulnerable to Man-in-the-Middle (MitM) attacks. Various functions across several files, mentioned below, allow for custom repositories to use the insecure HTTP protocol. An attacker can exploit this as part of a Man-in-the-Middle (MitM) attack, taking over or impersonating a repository using the insecure HTTP protocol. Unsuspecting users may then have the compromised repository defined as a dependency in their Project Object Model (pom) file and download potentially malicious files from it. Was fixed by removing outdated tesla-aether library containing vulnerable maven-settings (v3.1.1) package, pull-deps utility updated to use maven resolver instead. sonatype-2020-0244 : The joni package is vulnerable to Man-in-the-Middle (MitM) attacks. This project downloads dependencies over HTTP due to an insecure repository configuration within the .pom file. Consequently, a MitM could intercept requests to the specified repository and replace the requested dependencies with malicious versions, which can execute arbitrary code from the application that was built with them. Was fixed by upgrading joni package to recommended 2.1.34 version	2024-09-16 16:10:25 +05:30
Clint Wylie	aa6336c5cf	add DataSchema.Builder to tidy stuff up a bit (#17065 ) * add DataSchema.Builder to tidy stuff up a bit * fixes * fixes * more style fixes * review stuff	2024-09-15 11:18:34 -07:00
Abhishek Radhakrishnan	5ef94c9dee	Add support for selective loading of broadcast datasources in the task layer (#17027 ) Tasks control the loading of broadcast datasources via BroadcastDatasourceLoadingSpec getBroadcastDatasourceLoadingSpec(). By default, tasks download all broadcast datasources, unless there's an override as with kill and MSQ controller task. The CliPeon command line option --loadBroadcastSegments is deprecated in favor of --loadBroadcastDatasourceMode. Broadcast datasources can be specified in SQL queries through JOIN and FROM clauses, or obtained from other sources such as lookups. To this effect, we have introduced a BroadcastDatasourceLoadingSpec. Finding the set of broadcast datasources during SQL planning will be done in a follow-up, which will apply only to MSQ tasks, so they load only required broadcast datasources. This PR primarily focuses on the skeletal changes around BroadcastDatasourceLoadingSpec and integrating it from the Task interface via CliPeon to SegmentBootstrapper. Currently, only kill tasks and MSQ controller tasks skip loading broadcast datasources.	2024-09-12 13:30:28 -04:00
Pranav	a95397e712	Allow request headers in HttpInputSource in native and MSQ Ingestion (#16974 ) Support for adding the request headers in http input source. we can now pass the additional headers as json in both native and MSQ.	2024-09-12 11:18:44 +05:30
George Shiqi Wu	428f58cf15	Support maxColumnsToMerge in supervisor tuningConfig (#17030 ) * support maxColumnsToMerge in supervisor specs * remove log line * fix style * add docs * fix unit tests	2024-09-11 18:00:13 -04:00
Laksh Singla	72fbaf2e56	Non querying tasks shouldn't use processing buffers / merge buffers (#16887 ) Tasks that do not support querying or query processing i.e. supportsQueries = false do not require processing threads, processing buffers, and merge buffers.	2024-09-10 11:36:36 +05:30
Abhishek Agarwal	78775ad398	Prepare master for 32.0.0 release (#17022 )	2024-09-10 11:01:20 +05:30
Clint Wylie	f57cd6f7af	transition away from StorageAdapter (#16985 ) * transition away from StorageAdapter changes: * CursorHolderFactory has been renamed to CursorFactory and moved off of StorageAdapter, instead fetched directly from the segment via 'asCursorFactory'. The previous deprecated CursorFactory interface has been merged into StorageAdapter * StorageAdapter is no longer used by any engines or tests and has been marked as deprecated with default implementations of all methods that throw exceptions indicating the new methods to call instead * StorageAdapter methods not covered by CursorFactory (CursorHolderFactory prior to this change) have been moved into interfaces which are retrieved by Segment.as, the primary classes are the previously existing Metadata, as well as new interfaces PhysicalSegmentInspector and TopNOptimizationInspector * added UnnestSegment and FilteredSegment that extend WrappedSegmentReference since their StorageAdapter implementations were previously provided by WrappedSegmentReference * added PhysicalSegmentInspector which covers some of the previous StorageAdapter functionality which was primarily used for segment metadata queries and other metadata uses, and is implemented for QueryableIndexSegment and IncrementalIndexSegment * added TopNOptimizationInspector to cover the oddly specific StorageAdapter.hasBuiltInFilters implementation, which is implemented for HashJoinSegment, UnnestSegment, and FilteredSegment * Updated all engines and tests to no longer use StorageAdapter	2024-09-09 14:55:29 -07:00
Kashif Faraz	ba6f804f48	Fix compaction status API response (#17006 ) Description: #16768 introduces new compaction APIs on the Overlord `/compact/status` and `/compact/progress`. But the corresponding `OverlordClient` methods do not return an object compatible with the actual endpoints defined in `OverlordCompactionResource`. This patch ensures that the objects are compatible. Changes: - Add `CompactionStatusResponse` and `CompactionProgressResponse` - Use these as the return type in `OverlordClient` methods and as the response entity in `OverlordCompactionResource` - Add `SupervisorCleanupModule` bound on the Coordinator to perform cleanup of supervisors. Without this module, Coordinator cannot deserialize compaction supervisors.	2024-09-05 23:22:01 +05:30
Vishesh Garg	e28424ea25	Enable rollup on multi-value dimensions for compaction with MSQ engine (#16937 ) Currently compaction with MSQ engine doesn't work for rollup on multi-value dimensions (MVDs), the reason being the default behaviour of grouping on MVD dimensions to unnest the dimension values; for instance grouping on `[s1,s2]` with aggregate `a` will result in two rows: `<s1,a>` and `<s2,a>`. This change enables rollup on MVDs (without unnest) by converting MVDs to Arrays before rollup using virtual columns, and then converting them back to MVDs using post aggregators. If segment schema is available to the compaction task (when it ends up downloading segments to get existing dimensions/metrics/granularity), it selectively does the MVD-Array conversion only for known multi-valued columns; else it conservatively performs this conversion for all `string` columns.	2024-09-04 16:28:04 +05:30
Kashif Faraz	fe3d589ff9	Run compaction as a supervisor on Overlord (#16768 ) Description ----------- Auto-compaction currently poses several challenges as it: 1. may get stuck on a failing interval. 2. may get stuck on the latest interval if more data keeps coming into it. 3. always picks the latest interval regardless of the level of compaction in it. 4. may never pick a datasource if its intervals are not very recent. 5. requires setting an explicit period which does not cater to the changing needs of a Druid cluster. This PR introduces various improvements to compaction scheduling to tackle the above problems. Change Summary -------------- 1. Run compaction for a datasource as a supervisor of type `autocompact` on Overlord. 2. Make compaction policy extensible and configurable. 3. Track status of recently submitted compaction tasks and pass this info to policy. 4. Add `/simulate` API on both Coordinator and Overlord to run compaction simulations. 5. Redirect compaction status APIs to the Overlord when compaction supervisors are enabled.	2024-09-02 07:53:13 +05:30
Virushade	0217c8c541	Change Inspection Profile to set "Method is identical to its super method" as error (#16976 ) * Make IntelliJ's MethodIsIdenticalToSuperMethod an error * Change codebase to follow new IntelliJ inspection * Restore non-short-circuit boolean expressions to pass tests	2024-08-31 09:37:34 +05:30
Gian Merlino	5d2ed33b89	Place __time in signatures according to sort order. (#16958 ) * Place __time in signatures according to sort order. Updates a variety of places to put __time in row signatures according to its position in the sort order, rather than always first, including: - InputSourceSampler. - ScanQueryEngine (in the default signature when "columns" is empty). - Various StorageAdapters, which also have the effect of reordering the column order in segmentMetadata queries, and therefore in SQL schemas as well. Follow-up to #16849. * Fix compilation. * Additional fixes. * Fix. * Fix style. * Omit nonexistent columns from the row signature. * Fix tests.	2024-08-26 21:45:51 -07:00
George Shiqi Wu	7ee7e194c4	Add supervisor log when task count is greater than partitions (#16948 ) * Add log message when task count is higher than partitions * newline * fix ordering * Add supervisor id * Update indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisor.java Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com> * Update indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisor.java Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com> * Update indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisor.java Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com> --------- Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>	2024-08-26 07:40:02 -07:00
Gian Merlino	0603d5153d	Segments sorted by non-time columns. (#16849 ) * Segments primarily sorted by non-time columns. Currently, segments are always sorted by __time, followed by the sort order provided by the user via dimensionsSpec or CLUSTERED BY. Sorting by __time enables efficient execution of queries involving time-ordering or granularity. Time-ordering is a simple matter of reading the rows in stored order, and granular cursors can be generated in streaming fashion. However, for various workloads, it's better for storage footprint and query performance to sort by arbitrary orders that do not start with __time. With this patch, users can sort segments by such orders. For spec-based ingestion, users add "useExplicitSegmentSortOrder: true" to dimensionsSpec. The "dimensions" list determines the sort order. To define a sort order that includes "__time", users explicitly include a dimension named "__time". For SQL-based ingestion, users set the context parameter "useExplicitSegmentSortOrder: true". The CLUSTERED BY clause is then used as the explicit segment sort order. In both cases, when the new "useExplicitSegmentSortOrder" parameter is false (the default), __time is implicitly prepended to the sort order, as it always was prior to this patch. The new parameter is experimental for two main reasons. First, such segments can cause errors when loaded by older servers, due to violating their expectations that timestamps are always monotonically increasing. Second, even on newer servers, not all queries can run on non-time-sorted segments. Scan queries involving time-ordering and any query involving granularity will not run. (To partially mitigate this, a currently-undocumented SQL feature "sqlUseGranularity" is provided. When set to false the SQL planner avoids using "granularity".) Changes on the write path: 1) DimensionsSpec can now optionally contain a __time dimension, which controls the placement of __time in the sort order. If not present, __time is considered to be first in the sort order, as it has always been. 2) IncrementalIndex and IndexMerger are updated to sort facts more flexibly; not always by time first. 3) Metadata (stored in metadata.drd) gains a "sortOrder" field. 4) MSQ can generate range-based shard specs even when not all columns are singly-valued strings. It merely stops accepting new clustering key fields when it encounters the first one that isn't a singly-valued string. This is useful because it enables range shard specs on "someDim" to be created for clauses like "CLUSTERED BY someDim, __time". Changes on the read path: 1) Add StorageAdapter#getSortOrder so query engines can tell how a segment is sorted. 2) Update QueryableIndexStorageAdapter, IncrementalIndexStorageAdapter, and VectorCursorGranularizer to throw errors when using granularities on non-time-ordered segments. 3) Update ScanQueryEngine to throw an error when using the time-ordering "order" parameter on non-time-ordered segments. 4) Update TimeBoundaryQueryRunnerFactory to perform a segment scan when running on a non-time-ordered segment. 5) Add "sqlUseGranularity" context parameter that causes the SQL planner to avoid using granularities other than ALL. Other changes: 1) Rename DimensionsSpec "hasCustomDimensions" to "hasFixedDimensions" and change the meaning subtly: it now returns true if the DimensionsSpec represents an unchanging list of dimensions, or false if there is some discovery happening. This is what call sites had expected anyway. * Fixups from CI. * Fixes. * Fix missing arg. * Additional changes. * Fix logic. * Fixes. * Fix test. * Adjust test. * Remove throws. * Fix styles. * Fix javadocs. * Cleanup. * Smoother handling of null ordering. * Fix tests. * Missed a spot on the merge. * Fixups. * Avoid needless Filters.and. * Add timeBoundaryInspector to test. * Fix tests. * Fix FrameStorageAdapterTest. * Fix various tests. * Use forceSegmentSortByTime instead of useExplicitSegmentSortOrder. * Pom fix. * Fix doc.	2024-08-23 08:24:43 -07:00
Gian Merlino	a83125e4a0	Track IngestionState more accurately in realtime tasks. (#16934 ) Previously, SeekableStreamIndexTaskRunner set ingestion state to COMPLETED when it finished reading data from Kafka. This is incorrect. After the changes in this patch, the transitions go: 1) The task stays in BUILD_SEGMENTS after it finishes reading from Kafka, while it is building its final set of segments to publish. 2) The task transitions to SEGMENT_AVAILABILITY_WAIT after publishing, while waiting for handoff. 3) The task transitions to COMPLETED immediately before exiting, when truly done.	2024-08-22 11:43:46 +05:30
Clint Wylie	4283b270e3	rework cursor creation (#16533 ) changes: * Added `CursorBuildSpec` which captures all of the 'interesting' stuff that goes into producing a cursor as a replacement for the method arguments of `CursorFactory.canVectorize`, `CursorFactory.makeCursor`, and `CursorFactory.makeVectorCursor` * added new interface `CursorHolder` and new interface `CursorHolderFactory` as a replacement for `CursorFactory`, with method `makeCursorHolder`, which takes a `CursorBuildSpec` as an argument and replaces `CursorFactory.canVectorize`, `CursorFactory.makeCursor`, and `CursorFactory.makeVectorCursor` * `CursorFactory.makeCursors` previously returned a `Sequence<Cursor>` corresponding to the query granularity buckets, with a separate `Cursor` per bucket. `CursorHolder.asCursor` instead returns a single `Cursor` (equivalent to 'ALL' granularity), and a new `CursorGranularizer` has been added for query engines to iterate over the cursor and divide into granularity buckets. This makes the non-vectorized engine behave the same way as the vectorized query engine (with its `VectorCursorGranularizer`), and simplifies a lot of stuff that has to read segments particularly if it does not care about bucketing the results into granularities. * Deprecated `CursorFactory`, `CursorFactory.canVectorize`, `CursorFactory.makeCursors`, and `CursorFactory.makeVectorCursor` * updated all `StorageAdapter` implementations to implement `makeCursorHolder`, transitioned direct `CursorFactory` implementations to instead implement `CursorMakerFactory`. `StorageAdapter` being a `CursorMakerFactory` is intended to be a transitional thing, ideally will not be released in favor of moving `CursorMakerFactory` to be fetched directly from `Segment`, however this PR was already large enough so this will be done in a follow-up. * updated all query engines to use `makeCursorHolder`, granularity based engines to use `CursorGranularizer`.	2024-08-16 11:34:10 -07:00
Adithya Chakilam	a7dd436a32	Check if supervisor could be idle on startup (#16844 ) Fixes #13936 In cases where a supervisor is idle and the overlord is restarted for some reason, the supervisor would start spinning tasks again. In clusters where there are many low throughput streams, this would spike the task count unnecessarily. This commit compares the latest stream offset with the ones in metadata during the startup of supervisor and sets it to idle state if they match.	2024-08-09 14:42:48 +05:30
Hardik Bajaj	1cf3f4bebe	Fix Concurrent Task Insertion in pendingCompletionTaskGroups (#16834 ) Fix streaming task failures that may arise due to concurrent task insertion in pendingCompletionTaskGroups	2024-08-08 08:37:27 +05:30
Adarsh Sanjeev	2b81c18fd7	Refactor SemanticCreator (#16700 ) Refactors the SemanticCreator annotation. Moves the interface to the semantic package. Create a SemanticUtils to hold logic for storing semantic maps. Add FrameMaker interface.	2024-08-06 11:29:38 -05:00
Vishesh Garg	593c3b2150	Do not support non-idempotent aggregator in MSQ compaction (#16846 ) This PR adds checks for verification of DataSourceCompactionConfig and CompactionTask with msq engine to ensure that: 1. each aggregator in metricsSpec is idempotent 2. metricsSpec is non-null when rollup is set to true. Unit tests and existing compaction ITs have been updated accordingly.	2024-08-06 20:58:08 +05:30
Rushikesh Bankar	c8323d1a7c	Add indexer task success and failure metrics (#16829 ) This PR adds two indexer-level task metrics: "indexer/task/failed/count" and "indexer/task/success/count". The current "worker/task/completed/count" metric counts all completed tasks irrespective of success or failure status, so the new metrics give more visibility into the status of the completed tasks.	2024-08-05 16:21:27 +05:30
AmatyaAvadhanula	92a40d8169	Add API to fetch conflicting task locks (#16799 ) * Add API to fetch conflicting active locks	2024-07-30 11:40:48 +05:30
Vishesh Garg	e9ea243d97	Enable compaction ITs on MSQ engine (#16778 ) Follow-up to #16291, this commit enables a subset of existing native compaction ITs on the MSQ engine. In the process, the following changes have been introduced in the MSQ compaction flow: - Populate `metricsSpec` in `CompactionState` from `querySpec` in `MSQControllerTask` instead of `dataSchema` - Add check for pre-rolled-up segments having `AggregatorFactory` with different input and output column names - Fix passing missing cluster-by clause in scan queries - Add annotation of `CompactionState` to tombstone segments	2024-07-30 09:34:46 +05:30
Kashif Faraz	caedeb66cd	Add API to update compaction engine (#16803 ) Changes: - Add API `/druid/coordinator/v1/config/compaction/global` to update cluster level compaction config - Add class `CompactionConfigUpdateRequest` - Fix bug in `CoordinatorCompactionConfig` which caused compaction engine to not be persisted. Use json field name `engine` instead of `compactionEngine` because JSON field names must align with the getter name. - Update MSQ validation error messages - Complete overhaul of `CoordinatorCompactionConfigResourceTest` to remove unnecessary mocking and add more meaningful tests. - Add `TuningConfigBuilder` to easily build tuning configs for tests. - Add `DatasourceCompactionConfigBuilder`	2024-07-27 09:14:51 +05:30
Clint Wylie	02b8738c00	remove batchProcessingMode from task config, remove AppenderatorImpl (#16765 ) changes: * removes `druid.indexer.task.batchProcessingMode` in favor of always using `CLOSED_SEGMENT_SINKS` which uses `BatchAppenderator`. This was intended to become the default for native batch, but that was missed so `CLOSED_SEGMENTS` was the default (using `AppenderatorImpl`), however MSQ has been exclusively using `BatchAppenderator` with no problems so it seems safe to just roll it out as the only option for batch ingestion everywhere. * with `batchProcessingMode` gone, there is no use for `AppenderatorImpl` so it has been removed * simplify `Appenderator` construction since there are only separate stream and batch versions now * simplify tests since `batchProcessingMode` is gone	2024-07-22 13:56:44 -07:00
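
Note on 635e418131 (#17082): the message describes the `tryParseNumbers` fallback order as long for integers, double for floating-point values, and the original string if parsing fails. The following is a minimal sketch of that order only, not the code from the patch. The class name `TryParseNumbersSketch` and the method `tryParseNumber` are hypothetical, and the use of `Long.parseLong` and `Double.parseDouble` is an assumption; the real implementation may accept or reject different inputs (for example, `Double.parseDouble` also accepts values such as "NaN" or "1e3").

```java
public final class TryParseNumbersSketch {
    private TryParseNumbersSketch() {}

    /**
     * Hypothetical helper illustrating the fallback order from the commit message:
     * long first, then double, otherwise the unchanged string.
     */
    public static Object tryParseNumber(String raw) {
        if (raw == null) {
            return null; // nothing to parse
        }
        try {
            return Long.parseLong(raw); // integer types become long
        } catch (NumberFormatException ignored) {
            // not an integer, fall through to double
        }
        try {
            return Double.parseDouble(raw); // floating-point values become double
        } catch (NumberFormatException ignored) {
            // not a number at all, fall through to string
        }
        return raw; // parsing failed, so treat the input as a string
    }

    public static void main(String[] args) {
        // Print the runtime type chosen for a few sample fields.
        for (String field : new String[] {"42", "3.5", "druid"}) {
            Object parsed = tryParseNumber(field);
            System.out.println(field + " -> " + parsed.getClass().getSimpleName());
        }
    }
}
```

With the config left at its default of false, none of this would apply and every field would stay a string, as the commit message states.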


2301 Commits