druid

Commit Graph

Author	SHA1	Message	Date
Sree Charan Manamala	27cfe12f4a	Enable reordering of window operators (#16482 ) This commit aims to enable the re-ordering of window operators in order to optimise the sort and partition operators. Example : ``` SELECT m1, m2, SUM(m1) OVER(PARTITION BY m2) as sum1, SUM(m2) OVER() as sum2 from numFoo GROUP BY m1,m2 ``` In order to compute this query, we can order the operators as to first compute the operators corresponding to sum2 and then place the operators corresponding to sum1 which would help us in reducing one sort operator if we order our operators by sum1 and then sum2.	2024-05-29 12:17:12 +05:30
Adarsh Sanjeev	21f725f33e	Add octet streaming of sketchs in MSQ (#16269 ) There are a few issues with using Jackson serialization in sending datasketches between controller and worker in MSQ. This caused a blowup due to holding multiple copies of the sketch being stored. This PR aims to resolve this by switching to deserializing the sketch payload without Jackson. The PR adds a new query parameter used during communication between controller and worker while fetching sketches, "sketchEncoding". If the value of this parameter is OCTET, the sketch is returned as a binary encoding, done by ClusterByStatisticsSnapshotSerde. If the value is not the above, the sketch is encoded by Jackson as before.	2024-05-28 18:12:38 +05:30
Akshat Jain	ddfd62d9a9	Disable loading lookups by default in CompactionTask (#16420 ) This PR updates CompactionTask to not load any lookups by default, unless transformSpec is present. If transformSpec is present, we will make the decision based on context values, loading all lookups by default. This is done to ensure backward compatibility since transformSpec can reference lookups. If transform spec is not present and no context value is passed, we donot load any lookup. This behavior can be overridden by supplying lookupLoadingMode and lookupsToLoad in the task context.	2024-05-15 11:39:23 +05:30
Codegass	621525a5cb	Refactor: Clean up `DecimalParquetInputTest` using Assume (#16436 )	2024-05-14 21:13:07 +05:30
Adarsh Sanjeev	18a4722d11	Resolve a bug where datasketches would not downsample sketches sufficiently (#16119 ) * Fix sketch memory issue * Rename function * Add unit test * Revert downsampling change	2024-05-14 10:23:57 +05:30
Akshat Jain	d1100a6f63	Add retries for building S3 client (#16438 ) * Add retries for building S3 client * Use S3Utils instead of RetryUtils * Add test	2024-05-13 16:32:06 -07:00
Laksh Singla	4bfc186153	Support sorting on complex columns in MSQ (#16322 ) MSQ sorts the columns in a highly specialized manner by byte comparisons. As such the values are serialized differently. This works well for the primitive types and primitive arrays, however complex types cannot be serialized specially. This PR adds the support for sorting the complex columns by deserializing the value from the field and comparing it via the type strategy. This is a lot slower than the byte comparisons, however, it's the only way to support sorting on complex columns that can have arbitrary serialization not optimized for MSQ. The primitives and the arrays are still compared via the byte comparison, therefore this doesn't affect the performance of the queries supported before the patch. If there's a sorting key with mixed complex and primitive/primitive array types, for example: longCol1 ASC, longCol2 ASC, complexCol1 DESC, complexCol2 DESC, stringCol1 DESC, longCol3 DESC, longCol4 ASC, the comparison will happen like: longCol1, longCol2 (ASC) - Compared together via byte-comparison, since both are byte comparable and need to be sorted in ascending order complexCol1 (DESC) - Compared via deserialization, cannot be clubbed with any other field complexCol2 (DESC) - Compared via deserialization, cannot be clubbed with any other field, even though the prior field was a complex column with the same order stringCol1, longCol3 (DESC) - Compared together via byte-comparison, since both are byte comparable and need to be sorted in descending order longCol4 (ASC) - Compared via byte-comparison, couldn't be coalesced with the previous fields as the direction was different This way, we only deserialize the field wherever required	2024-05-13 15:07:05 +05:30
Igor Berman	d0f3fdab37	Allow using different lock types for kill task, remove markAsUnused parameter (#16362 ) Changes: - Remove deprecated `markAsUnused` parameter from `KillUnusedSegmentsTask` - Allow `kill` task to use `REPLACE` lock when `useConcurrentLocks` is true - Use `EXCLUSIVE` lock by default	2024-05-10 06:37:36 +05:30
Adarsh Sanjeev	30f3cf5017	Add more info in MSQ export log message (#16363 )	2024-05-09 13:02:19 +05:30
Zoltan Haindrich	1811674753	Enable quidem tests to use different suppliers (#16382 ) * enable quidem uri support for `druidtest:///?ComponentSupplier=Nested` and similar * changes the way `SqlTestFrameworkConfig` is being applied; all options will have their own annotation (its kinda impossible to detect that an annotation has a set value or its the default) * enables hierarchical processing of config annotation (was needed to enable class level supplier annotation) * moves uri processing related string2config stuff into `SqlTestFrameworkConfig`	2024-05-09 09:21:02 +02:00
Akshat Jain	775d654a6c	Load only the required lookups for MSQ tasks (#16358 ) With this PR changes, MSQ tasks (MSQControllerTask and MSQWorkerTask) only load the required lookups during querying and ingestion, based on the value of CTX_LOOKUPS_TO_LOAD key in the query context.	2024-05-09 11:21:54 +05:30
Adarsh Sanjeev	f82cc34e5b	Maintain a connection while exporting results with MSQ (#16381 ) * Maintain a connection while exporting results with MSQ * Fix checkstyle * Fix checkstyle * Move initialization from constructor * Add null check * Address review comments	2024-05-08 11:34:20 +05:30
Adarsh Sanjeev	269e035e76	Add validation for reindex with realtime sources (#16390 ) Add validation for reindex with realtime sources. With the addition of concurrent compaction, it is possible to ingest data while querying from realtime sources with MSQ into the same datasource. This could potentially lead to issues if the interval that is ingested into is replaced by an MSQ job, which has queried only some of the data from the realtime task. This PR adds validation to check that the datasource being ingested into is not being queried from, if the query includes realtime sources.	2024-05-07 10:32:15 +05:30
Alberic Liu	92fb0ff718	upgrade mysql:mysql-connector-java to 8.2.0 (#16024 ) * upgrade mysql:mysql-connector-java to 8.2.0 * fix the check errors * remove unused comment	2024-05-06 21:58:37 +08:00
zachjsh	fb7c84fb5d	Catalog clustering keys fixes (#16351 ) * * add another catalog clustering columns unit test * * dissallow clusterKeys with descending order * * make more clear that clustering is re-written into ingest node whether a catalog table or not * * when partitionedBy is stored in catalog, user shouldnt need to specify it in order to specify clustering * * fix intellij inspection failure	2024-05-03 14:02:56 -04:00
Jan Werner	b16401323b	update dependencies to address CVEs (#16374 ) update dependencies to address new batch of CVEs: - Azure POM from 1.2.19 to 1.2.23 to update transitive dependency nimbus-jose-jwt to address: CVE-2023-52428 - commons-configuration2 from 2.8.0 to 2.10.1 to address: CVE-2024-29131 CVE-2024-29133 - bcpkix-jdk18on from 1.76 to 1.78.1 to address: CVE-2024-30172 CVE-2024-30171 CVE-2024-29857	2024-05-02 21:35:21 -07:00
Gian Merlino	5d1950d451	MSQ controller: Support in-memory shuffles; towards JVM reuse. (#16168 ) * MSQ controller: Support in-memory shuffles; towards JVM reuse. This patch contains two controller changes that make progress towards a lower-latency MSQ. First, support for in-memory shuffles. The main feature of in-memory shuffles, as far as the controller is concerned, is that they are not fully buffered. That means that whenever a producer stage uses in-memory output, its consumer must run concurrently. The controller determines which stages run concurrently, and when they start and stop. "Leapfrogging" allows any chain of sort-based stages to use in-memory shuffles even if we can only run two stages at once. For example, in a linear chain of stages 0 -> 1 -> 2 where all do sort-based shuffles, we can use in-memory shuffling for each one while only running two at once. (When stage 1 is done reading input and about to start writing its output, we can stop 0 and start 2.) 1) New OutputChannelMode enum attached to WorkOrders that tells workers whether stage output should be in memory (MEMORY), or use local or durable storage. 2) New logic in the ControllerQueryKernel to determine which stages can use in-memory shuffling (ControllerUtils#computeStageGroups) and to launch them at the appropriate time (ControllerQueryKernel#createNewKernels). 3) New "doneReadingInput" method on Controller (passed down to the stage kernels) which allows stages to transition to POST_READING even if they are not gathering statistics. This is important because it enables "leapfrogging" for HASH_LOCAL_SORT shuffles, and for GLOBAL_SORT shuffles with 1 partition. 4) Moved result-reading from ControllerContext#writeReports to new QueryListener interface, which ControllerImpl feeds results to row-by-row while the query is still running. Important so we can read query results from the final stage using an in-memory channel. 5) New class ControllerQueryKernelConfig holds configs that control kernel behavior (such as whether to pipeline, maximum number of concurrent stages, etc). Generated by the ControllerContext. Second, a refactor towards running workers in persistent JVMs that are able to cache data across queries. This is helpful because I believe we'll want to reuse JVMs and cached data for latency reasons. 1) Move creation of WorkerManager and TableInputSpecSlicer to the ControllerContext, rather than ControllerImpl. This allows managing workers and work assignment differently when JVMs are reusable. 2) Lift the Controller Jersey resource out from ControllerChatHandler to a reusable resource. 3) Move memory introspection to a MemoryIntrospector interface, and introduce ControllerMemoryParameters that uses it. This makes it easier to run MSQ in process types other than Indexer and Peon. Both of these areas will have follow-ups that make similar changes on the worker side. * Address static checks. * Address static checks. * Fixes. * Report writer tests. * Adjustments. * Fix reports. * Review updates. * Adjust name. * Small changes.	2024-04-30 21:30:27 -07:00
Adarsh Sanjeev	fb63520de9	Add tests for ProcessorManager (#16327 ) * Add tests for ProcessorManager	2024-04-30 09:35:26 +05:30
Adithya Chakilam	f8015eb02a	Add config lagAggregate to LagBasedAutoScalerConfig (#16334 ) Changes: - Add new config `lagAggregate` to `LagBasedAutoScalerConfig` - Add field `aggregateForScaling` to `LagStats` - Use the new field/config to determine which aggregate to use to compute lag - Remove method `Supervisor.computeLagForAutoScaler()`	2024-04-29 22:20:41 +05:30
Adarsh Sanjeev	9a2d7c28bc	Prepare master branch for 31.0.0 release (#16333 )	2024-04-26 09:22:43 +05:30
Arun Ramani	126a0c219a	Surface lock revocation exceptions in task status (#16325 )	2024-04-26 08:39:44 +05:30
AmatyaAvadhanula	31eee7d51e	Check for handoff of upgraded segments (#16162 ) Changes: 1) Check for handoff of upgraded realtime segments. 2) Drop sink only when all associated realtime segments have been abandoned. 3) Delete pending segments upon commit to prevent unnecessary upgrades and partition space exhaustion when a concurrent replace happens. This also prevents potential data duplication. 4) Register pending segment upgrade only on those tasks to which the segment is associated.	2024-04-25 22:03:38 +05:30
Jakub Matyszewski	5061507541	pacj4: add UserProfile attributes to AuthenticationResult context (#16109 ) I'm adding OIDC context to the AuthenticationResult returned by pac4j extension. I wanted to use this context as input in OpenPolicyAgent authorization. Since AuthenticationResult already accepts context as a parameter it felt okay to pass the profile attributes there.	2024-04-25 12:10:12 +05:30
Zoltan Haindrich	9c0bd56f5b	Make QueryComponentSupliers independent from test classes (#16275 )	2024-04-25 02:12:07 -04:00
Gian Merlino	8a5cc976a9	ArrayOfDoublesSketchBuildAggregator: Fix NPE in get() for empty sketch. (#16330 ) Fixes a bug introduced in #16296, where the sketch might not be initialized if get() is called without calling aggregate(). Also adds a test for this case.	2024-04-25 00:59:59 -04:00
Laksh Singla	6bca406d31	Grouping on complex columns aka unifying GroupBy strategies (#16068 ) Users can pass complex types as dimensions to the group by queries. For example: SELECT nested_col1, count(*) FROM foo GROUP BY nested_col1	2024-04-24 23:00:14 +05:30
Rishabh Singh	e30790e013	Introduce Segment Schema Publishing and Polling for Efficient Datasource Schema Building (#15817 ) Issue: #14989 The initial step in optimizing segment metadata was to centralize the construction of datasource schema in the Coordinator (#14985). Thereafter, we addressed the problem of publishing schema for realtime segments (#15475). Subsequently, our goal is to eliminate the requirement for regularly executing queries to obtain segment schema information. This is the final change which involves publishing segment schema for finalized segments from task and periodically polling them in the Coordinator.	2024-04-24 22:22:53 +05:30
Sree Charan Manamala	080476f9ea	WINDOWING - Fix 2 nodes with same digest causing mapping issue (#16301 ) Fixes the mapping issue in window fucntions where 2 nodes get the same reference.	2024-04-24 16:45:02 +05:30
Gian Merlino	274ccbfd85	Reset buffer aggregators when resetting Groupers. (#16296 ) Buffer aggregators can contain some cached objects within them, such as Memory references or HLL Unions. Prior to this patch, various Grouper implementations were not releasing this state when resetting their own internal state, which could lead to excessive memory use. This patch renames AggregatorAdapater#close to "reset", and updates Grouper implementations to call this reset method whenever they reset their internal state. The base method on BufferAggregator and VectorAggregator remains named "close", for compatibility with existing extensions, but the contract is adjusted to say that the aggregator may be reused after the method is called. All existing implementations in core already adhere to this new contract, except for the ArrayOfDoubles build flavors, which are updated in this patch to adhere. Additionally, this patch harmonizes buffer sketch helpers to call their clear method "clear" rather than a mix of "clear" and "close". (Others were already using "clear".)	2024-04-24 05:39:24 -04:00
Parth Agrawal	f1d24c868f	[CVE Fixes] Update version of Nimbus.jose.jwt (#16320 ) * Update version of nimbus.jose.jwt.version * update licenses.yaml	2024-04-23 15:11:54 +05:30
Vishesh Garg	173a206829	Fix incorrect check of InvalidFieldException to InvalidFieldFault while generating MSQ Error Report (#16273 ) InvalidFieldFault is incorrectly checked as InvalidFieldException in mapQueryColumnNameToOutputColumnName. This fixes the bug.	2024-04-22 15:18:49 +05:30
Laksh Singla	b9bbde5c0a	Fix deadlock that can occur while merging group by results (#15420 ) This PR prevents such a deadlock from happening by acquiring the merge buffers in a single place and passing it down to the runner that might need it.	2024-04-22 14:10:44 +05:30
Adithya Chakilam	cff5d1e369	Add method Supervisor.computeLagForAutoScaler (#16314 ) Tries to address the comments made on #16284 after merged. Changes: - Remove method `Supervisor.getLagMetric()` - Add method `Supervisor.computeLagForAutoScaler()` - Remove classes `LagMetric` and `LagMetricTest`	2024-04-20 07:57:50 +05:30
Akshat Jain	79e48c6b45	Fix NPE while loading lookups from empty JDBC source (#16307 )	2024-04-18 21:52:02 +05:30
zachjsh	3f2dd46ede	Catalog table should not need explicit segment granularity set (#16278 ) * * fix * * fix * * address review comments * * fix * * simplify tests * * fix complex type nullability issue * * fix and update test * * address review comments * * address test review comments * * fix checkstyle * * fix checkstyle * * fix failing test	2024-04-17 11:46:24 -04:00
zachjsh	2351f038eb	Kafka with topicPattern can ignore old offsets spuriously (#16190 ) * * fix * * simplify * * simplify tests * * update matches function definition for Kafka Datasource Metadata * * add matchesOld * * override matches and plus for kafka based metadata / sequence numbers * * implement minus * add tests * * fix failing tests * * remove TODO comments * * simplfy and add comments * * remove unused variable in tests * * remove unneeded function * * add serde tests * * more stuff * * address review comments * * remove unneeded code.	2024-04-17 10:00:17 -04:00
Adithya Chakilam	34237bc112	Consider max lag for kinesis while autoscaling (#16284 ) * Consider max lag for kinesis while autoscaling * add test for coverage * test folder	2024-04-17 15:05:05 +05:30
AmatyaAvadhanula	f3d69f30e6	Associate pending segments with the tasks that requested them (#16144 ) Changes: - Add column `task_allocator_id` to `pendingSegments` metadata table. - Add column `upgraded_from_segment_id` to `pendingSegments` metadata table. - Add interface `PendingSegmentAllocatingTask` and implement it by all tasks which can allocate pending segments. - Use `taskAllocatorId` to identify the task (and its sub-tasks or replicas) to which a pending segment has been allocated. - Perform active cleanup of pending segments in `TaskLockbox` once there are no active tasks for the corresponding task allocator id. - When committing APPEND segments, also commit all upgraded pending segments corresponding to that task allocator id. - When committing REPLACE segments, upgrade all overlapping pending segments in the same transaction.	2024-04-17 09:06:31 +05:30
zachjsh	a5428e75ff	INSERT/REPLACE complex target column types are validated against source input expressions (#16223 ) * * fix * * fix * * address review comments * * fix * * simplify tests * * fix complex type nullability issue * * address review comments * * address test review comments * * fix checkstyle	2024-04-16 17:20:35 -04:00
AmatyaAvadhanula	ad6bd62140	Handle task location fetch from overlord during rolling upgrades (#16227 ) Bug: #15724 introduced a bug where a rolling upgrade would cause all task locations returned by the Overlord on an older version to be unknown. Fix: If the new API fails, fall back to single task status API which always returns a valid task location.	2024-04-16 21:01:37 +05:30
YongGang	6964297b53	Remove the unused Controller context reference from Worker (#16285 )	2024-04-16 08:34:24 +05:30
Adarsh Sanjeev	3df00aef9d	Add manifest file for MSQ export (#15953 ) Currently, export creates the files at the provided destination. The addition of the manifest file will provide a list of files created as part of the manifest. This will allow easier consumption of the data exported from Druid, especially for automated data pipelines	2024-04-15 11:37:31 +05:30
Kashif Faraz	81d7b6ebe1	Fix OverlordClient to read reports as a concrete `ReportMap` (#16226 ) Follow up to #16217 Changes: - Update `OverlordClient.getReportAsMap()` to return `TaskReport.ReportMap` - Move the following classes to `org.apache.druid.indexer.report` in the `druid-processing` module - `TaskReport` - `KillTaskReport` - `IngestionStatsAndErrorsTaskReport` - `TaskContextReport` - `TaskReportFileWriter` - `SingleFileTaskReportFileWriter` - `TaskReportSerdeTest` - Remove `MsqOverlordResourceTestClient` as it had only one method which is already present in `OverlordResourceTestClient` itself	2024-04-15 08:00:59 +05:30
YongGang	da9feb4430	Introduce TaskContextReport for reporting task context (#16041 ) Changes: - Add `TaskContextEnricher` interface to improve task management and monitoring - Invoke `enrichContext` in `TaskQueue.add()` whenever a new task is submitted to the Overlord - Add `TaskContextReport` to write out task context information in reports	2024-04-12 08:57:49 +05:30
Gian Merlino	9f358f5f4a	SQL tests: avoid mixing skip and cannot vectorize. (#16251 ) * SQL tests: avoid mixing skip and cannot vectorize. skipVectorize switches off vectorization tests completely, and cannotVectorize turns vectorization tests into negative tests. It doesn't make sense to use them together, so this patch makes it an error to do so, and cleans up cases where both are mentioned. This patch also has the effect of changing various tests from skipVectorize to cannotVectorize, because in the past when both were mentioned, skipVectorize would take priority. * Fix bug with StringAnyAggregatorFactory attempting to vectorize when it cannt. * Fix tests.	2024-04-11 15:06:11 -07:00
Vishesh Garg	3d595cfab1	Add storeCompactionState flag support to msq (#15965 ) Compaction in the native engine by default records the state of compaction for each segment in the lastCompactionState segment field. This PR adds support for doing the same in the MSQ engine, targeted for future cases such as REPLACE and compaction done via MSQ. Note that this PR doesn't implicitly store the compaction state for MSQ replace tasks; it is stored with flag "storeCompactionState": true in the query context.	2024-04-09 16:47:47 +05:30
Vishesh Garg	9a4fb58543	Record column name for exceptions while writing frames in RowBasedFrameWriter (#16130 ) Current Runtime Exceptions generated while writing frames only include the exception itself without including the name of the column they were encountered in. This patch introduces the further information in the error and makes it non-retryable.	2024-04-09 15:39:10 +05:30
Adarsh Sanjeev	e2e0cb905c	Add reasoning for choosing shardSpec to the MSQ report (#16175 ) This PR logs the segment type and reason chosen. It also adds it to the query report, to be displayed in the UI. This PR adds a new section to the reports, segmentReport. This contains the segment type created, if the query is an ingestion, and null otherwise.	2024-04-09 11:32:02 +05:30
Gian Merlino	5e5cf9af99	Reduce upload buffer size in GoogleTaskLogs. (#16236 ) * Reduce upload buffer size in GoogleTaskLogs. Use a 1MB upload buffer, rather than the default of 15 MB in the API client. This is mainly because MMs may upload logs in parallel, and typically have small heaps. The default-sized 15 MB buffers add up quickly and can cause a MM to run out of memory. * Make bufferSize a nullable Integer. Add tests.	2024-04-08 12:54:31 -07:00
Parag Jain	f55c9e58a8	add google as external storage for msq export (#16051 ) Support for exporting msq results to gcs bucket. This is essentially copying the logic of s3 export for gs, originally done by @adarshsanjeev in this PR - #15689	2024-04-05 12:10:10 +05:30

1 2 3 4 5 ...

1503 Commits