druid

Commit Graph

Author	SHA1	Message	Date
Zoltan Haindrich	436ba18815	x	2024-08-05 12:59:19 +00:00
Zoltan Haindrich	70e46eadb9	update	2024-08-05 09:07:46 +00:00
Zoltan Haindrich	090f937d58	Merge branch 'quidem-record' into quidem-msq	2024-08-05 09:03:53 +00:00
Zoltan Haindrich	e6add9ea84	Merge remote-tracking branch 'apache/master' into quidem-record	2024-08-05 07:04:02 +00:00
Abhishek Radhakrishnan	31b43753fb	Add `druid.indexing.formats.stringMultiValueHandlingMode` system config (#16822) This patch introduces an optional cluster configuration, druid.indexing.formats.stringMultiValueHandlingMode, allowing operators to override the default mode SORTED_ARRAY for string dimensions. The possible values for the config are SORTED_SET, SORTED_ARRAY, or ARRAY (SORTED_ARRAY is the default). Case-insensitive values are allowed. While this cluster property allows users to manage the multi-value handling mode for string dimension types, it's recommended to migrate to using real array types instead of MVDs. This fixes a long-standing issue: compaction will now honor the configured cluster-wide property instead of always rewriting the mode as the default SORTED_ARRAY, even if the data was originally ingested with ARRAY or SORTED_SET.	2024-08-03 10:23:44 -07:00
Abhishek Radhakrishnan	fe6772a101	Rename test builder `MSQTester.setExpectedSegment` (#16837 ) * Rename setExpectedSegment to setExpectedSegments in MSQTestBase. * Add expected segments for max num segments test cases.	2024-08-02 10:01:55 -07:00
zachjsh	9b731e8f0a	Kinesis Input Format for timestamp, and payload parsing (#16813) * SQL syntax error should target USER persona * * revert change to queryHandler and related tests, based on review comments * * add test * Introduce KinesisRecordEntity to support Kinesis headers in InputFormats * * add kinesisInputFormat and Reader, and tests * * bind KinesisInputFormat class to module * * improve test coverage * * remove references to kafka * * resolve review comments * * remove comment * * fix grammar of comment * * fix comment again * * fix comment again * * more review comments * * add partitionKey * * add check for same timestamp and partitionKey column name * * fix intellij inspection	2024-08-02 08:48:44 -04:00
Akshat Jain	63ba5a4113	Fix issues with fetching task reports in SQL statements endpoint for middlemanager (#16832 )	2024-08-01 23:37:15 -04:00
Akshat Jain	bb4d6cc001	Add task report fields in response of SQL statements endpoint (#16808 ) If the optional query parameter detail is supplied, then the response also includes the following: * A stages object that summarizes information about the different stages being used for query execution, such as stage number, phase, start time, duration, input and output information, processing methods, and partitioning. * A counters object that provides details on the rows, bytes, and files processed at various stages for each worker across different channels, along with sort progress. * A warnings object that provides details about any warnings.	2024-08-01 10:26:04 +05:30
Gian Merlino	01f6cfcbf5	MSQ worker: Support in-memory shuffles. (#16790) * MSQ worker: Support in-memory shuffles. This patch is a follow-up to #16168, adding worker-side support for in-memory shuffles. Changes include: 1) Worker-side code now respects the same context parameter "maxConcurrentStages" that was added to the controller in #16168. The parameter remains undocumented for now, to give us a chance to more fully develop and test this functionality. 2) WorkerImpl is broken up into WorkerImpl, RunWorkOrder, and RunWorkOrderListener to improve readability. 3) WorkerImpl has a new StageOutputHolder + StageOutputReader concept, which abstract over memory-based or file-based stage results. 4) RunWorkOrder is updated to create in-memory stage output channels when instructed to. 5) ControllerResource is updated to add /doneReadingInput/, so the controller can tell when workers that sort, but do not gather statistics, are done reading their inputs. 6) WorkerMemoryParameters is updated to consider maxConcurrentStages. Additionally, WorkerChatHandler is split into WorkerResource, so as to match ControllerChatHandler and ControllerResource. * Updates for static checks, test coverage. * Fixes. * Remove exception. * Changes from review. * Address static check. * Changes from review. * Improvements to docs and method names. * Update comments, add test. * Additional javadocs. * Fix throws. * Fix worker stopping in tests. * Fix stuck test.	2024-07-30 18:41:24 -07:00
Zoltan Haindrich	5f6290eb54	use updated hook class	2024-07-30 16:11:57 +00:00
Zoltan Haindrich	b1ab252b31	Merge branch 'quidem-record' into quidem-msq	2024-07-30 16:03:33 +00:00
Zoltan Haindrich	eb2a047e4b	Merge remote-tracking branch 'apache/master' into quidem-record	2024-07-30 14:24:37 +00:00
Zoltan Haindrich	78b75d3e8e	move more to non-static	2024-07-30 10:42:41 +00:00
Zoltan Haindrich	f6cc540368	use druidhookdispatcherr#1	2024-07-30 10:33:57 +00:00
Zoltan Haindrich	4157a8f105	add/.etc	2024-07-30 10:16:03 +00:00
Vishesh Garg	e9ea243d97	Enable compaction ITs on MSQ engine (#16778 ) Follow-up to #16291, this commit enables a subset of existing native compaction ITs on the MSQ engine. In the process, the following changes have been introduced in the MSQ compaction flow: - Populate `metricsSpec` in `CompactionState` from `querySpec` in `MSQControllerTask` instead of `dataSchema` - Add check for pre-rolled-up segments having `AggregatorFactory` with different input and output column names - Fix passing missing cluster-by clause in scan queries - Add annotation of `CompactionState` to tombstone segments	2024-07-30 09:34:46 +05:30
Kashif Faraz	caedeb66cd	Add API to update compaction engine (#16803 ) Changes: - Add API `/druid/coordinator/v1/config/compaction/global` to update cluster level compaction config - Add class `CompactionConfigUpdateRequest` - Fix bug in `CoordinatorCompactionConfig` which caused compaction engine to not be persisted. Use json field name `engine` instead of `compactionEngine` because JSON field names must align with the getter name. - Update MSQ validation error messages - Complete overhaul of `CoordinatorCompactionConfigResourceTest` to remove unnecessary mocking and add more meaningful tests. - Add `TuningConfigBuilder` to easily build tuning configs for tests. - Add `DatasourceCompactionConfigBuilder`	2024-07-27 09:14:51 +05:30
Clint Wylie	14954c7eb9	serialize legacy as false for scan query for rolling downgrade/upgrade (#16793 ) Fixes rolling downgrades/upgrades after #16659 by hard coding scan query "legacy":false since it is a required property during deserialization.	2024-07-25 14:51:58 +05:30
Clint Wylie	5da69a01cb	change arrayIngestMode default to array (#16789 ) * change arrayIngestMode default to array * remove arrayIngestMode flag option none * fix space * fix test	2024-07-25 15:09:40 +08:00
Zoltan Haindrich	8bb38a04a5	fix FIMXE	2024-07-25 03:33:33 +00:00
Zoltan Haindrich	d705c2759b	cleanup	2024-07-25 03:05:04 +00:00
Zoltan Haindrich	7e3fab5bf9	Make WindowFrames more specific (#16741) Changes the WindowFrame internals / representation a bit; introduces dedicated frame types for rows and groups, which correspond to the implemented processing methods	2024-07-25 04:57:36 +02:00
Zoltan Haindrich	a489e19242	move to new file	2024-07-24 17:26:07 +00:00
Zoltan Haindrich	0be1f81d7e	remove druidPrettyprinter	2024-07-24 17:17:15 +00:00
Zoltan Haindrich	7cfbfdc3ee	add DruidPrettyPrinter	2024-07-24 17:14:30 +00:00
Zoltan Haindrich	e60a200d95	format/etc	2024-07-24 15:16:39 +00:00
Akshat Jain	a0437b6c93	MSQ window functions: Fix partition boundary issues for arrays (#16780 ) * MSQ window functions: Fix partition boundary issues for arrays * Address review comments * Cache type strategies * Trigger Build * Convert typeStrategies from list to array	2024-07-24 18:47:04 +05:30
Akshat Jain	c45d4fdbca	MSQ window functions: Minor cleanup for empty over clause related flows + Exhaustive tests (#16754 ) * MSQ window functions: Revamp logic to create separate window stages when empty over() clause is present * Fix tests * Revert changes of creating separate stages for empty over clause * Address review comments	2024-07-23 11:37:34 +05:30
Clint Wylie	02b8738c00	remove batchProcessingMode from task config, remove AppenderatorImpl (#16765) changes: * removes `druid.indexer.task.batchProcessingMode` in favor of always using `CLOSED_SEGMENT_SINKS` which uses `BatchAppenderator`. This was intended to become the default for native batch, but that was missed so `CLOSED_SEGMENTS` was the default (using `AppenderatorImpl`), however MSQ has been exclusively using `BatchAppenderator` with no problems so it seems safe to just roll it out as the only option for batch ingestion everywhere. * with `batchProcessingMode` gone, there is no use for `AppenderatorImpl` so it has been removed * simplify `Appenderator` construction since there are only separate stream and batch versions now * simplify tests since `batchProcessingMode` is gone	2024-07-22 13:56:44 -07:00
Akshat Jain	6a2348b78b	Preemptive restriction for queries with approximate count distinct on complex columns of unsupported type (#16682 ) This PR aims to check if the complex column being queried aligns with the supported types in the aggregator and aggregator factories, and throws a user-friendly error message if they don't.	2024-07-22 21:34:06 +05:30
Sree Charan Manamala	149d7c5207	Throw exceptions in SqlValidator when DISTINCT used over WINDOW (#16738 ) * Throw exception if DISTINCT used with window functions aggregate call * Improve error message when unsupported aggregations are used with window functions	2024-07-22 16:29:46 +02:00
dave-mccowan	7f7e6ca1e5	Fix excessive logging from druid-basic-security (#16767 ) Fixes #16766 Change log level from INFO to DEBUG when processing an empty user map during polling. An empty user map is a normal situation for some authenticators (e.g. LDAP) and polling is frequent (1 minute by default.)	2024-07-22 08:33:00 +05:30
Clint Wylie	a34a06e192	remove Firehose and FirehoseFactory (#16758 ) changes: * removed `Firehose` and `FirehoseFactory` and remaining implementations which were mostly no longer used after #16602 * Moved `IngestSegmentFirehose` which was still used internally by Hadoop ingestion to `DatasourceRecordReader.SegmentReader` * Rename `SQLFirehoseFactoryDatabaseConnector` to `SQLInputSourceDatabaseConnector` and similar renames for sub-classes * Moved anything remaining in a 'firehose' package somewhere else * Clean up docs on firehose stuff	2024-07-19 14:37:21 -07:00
Zoltan Haindrich	31e97324ce	x	2024-07-19 11:36:51 +00:00
Zoltan Haindrich	361149b097	m	2024-07-19 07:29:50 +00:00
Clint Wylie	35b876436b	remove native scan query legacy mode (#16659 )	2024-07-18 23:33:27 -07:00
Zoltan Haindrich	bc7174cb6a	cleanup	2024-07-19 04:30:15 +00:00
Zoltan Haindrich	9cf723adae	rename	2024-07-19 04:29:05 +00:00
Zoltan Haindrich	7a34b6e092	cleanup	2024-07-19 04:28:02 +00:00
Zoltan Haindrich	eb4fd9f66c	removedup	2024-07-18 07:24:56 +00:00
Zoltan Haindrich	47aeb016df	Merge branch 'quidem-record' into quidem-msq	2024-07-18 05:48:32 +00:00
Zoltan Haindrich	06b68b6c89	Merge remote-tracking branch 'apache/master' into quidem-record	2024-07-18 05:48:13 +00:00
Akshat Jain	b53c26f5c5	Fix issues with partitioning boundaries for MSQ window functions (#16729 ) * Fix issues with partitioning boundaries for MSQ window functions * Address review comments * Address review comments * Add test for coverage check failure * Address review comment * Remove DruidWindowQueryTest and WindowQueryTestBase, move those tests to DrillWindowQueryTest * Update extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/querykit/WindowOperatorQueryKit.java * Address review comments * Add test for equals and hashcode for WindowOperatorQueryFrameProcessorFactory * Address review comment * Fix checkstyle --------- Co-authored-by: Benedict Jin <asdf2014@apache.org>	2024-07-18 10:05:09 +08:00
Zoltan Haindrich	70ff2a3e97	add exploratory msqPlan cmd	2024-07-17 19:48:08 +00:00
Zoltan Haindrich	8b26e490e9	fix types/resultset/etc	2024-07-17 19:30:33 +00:00
Zoltan Haindrich	c59f1adcc8	updates	2024-07-17 16:42:22 +00:00
Zoltan Haindrich	95ca0a9f5d	cleanup	2024-07-17 16:41:09 +00:00
Zoltan Haindrich	b100e982a4	make/etc	2024-07-17 16:40:30 +00:00
Zoltan Haindrich	0811d801fb	make query run	2024-07-17 16:33:10 +00:00

1594 Commits