druid

Commit Graph

Author	SHA1	Message	Date
Vadim Ogievetsky	4e33ce2b21	fix collapsing in column tree (#16910 )	2024-08-18 15:11:28 -07:00
Akshat Jain	a56b5c018d	Propagate TooManyRowsInAWindowFault error message properly to the user (#16906 ) * Propagate TooManyRowsInAWindowFault error message properly to the user * Add TooManyRowsInAWindowFault to MSQFaultSerdeTest	2024-08-18 10:03:45 +05:30
Benedict Jin	688b4cf164	Fix flaky test in ParallelMergeCombiningSequenceTest (#16907 )	2024-08-18 10:02:50 +05:30
Gian Merlino	806649f8af	SQL: Fix nullable DATE, TIMESTAMP reduction. (#16915 ) Reduction of nullable DATE and TIMESTAMP expressions did not perform a necessary null check, so would in some cases reduce to 1970-01-01 00:00:00 (epoch) rather than NULL.	2024-08-16 22:41:12 -07:00
Vadim Ogievetsky	422183ee70	Web console: expose handoff API (#16586 ) * don't start completions on numbers... it makes numbers hard to enter * add handoff dialog * fix placeholder * Update web-console/src/dialogs/supervisor-handoff-dialog/supervisor-handoff-dialog.tsx Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * Update web-console/src/dialogs/supervisor-handoff-dialog/supervisor-handoff-dialog.tsx Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * Update web-console/src/dialogs/supervisor-handoff-dialog/supervisor-handoff-dialog.tsx Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * feedback fixes * update snapshot --------- Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>	2024-08-16 14:39:16 -07:00
Edgar Melendrez	c968e73171	[Docs] updating transformation during ingestion tutorial (#16845 ) * first major revision of tutorial * more edits * re-ID the file to reflect new content + redirect * renaming file * Apply suggestions from code review Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * addressing suggestions * adding column names * Update docs/tutorials/tutorial-transform.md * Update docs/tutorials/tutorial-transform.md * Addressing suggestions * Apply suggestions from code review Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * adding trademark logo and moving paragraph * decided to shorten final paragraph --------- Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Benedict Jin <asdf2014@apache.org> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>	2024-08-16 11:39:57 -07:00
Clint Wylie	4283b270e3	rework cursor creation (#16533 ) changes: * Added `CursorBuildSpec` which captures all of the 'interesting' stuff that goes into producing a cursor as a replacement for the method arguments of `CursorFactory.canVectorize`, `CursorFactory.makeCursor`, and `CursorFactory.makeVectorCursor` * added new interface `CursorHolder` and new interface `CursorHolderFactory` as a replacement for `CursorFactory`, with method `makeCursorHolder`, which takes a `CursorBuildSpec` as an argument and replaces `CursorFactory.canVectorize`, `CursorFactory.makeCursor`, and `CursorFactory.makeVectorCursor` * `CursorFactory.makeCursors` previously returned a `Sequence<Cursor>` corresponding to the query granularity buckets, with a separate `Cursor` per bucket. `CursorHolder.asCursor` instead returns a single `Cursor` (equivalent to 'ALL' granularity), and a new `CursorGranularizer` has been added for query engines to iterate over the cursor and divide into granularity buckets. This makes the non-vectorized engine behave the same way as the vectorized query engine (with its `VectorCursorGranularizer`), and simplifies a lot of stuff that has to read segments particularly if it does not care about bucketing the results into granularities. * Deprecated `CursorFactory`, `CursorFactory.canVectorize`, `CursorFactory.makeCursors`, and `CursorFactory.makeVectorCursor` * updated all `StorageAdapter` implementations to implement `makeCursorHolder`, transitioned direct `CursorFactory` implementations to instead implement `CursorMakerFactory`. `StorageAdapter` being a `CursorMakerFactory` is intended to be a transitional thing, ideally will not be released in favor of moving `CursorMakerFactory` to be fetched directly from `Segment`, however this PR was already large enough so this will be done in a follow-up. * updated all query engines to use `makeCursorHolder`, granularity based engines to use `CursorGranularizer`.	2024-08-16 11:34:10 -07:00
Vishesh Garg	e37fe93f09	Add support for a custom `DimensionSchema` in `DataSourceMSQDestination` (#16864 ) This PR adds support for passing in a custom DimensionSchema map to MSQ query destination of type DataSourceMSQDestination	2024-08-16 15:24:49 +05:30
Edgar Melendrez	5b94839d9d	[Docs] Batch08: adding examples to string functions (#16871 ) * batch08 completed * reviewing batch08 * apply corrections suggestions by @FrankChen021	2024-08-16 10:15:30 +08:00
Hugh Evans	e91f680d50	Removed deprecated deep storage properties (#16904 )	2024-08-15 11:54:34 -07:00
Hugh Evans	6cfdeb3894	Added a topic listing reserved keywords (#16843 )	2024-08-15 10:25:09 -07:00
Hugh Evans	8c030feefc	Migration guide fixes (#16902 ) * Fix typo in table header * Fixed example NVL result	2024-08-15 09:26:34 -07:00
Sree Charan Manamala	964cf47bb5	fix NPE (#16897 )	2024-08-15 18:12:22 +08:00
Vadim Ogievetsky	8181ef627a	add useConcurrentLocks toggle (#16899 )	2024-08-14 13:44:53 -07:00
Vadim Ogievetsky	ca82ecd352	bump axios to 1.7.4 (#16898 )	2024-08-14 13:42:26 -07:00
Maytas Monsereenusorn	c2ddff399d	Fix Parquet Reader when ingestion need to read columns in filter (#16874 )	2024-08-14 12:31:38 -07:00
Laksh Singla	204533cade	Remove Query ID verification check from MSQ workers (#16886 ) Upgrade/Downgrade between any version till or before Druid 30 where the newer version runs a worker task, while the older version runs a controller task can fail. The patch removes that verification check until it's safe to add it back.	2024-08-14 10:22:19 +05:30
Abhishek Radhakrishnan	acadc2df20	Handle Delta StructType, ArrayType and MapType (#16884 ) Handle the following Delta complex types: a. StructType as JSON b. ArrayType as Java list c. MapType as Java map Generate and add a new Delta table complex-types-table that contains the above complex types for testing. Update the tests to include a parameterized test with complex-types-table, with the expectations defined in ComplexTypesDeltaTable.java.	2024-08-13 07:50:03 -07:00
Adarsh Sanjeev	c6da2f30e8	Add fieldReader for row based frames (#16707 ) Add a new fieldReaders#makeRAC for RowBasedFrameRowsAndColumns.	2024-08-13 14:04:41 +05:30
Rishabh Singh	f67ff92d07	[bugfix] Run cold schema refresh thread periodically (#16873 ) * Fix build * Run coldSchemaExec thread periodically * Bugfix: Run cold schema refresh periodically * Rename metrics for deep storage only segment schema process	2024-08-13 11:44:01 +05:30
Abhishek Radhakrishnan	d7dfbebf97	[Docs]: Fix typo and update broadcast rules section (#16882 ) * Fix typo in waitUntilSegmentsLoad. * Add a note on configuring druid.segmentCache.locations for broadcast rules. * Update docs/operations/rule-configuration.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> --------- Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>	2024-08-12 13:55:33 -07:00
Gian Merlino	efe0044f9e	Use fuzzy matchers for compaction bytes asserts. (#16870 ) * Use fuzzy matchers for compaction bytes asserts. This still enables us to test that the bytes are zero and nonzero when they're supposed to be, without having to get them exactly right. The need to get bytes exactly right makes it difficult to ensure ITs pass when making changes to default segment metadata. * Additional fuzziness.	2024-08-12 10:00:33 +08:00
Rushikesh Bankar	4ef4e75c5d	Fix the issue of missing implementation of IndexerTaskCountStatsProvider for peons (#16875 ) Bug description: Peons fail to start up when `WorkerTaskCountStatsMonitor` is used on MiddleManagers. This is because MiddleManagers pass on their properties to peons and peons are unable to find `IndexerTaskCountStatsProvider` as that is bound only for indexer nodes. Fix: Check if node is an indexer before trying to get instance of `IndexerTaskCountStatsProvider`.	2024-08-10 14:53:16 +05:30
Vadim Ogievetsky	483a03f26c	Web console: Server context defaults (#16868 ) * add server defaults * null is NULL * r to d * add test * typo	2024-08-09 14:46:59 -07:00
Adithya Chakilam	a7dd436a32	Check if supervisor could be idle on startup (#16844 ) Fixes #13936 In cases where a supervisor is idle and the overlord is restarted for some reason, the supervisor would start spinning tasks again. In clusters where there are many low throughput streams, this would spike the task count unnecessarily. This commit compares the latest stream offset with the ones in metadata during the startup of supervisor and sets it to idle state if they match.	2024-08-09 14:42:48 +05:30
Akshat Jain	3d6cedb25f	Fix IndexOutOfBoundsException for MSQ window function queries with empty RAC (#16865 ) * Fix IndexOutOfBoundsException for MSQ window function queries with empty RAC	2024-08-09 11:39:53 +05:30
zachjsh	cb09b572e6	Fix Druid table schema resolution when table defined in catalog and has schema manager (#16869 ) * SQL syntax error should target USER persona * * revert change to queryHandler and related tests, based on review comments * * add test * Properly handle Druid schema blending with catalog definition and segment metadata * * add javadocs	2024-08-08 21:21:03 -04:00
Clint Wylie	6cd8c6be22	fix IndexedStringDruidPredicateIndexes to not needlessly lookup index of values (#16860 )	2024-08-07 23:29:56 -07:00
Akshat Jain	7f67d26dfa	Reduce logging in RetryableS3OutputStream (#16853 ) This PR reduces logging in RetryableS3OutputStream.	2024-08-08 10:42:40 +05:30
Zoltan Haindrich	408702e100	Add ability to run MSQ in Quidem tests (#16798 ) * implements some jdbc facade to enable msq usage * adds an !msqPlan command * adds more guice usage to testsystem startup	2024-08-08 06:37:06 +02:00
Hardik Bajaj	1cf3f4bebe	Fix Concurrent Task Insertion in pendingCompletionTaskGroups (#16834 ) Fix streaming task failures that may arise due to concurrent task insertion in pendingCompletionTaskGroups	2024-08-08 08:37:27 +05:30
aaronm-bi	ceed4a0634	Docs: Update list of ingestion types that support concurrent append and replace (#16852 )	2024-08-08 08:06:22 +05:30
Vadim Ogievetsky	56c03582cf	support kinesis input format (#16850 )	2024-08-07 10:24:28 -07:00
Rishabh Singh	c6a7ab005f	Increase query cancellation timeout in the router (#16656 ) * Fix build * Increase query cancellation timeout in router * Increase cancellation timeout to 5 seconds	2024-08-07 20:29:35 +05:30
Atul Mohan	76ad17fb4c	Add config for http client connect timeout (#16831 ) Adds a configuration clientConnectTimeout to our http client config which controls the connection timeout for our http client requests. It was observed that on busy K8S clusters, the default connect timeout of 500ms is sometimes not enough time to complete syn/acks for a request and in these cases, the requests timeout with the error: exceptionType=java.net.SocketTimeoutException, exceptionMessage=Connect Timeout This behavior was mostly observed on the router while forwarding queries to the broker. Having a slightly higher connect timeout helped resolve these issues.	2024-08-07 19:31:10 +05:30
Sree Charan Manamala	84192b11d7	Benchmark for window functions (#16824 )	2024-08-07 11:07:11 +02:00
Sree Charan Manamala	1f6d2c41d2	Update doc for dynamic parameters supporting array (#16660 ) Update dynamic parameter docs to describe how it can be used to replace an Array	2024-08-07 12:33:37 +05:30
Rishabh Singh	99313e9996	Revised IT to detect backward incompatible change (#16779 ) Added a new revised IT group BackwardCompatibilityMain. The idea is to catch potential backward compatibility issues that may arise during rolling upgrade. This test group runs a docker-compose cluster with Overlord & Coordinator service on the previous druid version. Following env vars are required in the GHA file .github/workflows/unit-and-integration-tests-unified.yml to run this test DRUID_PREVIOUS_VERSION -> Previous druid version to test backward incompatibility. DRUID_PREVIOUS_VERSION_DOWNLOAD_URL -> URL to fetch the tar.	2024-08-07 11:13:35 +05:30
Edgar Melendrez	83cf4dc554	[docs] fixes to sql-scalar.md (#16826 ) Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> Co-authored-by: Benedict Jin <asdf2014@apache.org> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>	2024-08-06 17:12:57 -07:00
zachjsh	c324f09108	Kinesis input format docs (#16840 ) * SQL syntax error should target USER persona * * revert change to queryHandler and related tests, based on review comments * * add test * Docs for Kinesis input format * * remove reference to kafka * * fix spellcheck error * Apply suggestions from code review Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com> --------- Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>	2024-08-06 18:53:10 -04:00
Gian Merlino	eaa09937bc	SuperSorter: direct merging, increased parallelism. (#16775 ) Two performance enhancements: 1) Direct merging of input frames to output channels, without any temporary files, if all input frames fit in memory. 2) When doing multi-level merging (now called "external mode"), improve parallelism by boosting up the number of mergers in the penultimate level. To support direct merging, FrameChannelMerger is enhanced such that the output partition min/max values are used to filter input frames. This is necessary because all direct mergers read all input frames, but only rows corresponding to a single output partition.	2024-08-06 15:00:39 -07:00
Edgar Melendrez	ebea34a814	[Docs] Batch06: starting string functions (#16838 ) * batch06, starting string functions * addind space after Syntax * quick change * correcting spelling * Update docs/querying/sql-functions.md * Update sql-functions.md * applying suggestions * Update docs/querying/sql-functions.md * Update docs/querying/sql-functions.md --------- Co-authored-by: Benedict Jin <asdf2014@apache.org> Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2024-08-06 11:32:26 -07:00
Adarsh Sanjeev	739068469c	General Druid refactors (#16708 ) Some general refactors across Druid. Switch to DruidExceptions Add javadocs Fix a bug in IntArrayColumns Add a class for LongArrayColumns Remove wireTransferable since it would never be called Refactor DictionaryWriter to return the index written as a return value from write.	2024-08-06 11:47:08 -05:00
Adarsh Sanjeev	2b81c18fd7	Refactor SemanticCreator (#16700 ) Refactors the SemanticCreator annotation. Moves the interface to the semantic package. Create a SemanticUtils to hold logic for storing semantic maps. Add FrameMaker interface.	2024-08-06 11:29:38 -05:00
Vishesh Garg	593c3b2150	Do not support non-idempotent aggregator in MSQ compaction (#16846 ) This PR adds checks for verification of DataSourceCompactionConfig and CompactionTask with msq engine to ensure: each aggregator in metricsSpec is idempotent metricsSpec is non-null when rollup is set to true Unit tests and existing compaction ITs have been updated accordingly.	2024-08-06 20:58:08 +05:30
Kashif Faraz	aa49be61ea	Do not create ZK paths if not needed (#16816 ) Background: ZK-based segment loading has been completely disabled in #15705 . ZK `servedSegmentsPath` has been deprecated since Druid 0.7.1, #1182 . This legacy path has been replaced by the `liveSegmentsPath` and is not used in the code anymore. Changes: - Never create ZK loadQueuePath as it is never used. - Never create ZK servedSegmentsPath as it is never used. - Do not create ZK liveSegmentsPath if announcement on ZK is disabled - Fix up tests	2024-08-06 19:29:13 +05:30
Gian Merlino	de40d81b29	SQL: Add ProjectableFilterableTable to SegmentsTable. (#16841 ) * SQL: Add ProjectableFilterableTable to SegmentsTable. This allows us to skip serialization of expensive fields such as shard_spec, dimensions, metrics, and last_compaction_state, if those fields are not actually being queried. * Restructure logic to avoid unnecessary toString() as well.	2024-08-06 06:40:21 -07:00
Akshat Jain	c3aa033e14	MSQ window functions: Fix query correctness issues when using multiple workers (#16804 ) This PR fixes query correctness issues for MSQ window functions when using more than 1 worker (that is, maxNumTasks > 2). Currently, we were keeping the shuffle spec of the previous stage when we didn't have any partition columns for window stage. This PR changes it to override the shuffle spec of the previous stage to MixShuffleSpec (if we have a window function with empty over clause) so that the window stage gets a single partition to work on. A test has been added for a query which returned incorrect results prior to this change when using more than 1 worker.	2024-08-06 16:11:18 +05:30
Sree Charan Manamala	ed6b547481	Handle default bounds correctly in WINDOW clause (#16833 ) When a window is defined as WINDOW W AS <DEF> and using a syntax of (PARTITION BY col1 ORDER BY col2 ROWS x PRECEDING), we would need to default the other bound to CURRENT ROW We already have implemented this earlier, but when defined as WINDOW W AS <DEF>, Calcite takes a different route to validate the window.	2024-08-06 09:58:44 +02:00
Vadim Ogievetsky	aeace28ccb	Web console: Add columnMapping information to the Explain dialog (#16598 ) * Add columnMapping information in the Explain dialog * use arrow char	2024-08-05 13:21:51 -07:00

14452 Commits
