druid

Commit Graph

Author	SHA1	Message	Date
Vishesh Garg	e9ea243d97	Enable compaction ITs on MSQ engine (#16778 ) Follow-up to #16291, this commit enables a subset of existing native compaction ITs on the MSQ engine. In the process, the following changes have been introduced in the MSQ compaction flow: - Populate `metricsSpec` in `CompactionState` from `querySpec` in `MSQControllerTask` instead of `dataSchema` - Add check for pre-rolled-up segments having `AggregatorFactory` with different input and output column names - Fix passing missing cluster-by clause in scan queries - Add annotation of `CompactionState` to tombstone segments	2024-07-30 09:34:46 +05:30
Zoltan Haindrich	c7cde31a89	HAVING clauses may not contain window functions (#16742 ) Rejects having clauses if they contain windowed expressions. Also added a check to produce a more descriptive error if an OVER expression reaches the filter translation layer. --------- Co-authored-by: Benedict Jin <asdf2014@apache.org>	2024-07-29 04:11:36 -04:00
dependabot[bot]	f5527dc3e7	Bump io.grpc:grpc-netty-shaded from 1.57.2 to 1.65.1 (#16731 ) Bumps [io.grpc:grpc-netty-shaded](https://github.com/grpc/grpc-java) from 1.57.2 to 1.65.1. - [Release notes](https://github.com/grpc/grpc-java/releases) - [Commits](https://github.com/grpc/grpc-java/compare/v1.57.2...v1.65.1) --- updated-dependencies: - dependency-name: io.grpc:grpc-netty-shaded dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: asdf2014 <asdf2014@apache.org>	2024-07-29 14:51:39 +08:00
dependabot[bot]	cbca0dc969	Bump jclouds.version from 2.5.0 to 2.6.0 (#16796 ) Bumps `jclouds.version` from 2.5.0 to 2.6.0. Updates `org.apache.jclouds:jclouds-core` from 2.5.0 to 2.6.0 Updates `org.apache.jclouds.api:openstack-swift` from 2.5.0 to 2.6.0 Updates `org.apache.jclouds.driver:jclouds-slf4j` from 2.5.0 to 2.6.0 Updates `org.apache.jclouds.api:openstack-keystone` from 2.5.0 to 2.6.0 Updates `org.apache.jclouds.api:rackspace-cloudfiles` from 2.5.0 to 2.6.0 Updates `org.apache.jclouds.provider:rackspace-cloudfiles-us` from 2.5.0 to 2.6.0 Updates `org.apache.jclouds.provider:rackspace-cloudfiles-uk` from 2.5.0 to 2.6.0 --- updated-dependencies: - dependency-name: org.apache.jclouds:jclouds-core dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.jclouds.api:openstack-swift dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.jclouds.driver:jclouds-slf4j dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.jclouds.api:openstack-keystone dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.jclouds.api:rackspace-cloudfiles dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.jclouds.provider:rackspace-cloudfiles-us dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.jclouds.provider:rackspace-cloudfiles-uk dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: asdf2014 <asdf2014@apache.org>	2024-07-29 14:49:26 +08:00
Kashif Faraz	caedeb66cd	Add API to update compaction engine (#16803 ) Changes: - Add API `/druid/coordinator/v1/config/compaction/global` to update cluster level compaction config - Add class `CompactionConfigUpdateRequest` - Fix bug in `CoordinatorCompactionConfig` which caused compaction engine to not be persisted. Use json field name `engine` instead of `compactionEngine` because JSON field names must align with the getter name. - Update MSQ validation error messages - Complete overhaul of `CoordinatorCompactionConfigResourceTest` to remove unnecessary mocking and add more meaningful tests. - Add `TuningConfigBuilder` to easily build tuning configs for tests. - Add `DatasourceCompactionConfigBuilder`	2024-07-27 09:14:51 +05:30
Edgar Melendrez	c07aeedbec	[docs] Updating Rollup tutorial (#16762 ) Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> Co-authored-by: Benedict Jin <asdf2014@apache.org> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>	2024-07-26 15:43:31 -07:00
Edgar Melendrez	028ee23a1e	[Docs] batch 03 - trig functions (#16795 ) * batch 03 - trig functions * Apply suggestions from code review Co-authored-by: Charles Smith <techdocsmith@gmail.com> * applying suggestions and corrections --------- Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2024-07-26 13:11:17 -07:00
Charles Smith	ed48cb82e9	[Docs] Remove avro_ocf support from Kafka & Kinesis streaming sources (Revert changes from #11865 ) (#16807 )	2024-07-26 13:06:22 -07:00
Abhishek Radhakrishnan	3c493dc3ed	CircularList round-robin iterator for the KillUnusedSegments duty (#16719 ) * Round-robin iterator for datasources to kill. Currently there's a fairness problem in the KillUnusedSegments duty where the duty consistently selects the same set of datasources as discovered from the metadata store or dynamic config params. This is a problem especially when there are multiple unused. In a medium to large cluster, while we can increase the task slots to increase the likelihood of broader coverage. This patch adds a simple round-robin iterator to select datasources and has the following properties: 1. Starts with an initial random cursor position in an ordered list of candidates. 2. Consecutive {@code next()} iterations from {@link #getIterator()} are guaranteed to be deterministic unless the set of candidates change when {@link #updateCandidates(Set)} is called. 3. Guarantees that no duplicate candidates are returned in two consecutive {@code next()} iterations. * Renames in RoundRobinIteratorTest. * Address review comments. 1. Clarify javadocs on the ordered list. Also flesh out the details a bit more. 2. Rename the test hooks to make intent clearer and fix typo. 3. Add NotThreadSafe annotation. 4. Remove one potentially noisy log that's in the path of iteration. * Add null check to input candidates. * More commentary. * Address review feedback: downgrade some new info logs to debug; invert condition. Remove redundant comments. Remove redundant variable tracking. * CircularList adjustments. * Updates to CircularList and cleanup RoundRobinInterator. * One more case and add more tests. * Make advanceCursor private for now. * Review comments.	2024-07-26 12:20:49 -07:00
Sree Charan Manamala	9b76d13ff8	Check for Aggregation inside a window clause when syntax used as - WINDOW W AS DEF (#16801 )	2024-07-26 11:18:35 +02:00
Laksh Singla	725d442355	Faster dimension deserialization on the brokers (#16740 ) Speedier dimension deserialization on the brokers.	2024-07-26 14:36:11 +05:30
Zoltan Haindrich	ed9ef1f635	checkstyle	2024-07-26 03:39:55 +00:00
Clint Wylie	71725b41b5	ignore dependencies for github stale action (#16797 )	2024-07-25 10:32:43 -07:00
Gian Merlino	b2a88da200	Attempt to coerce COMPLEX to number in numeric aggregators. (#16564 ) * Coerce COMPLEX to number in numeric aggregators. PR #15371 eliminated ObjectColumnSelector's built-in implementations of numeric methods, which had been marked deprecated. However, some complex types, like SpectatorHistogram, can be successfully coerced to number. The documentation for spectator histograms encourages taking advantage of this by aggregating complex columns with doubleSum and longSum. Currently, this doesn't work properly for IncrementalIndex, where the behavior relied on those deprecated ObjectColumnSelector methods. This patch fixes the behavior by making two changes: 1) SimpleXYZAggregatorFactory (XYZ = type; base class for simple numeric aggregators; all of these extend NullableNumericAggregatorFactory) use getObject for STRING and COMPLEX. Previously, getObject was only used for STRING. 2) NullableNumericAggregatorFactory (base class for simple numeric aggregators) has a new protected method "useGetObject". This allows the base class to correctly check for null (using getObject or isNull). The patch also adds a test for SpectatorHistogram + doubleSum + IncrementalIndex. * Fix tests. * Remove the special ColumnValueSelector. * Add test.	2024-07-25 08:45:29 -07:00
Rohan Garg	b5f117bca2	Check for tombstones in wrapping storage adapters (#16791 )	2024-07-25 06:55:40 -04:00
Clint Wylie	14954c7eb9	serialize legacy as false for scan query for rolling downgrade/upgrade (#16793 ) Fixes rolling downgrades/upgrades after #16659 by hard coding scan query "legacy":false since it is a required property during deserialization.	2024-07-25 14:51:58 +05:30
Gian Merlino	c1875e7c1d	HashJoinEngine: Check for interruptions while walking left cursor. (#16773 ) * HashJoinEngine: Check for interruptions while walking left cursor. Previously, the engine only checked for interruptions between emitting joined rows. In scenarios where large numbers of left rows are skipped completely (such as a highly selective INNER JOIN) this led to the join cursor being insufficiently responsive to cancellation. * Coverage.	2024-07-25 15:10:50 +08:00
Clint Wylie	5da69a01cb	change arrayIngestMode default to array (#16789 ) * change arrayIngestMode default to array * remove arrayIngestMode flag option none * fix space * fix test	2024-07-25 15:09:40 +08:00
Zoltan Haindrich	8bb38a04a5	fix FIXME	2024-07-25 03:33:33 +00:00
Zoltan Haindrich	d705c2759b	cleanup	2024-07-25 03:05:04 +00:00
Zoltan Haindrich	7e3fab5bf9	Make WindowFrames more specific (#16741 ) Changes the WindowFrame internals / representation a bit; introduces dedicated frametypes for rows and groups which corresponds to the implemented processing methods	2024-07-25 04:57:36 +02:00
Edgar Melendrez	ca787885c9	[docs] batch02 of updating functions (#16761 ) * applying changes * ensuring batch is updated * Update docs/querying/sql-functions.md * raise -> raises * addressing review * Apply suggestions from code review Co-authored-by: Charles Smith <techdocsmith@gmail.com> --------- Co-authored-by: Benedict Jin <asdf2014@apache.org> Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2024-07-24 15:28:57 -07:00
John Gozde	6ff0cbfa54	Prune date-fns locales, bump sass TODO (#16792 )	2024-07-24 10:50:53 -07:00
Zoltan Haindrich	a489e19242	move to new file	2024-07-24 17:26:07 +00:00
Zoltan Haindrich	d010b488a7	cleanup	2024-07-24 17:23:15 +00:00
Zoltan Haindrich	7428da51de	cleanup	2024-07-24 17:20:42 +00:00
Zoltan Haindrich	0be1f81d7e	remove druidPrettyprinter	2024-07-24 17:17:15 +00:00
Zoltan Haindrich	7cfbfdc3ee	add DruidPrettyPrinter	2024-07-24 17:14:30 +00:00
Zoltan Haindrich	e60a200d95	format/etc	2024-07-24 15:16:39 +00:00
Akshat Jain	a0437b6c93	MSQ window functions: Fix partition boundary issues for arrays (#16780 ) * MSQ window functions: Fix partition boundary issues for arrays * Address review comments * Cache type strategies * Trigger Build * Convert typeStrategies from list to array	2024-07-24 18:47:04 +05:30
Zoltan Haindrich	a9dcb2da46	Merge branch 'quidem-record' into quidem-msq	2024-07-24 10:59:41 +00:00
Clint Wylie	302739aa58	more aggressive cancellation of broker parallel merge, more chill blocking queue timeouts, and query cancellation participation (#16748 ) * more aggressive cancellation of broker parallel merge, more chill blocking queue timeouts * wire parallel merge into query cancellation system * oops * style * adjust metrics initialization * fix timeout, fix cleanup to not block * javadocs to clarify why cancellation future and gizmo are split * cancelled -> canceled, simplify QueuePusher since it always takes a ResultBatch, non-static terminal marker to make stuff stop complaining about types, specialize tryOffer to be tryOfferTerminal so it wont be misused, add comments to clarify reason for non-blocking offers that might fail	2024-07-24 14:58:34 +08:00
Vadim Ogievetsky	4f0b80bef5	Web console: change to use @fontsource/open-sans (#16786 ) * change to use @fontsource/open-sans * import locale directly * update check license	2024-07-23 21:28:59 -07:00
Sree Charan Manamala	3f4d66c399	Check for Unsupported Aggregation with Distinct when useApproxCountDistinct is enabled (#16770 ) * init * add NativelySupportsDistinct * refactor * javadoc * refactor * fix tests * fix drill tests * comments * Update sql/src/test/java/org/apache/druid/sql/calcite/DrillWindowQueryTest.java --------- Co-authored-by: Benedict Jin <asdf2014@apache.org>	2024-07-24 11:13:22 +08:00
Sébastien	aeb2ee59a2	Added an option to hide the workbench-view toolbar (#16785 )	2024-07-23 15:36:54 -07:00
317brian	704962ec8e	doc: minor fixes to migration guides (#16784 )	2024-07-23 13:09:51 -07:00
George Shiqi Wu	a64e9a1746	Add annotation for pod template (#16772 ) * Add annotation for pod template * pr comments * add test cases * add tests	2024-07-23 07:25:15 -07:00
Laksh Singla	11bb40981e	Deduce type from the aggregators when materializing subquery results (#16703 ) For aggregators like StringFirst/Last, whose intermediate type isn't the same as the final type, using them in GroupBy, TopN or Timeseries subqueries causes a fallback when maxSubqueryBytes is set. This is because we assume that the finalization is not known, due to which the row signature cannot determine whether to use the intermediate or the final type, and it puts it as null. This PR figures out the finalization from the query context and uses the intermediate or the final type appropriately.	2024-07-23 11:52:39 +05:30
Akshat Jain	c45d4fdbca	MSQ window functions: Minor cleanup for empty over clause related flows + Exhaustive tests (#16754 ) * MSQ window functions: Revamp logic to create separate window stages when empty over() clause is present * Fix tests * Revert changes of creating separate stages for empty over clause * Address review comments	2024-07-23 11:37:34 +05:30
Gian Merlino	8b8ca0d7fc	DimFilterUtils: Exit filterShards early when filter is null. (#16774 ) When the filter is null, there is no need to run the converter on all the input objects.	2024-07-22 21:17:11 -07:00
Clint Wylie	b645d09c5d	move long and double nested field serialization to later phase of serialization (#16769 ) changes: * moves value column serializer initialization, call to `writeValue` method to `GlobalDictionaryEncodedFieldColumnWriter.writeTo` instead of during `GlobalDictionaryEncodedFieldColumnWriter.addValue`. This shift means these numeric value columns are now done in the per field section that happens after serializing the nested column raw data, so only a single compression buffer and temp file will be needed at a time instead of the total number of nested literal fields present in the column. This should be especially helpful for complicated nested structures with thousands of columns as even those 64k compression buffers can add up pretty quickly to a sizeable chunk of direct memory.	2024-07-22 21:14:30 -07:00
Edgar Melendrez	934c10b1cd	docs: Adding admonition box to warn about MVD (#16712 ) Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Benedict Jin <asdf2014@apache.org>	2024-07-22 17:32:23 -07:00
Clint Wylie	02b8738c00	remove batchProcessingMode from task config, remove AppenderatorImpl (#16765 ) changes: * removes `druid.indexer.task.batchProcessingMode` in favor of always using `CLOSED_SEGMENT_SINKS` which uses `BatchAppenderator`. This was intended to become the default for native batch, but that was missed so `CLOSED_SEGMENTS` was the default (using `AppenderatorImpl`), however MSQ has been exclusively using `BatchAppenderator` with no problems so it seems safe to just roll it out as the only option for batch ingestion everywhere. * with `batchProcessingMode` gone, there is no use for `AppenderatorImpl` so it has been removed * simplify `Appenderator` construction since there are only separate stream and batch versions now * simplify tests since `batchProcessingMode` is gone	2024-07-22 13:56:44 -07:00
Akshat Jain	6a2348b78b	Preemptive restriction for queries with approximate count distinct on complex columns of unsupported type (#16682 ) This PR aims to check if the complex column being queried aligns with the supported types in the aggregator and aggregator factories, and throws a user-friendly error message if they don't.	2024-07-22 21:34:06 +05:30
Sree Charan Manamala	149d7c5207	Throw exceptions in SqlValidator when DISTINCT used over WINDOW (#16738 ) * Throw exception if DISTINCT used with window functions aggregate call * Improve error message when unsupported aggregations are used with window functions	2024-07-22 16:29:46 +02:00
Sree Charan Manamala	c9aae9d8e6	Enable WINDOW_LEAF_OPERATOR for native engine to support queries without group by (#16753 )	2024-07-22 12:31:55 +02:00
dave-mccowan	7f7e6ca1e5	Fix excessive logging from druid-basic-security (#16767 ) Fixes #16766 Change log level from INFO to DEBUG when processing an empty user map during polling. An empty user map is a normal situation for some authenticators (e.g. LDAP) and polling is frequent (1 minute by default.)	2024-07-22 08:33:00 +05:30
Vadim Ogievetsky	72eeeec024	fix NPE in number formatting (#16760 )	2024-07-19 15:20:44 -07:00
Clint Wylie	a34a06e192	remove Firehose and FirehoseFactory (#16758 ) changes: * removed `Firehose` and `FirehoseFactory` and remaining implementations which were mostly no longer used after #16602 * Moved `IngestSegmentFirehose` which was still used internally by Hadoop ingestion to `DatasourceRecordReader.SegmentReader` * Rename `SQLFirehoseFactoryDatabaseConnector` to `SQLInputSourceDatabaseConnector` and similar renames for sub-classes * Moved anything remaining in a 'firehose' package somewhere else * Clean up docs on firehose stuff	2024-07-19 14:37:21 -07:00
Zoltan Haindrich	d227029b6b	undo unrelated change	2024-07-19 19:16:46 +00:00


14434 Commits
