Commit Graph

14419 Commits

Author SHA1 Message Date
Clint Wylie 14954c7eb9
serialize legacy as false for scan query for rolling downgrade/upgrade (#16793)
Fixes rolling downgrades/upgrades after #16659 by hard coding scan query "legacy":false since it is a required property during deserialization.
2024-07-25 14:51:58 +05:30
Gian Merlino c1875e7c1d
HashJoinEngine: Check for interruptions while walking left cursor. (#16773)
* HashJoinEngine: Check for interruptions while walking left cursor.

Previously, the engine only checked for interruptions between emitting
joined rows. In scenarios where large numbers of left rows are skipped
completely (such as a highly selective INNER JOIN) this led to the
join cursor being insufficiently responsive to cancellation.

* Coverage.
2024-07-25 15:10:50 +08:00
Clint Wylie 5da69a01cb
change arrayIngestMode default to array (#16789)
* change arrayIngestMode default to array

* remove arrayIngestMode flag option none

* fix space

* fix test
2024-07-25 15:09:40 +08:00
Zoltan Haindrich 8bb38a04a5 fix FIMXE 2024-07-25 03:33:33 +00:00
Zoltan Haindrich d705c2759b cleanup 2024-07-25 03:05:04 +00:00
Zoltan Haindrich 7e3fab5bf9
Make WindowFrames more specific (#16741)
Changes the WindowFrame internals / representation a bit; introduces dedicated frametypes for rows and groups which corresponds to the implemented processing methods
2024-07-25 04:57:36 +02:00
Edgar Melendrez ca787885c9
[docs] batch02 of updating functions (#16761)
* applying changes

* ensuring batch is updated

* Update docs/querying/sql-functions.md

* raise -> raises

* addressing review

* Apply suggestions from code review

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

---------

Co-authored-by: Benedict Jin <asdf2014@apache.org>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2024-07-24 15:28:57 -07:00
John Gozde 6ff0cbfa54
Prune date-fns locales, bump sass TODO (#16792) 2024-07-24 10:50:53 -07:00
Zoltan Haindrich a489e19242 move to new file 2024-07-24 17:26:07 +00:00
Zoltan Haindrich d010b488a7 cleanup 2024-07-24 17:23:15 +00:00
Zoltan Haindrich 7428da51de cleanup 2024-07-24 17:20:42 +00:00
Zoltan Haindrich 0be1f81d7e remove druidPrettyprinter 2024-07-24 17:17:15 +00:00
Zoltan Haindrich 7cfbfdc3ee add DruidPrettyPrinter 2024-07-24 17:14:30 +00:00
Zoltan Haindrich e60a200d95 format/etc 2024-07-24 15:16:39 +00:00
Akshat Jain a0437b6c93
MSQ window functions: Fix partition boundary issues for arrays (#16780)
* MSQ window functions: Fix partition boundary issues for arrays

* Address review comments

* Cache type strategies

* Trigger Build

* Convert typeStrategies from list to array
2024-07-24 18:47:04 +05:30
Zoltan Haindrich a9dcb2da46 Merge branch 'quidem-record' into quidem-msq 2024-07-24 10:59:41 +00:00
Clint Wylie 302739aa58
more aggressive cancellation of broker parallel merge, more chill blocking queue timeouts, and query cancellation participation (#16748)
* more aggressive cancellation of broker parallel merge, more chill blocking queue timeouts

* wire parallel merge into query cancellation system

* oops

* style

* adjust metrics initialization

* fix timeout, fix cleanup to not block

* javadocs to clarify why cancellation future and gizmo are split

* cancelled -> canceled, simplify QueuePusher since it always takes a ResultBatch, non-static terminal marker to make stuff stop complaining about types, specialize tryOffer to be tryOfferTerminal so it wont be misused, add comments to clarify reason for non-blocking offers that might fail
2024-07-24 14:58:34 +08:00
Vadim Ogievetsky 4f0b80bef5
Web console: change to use @fontsource/open-sans (#16786)
* change to use @fontsource/open-sans

* import locale directly

* update check license
2024-07-23 21:28:59 -07:00
Sree Charan Manamala 3f4d66c399
Check for Unsupported Aggregation with Distinct when useApproxCountDistinct is enabled (#16770)
* init

* add NativelySupportsDistinct

* refactor

* javadoc

* refactor

* fix tests

* fix drill tests

* comments

* Update sql/src/test/java/org/apache/druid/sql/calcite/DrillWindowQueryTest.java

---------

Co-authored-by: Benedict Jin <asdf2014@apache.org>
2024-07-24 11:13:22 +08:00
Sébastien aeb2ee59a2
Added an option to hide the workbench-view toolbar (#16785) 2024-07-23 15:36:54 -07:00
317brian 704962ec8e
doc: minor fixes to migration guides (#16784) 2024-07-23 13:09:51 -07:00
George Shiqi Wu a64e9a1746
Add annotation for pod template (#16772)
* Add annotation for pod template

* pr comments

* add test cases

* add tests
2024-07-23 07:25:15 -07:00
Laksh Singla 11bb40981e
Deduce type from the aggregators when materializing subquery results (#16703)
For aggregators like StringFirst/Last, whose intermediate type isn't the same as the final type, using them in GroupBy, TopN or Timeseries subqueries causes a fallback when maxSubqueryBytes is set. This is because we assume that the finalization is not known, due to which the row signature cannot determine whether to use the intermediate or the final type, and it puts it as null. This PR figures out the finalization from the query context and uses the intermediate or the final type appropriately.
2024-07-23 11:52:39 +05:30
Akshat Jain c45d4fdbca
MSQ window functions: Minor cleanup for empty over clause related flows + Exhaustive tests (#16754)
* MSQ window functions: Revamp logic to create separate window stages when empty over() clause is present

* Fix tests

* Revert changes of creating separate stages for empty over clause

* Address review comments
2024-07-23 11:37:34 +05:30
Gian Merlino 8b8ca0d7fc
DimFilterUtils: Exit filterShards early when filter is null. (#16774)
When the filter is null, there is no need to run the converter on
all the input objects.
2024-07-22 21:17:11 -07:00
Clint Wylie b645d09c5d
move long and double nested field serialization to later phase of serialization (#16769)
changes:
* moves value column serializer initialization, call to `writeValue` method to `GlobalDictionaryEncodedFieldColumnWriter.writeTo` instead of during `GlobalDictionaryEncodedFieldColumnWriter.addValue`. This shift means these numeric value columns are now done in the per field section that happens after serializing the nested column raw data, so only a single compression buffer and temp file will be needed at a time instead of the total number of nested literal fields present in the column. This should be especially helpful for complicated nested structures with thousands of columns as even those 64k compression buffers can add up pretty quickly to a sizeable chunk of direct memory.
2024-07-22 21:14:30 -07:00
Edgar Melendrez 934c10b1cd
docs: Adding admonition box to warn about MVD (#16712)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Benedict Jin <asdf2014@apache.org>
2024-07-22 17:32:23 -07:00
Clint Wylie 02b8738c00
remove batchProcessingMode from task config, remove AppenderatorImpl (#16765)
changes:
* removes `druid.indexer.task.batchProcessingMode` in favor of always using `CLOSED_SEGMENT_SINKS` which uses `BatchAppenderator`. This was intended to become the default for native batch, but that was missed so `CLOSED_SEGMENTS` was the default (using `AppenderatorImpl`), however MSQ has been exclusively using `BatchAppenderator` with no problems so it seems safe to just roll it out as the only option for batch ingestion everywhere.
* with `batchProcessingMode` gone, there is no use for `AppenderatorImpl` so it has been removed
* implify `Appenderator` construction since there are only separate stream and batch versions now
* simplify tests since `batchProcessingMode` is gone
2024-07-22 13:56:44 -07:00
Akshat Jain 6a2348b78b
Preemptive restriction for queries with approximate count distinct on complex columns of unsupported type (#16682)
This PR aims to check if the complex column being queried aligns with the supported types in the aggregator and aggregator factories, and throws a user-friendly error message if they don't.
2024-07-22 21:34:06 +05:30
Sree Charan Manamala 149d7c5207
Throw exceptions in SqlValidator when DISTINCT used over WINDOW (#16738)
* Throw exception if DISTINCT used with window functions aggregate call
* Improve error message when unsupported aggregations are used with window functions
2024-07-22 16:29:46 +02:00
Sree Charan Manamala c9aae9d8e6
Enable WINDOW_LEAF_OPERATOR for native engine to support queries without group by (#16753) 2024-07-22 12:31:55 +02:00
dave-mccowan 7f7e6ca1e5
Fix excessive logging from druid-basic-security (#16767)
Fixes #16766

Change log level from INFO to DEBUG when processing an empty user map
during polling.  An empty user map is a normal situation for some
authenticators (e.g. LDAP) and polling is frequent (1 minute by
default.)
2024-07-22 08:33:00 +05:30
Vadim Ogievetsky 72eeeec024
fix NPE in number formatting (#16760) 2024-07-19 15:20:44 -07:00
Clint Wylie a34a06e192
remove Firehose and FirehoseFactory (#16758)
changes:
* removed `Firehose` and `FirehoseFactory` and remaining implementations which were mostly no longer used after #16602
* Moved `IngestSegmentFirehose` which was still used internally by Hadoop ingestion to `DatasourceRecordReader.SegmentReader`
* Rename `SQLFirehoseFactoryDatabaseConnector` to `SQLInputSourceDatabaseConnector` and similar renames for sub-classes
* Moved anything remaining in a 'firehose' package somewhere else
* Clean up docs on firehose stuff
2024-07-19 14:37:21 -07:00
Zoltan Haindrich d227029b6b undo unrealted change 2024-07-19 19:16:46 +00:00
Charles Smith 1881880714
[Docs] Adds a migration guide SQL compatible null handling (#16704)
Co-authored-by: Clint Wylie <cjwylie@gmail.com>
Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
2024-07-19 09:25:05 -07:00
Sébastien e286be9427
Exposes hooks to customize the workbench-view (#16749)
* Exposes hooks to customize the workbench-view

* addressed PR feedback

* naming

* auto -> formatInteger(maxNum)
2024-07-19 08:53:34 -07:00
Zoltan Haindrich f7247e1bb7 use entryset 2024-07-19 15:13:17 +00:00
Kashif Faraz b1edf4a5b4
Refactor: Clean up Overlord guice dependencies (#16752)
Description:
Overlord guice dependencies are currently a little difficult to plug into.
This was encountered while working on a separate PR where a class needed to depend
on `TaskMaster.getTaskQueue()`  to query some task related info but this class itself
needs to be a dependency of `TaskMaster` so that it can be registered to the leader lifecycle.

The approach taken here is to simply decouple the leadership lifecycle of the overlord from
manipulation or querying of its state.

Changes:
- No functional change
- Add new class `DruidOverlord` to contain leadership logic after the model of `DruidCoordinator`
- The new class `DruidOverlord` should not be a dependency of any class with the exception of
REST endpoint `*Resource` classes.
- All classes that need to listen to leadership changes must be a dependency of `DruidOverlord`
so that they can be registered to the leadership lifecycle.
- Move all querying logic from `OverlordResource` to `TaskQueryTool` so that other classes can
leverage this logic too (required for follow up PR).
- Update tests
2024-07-19 17:30:23 +05:30
Zoltan Haindrich b38935a450 add test; fb 2024-07-19 11:44:23 +00:00
Zoltan Haindrich 31e97324ce x 2024-07-19 11:36:51 +00:00
Zoltan Haindrich e2a54b5758 update 2024-07-19 08:42:58 +00:00
Zoltan Haindrich 361149b097 m 2024-07-19 07:29:50 +00:00
Clint Wylie 35b876436b
remove native scan query legacy mode (#16659) 2024-07-18 23:33:27 -07:00
Zoltan Haindrich bc7174cb6a cleanup 2024-07-19 04:30:15 +00:00
Zoltan Haindrich 9cf723adae rename 2024-07-19 04:29:05 +00:00
Zoltan Haindrich 7a34b6e092 cleanup 2024-07-19 04:28:02 +00:00
Vadim Ogievetsky 0a274d56a1
Web console: upgrade to Blueprint5 (#16756)
* pre upgrade

* did the upgrade

* update snapshots

* fix BP5 issues

* update licenses

* fix more depication warnings

* use segmented control

* updat snapshots

* convert to fake local time

* preload icons before tests

* update e2e tests

* Update web-console/src/components/segment-timeline/segment-timeline.tsx

Co-authored-by: John Gozde <john@gozde.ca>

* Update web-console/src/components/segment-timeline/segment-timeline.tsx

Co-authored-by: John Gozde <john@gozde.ca>

* update e2e test selector

* direct import date-fns

---------

Co-authored-by: John Gozde <john@gozde.ca>
2024-07-18 20:47:44 -07:00
Edgar Melendrez 721a65046f
docs: add examples for SQL functions (#16745)
* updating first batch of numeric functions

* First batch of functions

* addressing first few comments

* alphabetize list

* draft with suggestions applied

* minor discrepency expr -> <NUMERIC>

* changed raises to calculates

* Update docs/querying/sql-functions.md

* switch to underscore

* changed to exp(1) to match slack message

* adding html text for trademark symbol to .spelling

* fixed discrepancy between description and example

---------

Co-authored-by: Benedict Jin <asdf2014@apache.org>
2024-07-18 17:06:22 -07:00
Zoltan Haindrich d216b934fc Merge remote-tracking branch 'kgyrtkirk/quidem-record' into quidem-record 2024-07-18 11:41:21 +00:00