Commit Graph

14442 Commits

Author SHA1 Message Date
Clint Wylie 02b8738c00
remove batchProcessingMode from task config, remove AppenderatorImpl (#16765)
changes:
* removes `druid.indexer.task.batchProcessingMode` in favor of always using `CLOSED_SEGMENT_SINKS`, which uses `BatchAppenderator`. This was intended to become the default for native batch, but that was missed, so `CLOSED_SEGMENTS` remained the default (using `AppenderatorImpl`). However, MSQ has been exclusively using `BatchAppenderator` with no problems, so it seems safe to roll it out as the only option for batch ingestion everywhere.
* with `batchProcessingMode` gone, there is no use for `AppenderatorImpl` so it has been removed
* simplify `Appenderator` construction since there are only separate stream and batch versions now
* simplify tests since `batchProcessingMode` is gone
2024-07-22 13:56:44 -07:00
Akshat Jain 6a2348b78b
Preemptive restriction for queries with approximate count distinct on complex columns of unsupported type (#16682)
This PR checks whether the complex column being queried is of a type supported by the aggregator and aggregator factories, and throws a user-friendly error message if it is not.
2024-07-22 21:34:06 +05:30
Sree Charan Manamala 149d7c5207
Throw exceptions in SqlValidator when DISTINCT used over WINDOW (#16738)
* Throw exception if DISTINCT used with window functions aggregate call
* Improve error message when unsupported aggregations are used with window functions
2024-07-22 16:29:46 +02:00
Sree Charan Manamala c9aae9d8e6
Enable WINDOW_LEAF_OPERATOR for native engine to support queries without group by (#16753) 2024-07-22 12:31:55 +02:00
dave-mccowan 7f7e6ca1e5
Fix excessive logging from druid-basic-security (#16767)
Fixes #16766

Change log level from INFO to DEBUG when processing an empty user map
during polling. An empty user map is a normal situation for some
authenticators (e.g. LDAP) and polling is frequent (1 minute by
default).
2024-07-22 08:33:00 +05:30
Vadim Ogievetsky 72eeeec024
fix NPE in number formatting (#16760) 2024-07-19 15:20:44 -07:00
Clint Wylie a34a06e192
remove Firehose and FirehoseFactory (#16758)
changes:
* removed `Firehose` and `FirehoseFactory` and remaining implementations which were mostly no longer used after #16602
* Moved `IngestSegmentFirehose` which was still used internally by Hadoop ingestion to `DatasourceRecordReader.SegmentReader`
* Rename `SQLFirehoseFactoryDatabaseConnector` to `SQLInputSourceDatabaseConnector` and similar renames for sub-classes
* Moved anything remaining in a 'firehose' package somewhere else
* Clean up docs on firehose stuff
2024-07-19 14:37:21 -07:00
Zoltan Haindrich d227029b6b undo unrelated change 2024-07-19 19:16:46 +00:00
Charles Smith 1881880714
[Docs] Adds a migration guide for SQL compatible null handling (#16704)
Co-authored-by: Clint Wylie <cjwylie@gmail.com>
Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
2024-07-19 09:25:05 -07:00
Sébastien e286be9427
Exposes hooks to customize the workbench-view (#16749)
* Exposes hooks to customize the workbench-view

* addressed PR feedback

* naming

* auto -> formatInteger(maxNum)
2024-07-19 08:53:34 -07:00
Zoltan Haindrich f7247e1bb7 use entryset 2024-07-19 15:13:17 +00:00
Kashif Faraz b1edf4a5b4
Refactor: Clean up Overlord guice dependencies (#16752)
Description:
Overlord guice dependencies are currently a little difficult to plug into.
This was encountered while working on a separate PR where a class needed to depend
on `TaskMaster.getTaskQueue()` to query some task-related info, but this class itself
needs to be a dependency of `TaskMaster` so that it can be registered to the leader lifecycle.

The approach taken here is to simply decouple the leadership lifecycle of the overlord from
manipulation or querying of its state.

Changes:
- No functional change
- Add new class `DruidOverlord` to contain leadership logic after the model of `DruidCoordinator`
- The new class `DruidOverlord` should not be a dependency of any class with the exception of
REST endpoint `*Resource` classes.
- All classes that need to listen to leadership changes must be a dependency of `DruidOverlord`
so that they can be registered to the leadership lifecycle.
- Move all querying logic from `OverlordResource` to `TaskQueryTool` so that other classes can
leverage this logic too (required for follow up PR).
- Update tests
2024-07-19 17:30:23 +05:30
Zoltan Haindrich b38935a450 add test; fb 2024-07-19 11:44:23 +00:00
Zoltan Haindrich 31e97324ce x 2024-07-19 11:36:51 +00:00
Zoltan Haindrich e2a54b5758 update 2024-07-19 08:42:58 +00:00
Zoltan Haindrich 361149b097 m 2024-07-19 07:29:50 +00:00
Clint Wylie 35b876436b
remove native scan query legacy mode (#16659) 2024-07-18 23:33:27 -07:00
Zoltan Haindrich bc7174cb6a cleanup 2024-07-19 04:30:15 +00:00
Zoltan Haindrich 9cf723adae rename 2024-07-19 04:29:05 +00:00
Zoltan Haindrich 7a34b6e092 cleanup 2024-07-19 04:28:02 +00:00
Vadim Ogievetsky 0a274d56a1
Web console: upgrade to Blueprint5 (#16756)
* pre upgrade

* did the upgrade

* update snapshots

* fix BP5 issues

* update licenses

* fix more deprecation warnings

* use segmented control

* update snapshots

* convert to fake local time

* preload icons before tests

* update e2e tests

* Update web-console/src/components/segment-timeline/segment-timeline.tsx

Co-authored-by: John Gozde <john@gozde.ca>

* Update web-console/src/components/segment-timeline/segment-timeline.tsx

Co-authored-by: John Gozde <john@gozde.ca>

* update e2e test selector

* direct import date-fns

---------

Co-authored-by: John Gozde <john@gozde.ca>
2024-07-18 20:47:44 -07:00
Edgar Melendrez 721a65046f
docs: add examples for SQL functions (#16745)
* updating first batch of numeric functions

* First batch of functions

* addressing first few comments

* alphabetize list

* draft with suggestions applied

* minor discrepancy expr -> <NUMERIC>

* changed raises to calculates

* Update docs/querying/sql-functions.md

* switch to underscore

* changed to exp(1) to match slack message

* adding html text for trademark symbol to .spelling

* fixed discrepancy between description and example

---------

Co-authored-by: Benedict Jin <asdf2014@apache.org>
2024-07-18 17:06:22 -07:00
Zoltan Haindrich d216b934fc Merge remote-tracking branch 'kgyrtkirk/quidem-record' into quidem-record 2024-07-18 11:41:21 +00:00
Zoltan Haindrich 76ff3f26e1 add suppress 2024-07-18 07:25:19 +00:00
Zoltan Haindrich eb4fd9f66c removedup 2024-07-18 07:24:56 +00:00
Benedict Jin e388140b2a
Apply suggestions from code review 2024-07-18 15:06:59 +08:00
Alberic Liu 0eaa810e89
Fix the maven warning during build (#16746) 2024-07-18 14:56:15 +08:00
Zoltan Haindrich 47aeb016df Merge branch 'quidem-record' into quidem-msq 2024-07-18 05:48:32 +00:00
Zoltan Haindrich 06b68b6c89 Merge remote-tracking branch 'apache/master' into quidem-record 2024-07-18 05:48:13 +00:00
Akshat Jain b53c26f5c5
Fix issues with partitioning boundaries for MSQ window functions (#16729)
* Fix issues with partitioning boundaries for MSQ window functions

* Address review comments

* Address review comments

* Add test for coverage check failure

* Address review comment

* Remove DruidWindowQueryTest and WindowQueryTestBase, move those tests to DrillWindowQueryTest

* Update extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/querykit/WindowOperatorQueryKit.java

* Address review comments

* Add test for equals and hashcode for WindowOperatorQueryFrameProcessorFactory

* Address review comment

* Fix checkstyle

---------

Co-authored-by: Benedict Jin <asdf2014@apache.org>
2024-07-18 10:05:09 +08:00
Vadim Ogievetsky 44b3f8e588
Web console: fix a few console bugs (#16735)
* remove __time from min max query shortcut

* fix scrolling in retention rules dialog

* actions menus should have titles

* change term

* correctly name sort/shuffle
2024-07-17 14:51:17 -07:00
Zoltan Haindrich 70ff2a3e97 add exploratory msqPlan cmd 2024-07-17 19:48:08 +00:00
Zoltan Haindrich 8b26e490e9 fix types/resultset/etc 2024-07-17 19:30:33 +00:00
Kashif Faraz 89066b72cf
Fix bug in TaskStorageQueryAdapter (#16750)
Changes:
- Do not hold a reference to `TaskQueue` in `TaskStorageQueryAdapter`
- Use `TaskStorage` instead of `TaskStorageQueryAdapter` in `IndexerMetadataStorageAdapter`
- Rename `TaskStorageQueryAdapter` to `TaskQueryTool`
- Fix newly added task actions `RetrieveUpgradedFromSegmentIds` and `RetrieveUpgradedToSegmentIds`
by removing `isAudited` method.
2024-07-17 23:17:41 +05:30
Zoltan Haindrich c59f1adcc8 updates 2024-07-17 16:42:22 +00:00
Zoltan Haindrich 95ca0a9f5d cleanup 2024-07-17 16:41:09 +00:00
Zoltan Haindrich b100e982a4 make/etc 2024-07-17 16:40:30 +00:00
Zoltan Haindrich 0811d801fb make query run 2024-07-17 16:33:10 +00:00
Zoltan Haindrich 97c32ca3de less crappy way to run it 2024-07-17 16:19:08 +00:00
Zoltan Haindrich 6790f9cf8b move stuff 2024-07-17 16:08:32 +00:00
Zoltan Haindrich 51d465df6d make engine load via injector for msqdrill 2024-07-17 16:04:14 +00:00
Zoltan Haindrich 0eaf4c61b9 removePrint 2024-07-17 15:52:19 +00:00
Zoltan Haindrich f3cf778115 some stuff 2024-07-17 15:48:36 +00:00
Zoltan Haindrich 42b3086512 msq-test-0 2024-07-17 15:38:50 +00:00
Zoltan Haindrich 8ada2ff238 picked akshat's 3e0202811e05dcd07db5ab47791151fab5dd5772 2024-07-17 14:44:27 +00:00
Zoltan Haindrich 82436df585 fix test;disable dep-check for module 2024-07-17 14:34:33 +00:00
Zoltan Haindrich 2a590eb3ae Merge commit 'apache/master^^^' into quidem-record 2024-07-17 13:27:54 +00:00
Sree Charan Manamala 40ef9fc4ec
Bug fix for array type selector causing array aggregation over window frame to fail (#16653) 2024-07-17 14:09:56 +02:00
Kashif Faraz 9f6ce6ddc0
Remove task action audit logging and druid_taskLog metadata table (#16309)
Description:
Task action audit logging was first deprecated and disabled by default in Druid 0.13, #6368.

As called out in the original discussion #5859, there are several drawbacks to persisting task action audit logs.
- The only usage of the task audit logs is to serve the API `/indexer/v1/task/{taskId}/segments`
which returns the list of segments created by a task.
- The use case is really narrow and no prod clusters really use this information.
- There can be better ways of obtaining this information, such as the metric
`segment/added/bytes` which reports both the segment ID and task ID
when a segment is committed by a task. We could also include committed segment IDs in task reports.
- A task persisting several segments would bloat the audit logs table, putting unnecessary strain
on metadata storage.

Changes:
- Remove `TaskAuditLogConfig`
- Remove method `TaskAction.isAudited()`. No task action is audited anymore.
- Remove `SegmentInsertAction` as it is not used anymore. `SegmentTransactionalInsertAction`
is the new incarnation which has been in use for a while.
- Deprecate `MetadataStorageActionHandler.addLog()` and `getLogs()`. These are not used anymore
but need to be retained for backward compatibility of extensions.
- Do not create `druid_taskLog` metadata table anymore.
2024-07-17 17:09:00 +05:30
trompa ebf216829d
#16717 defer provider instantiation in Kubernetes Module (#16726)
* #16717 defer provider instantiation

* add license header

* fix style, ignore new class in jacoco as it is still initialization code

---------

Co-authored-by: Alberto Lago Alvarado <albl@sitecore.net>
2024-07-16 13:05:28 -07:00