Commit Graph

13347 Commits

Author SHA1 Message Date
Tejaswini Bandlamudi 550a66d71e
Upgrade jackson-databind to 2.12.7 (#14770)
The current version of jackson-databind is flagged for vulnerabilities CVE-2020-28491 (although the CBOR format is not used in Druid) and CVE-2020-36518 (which seems genuine, as deeply nested JSON can cause resource exhaustion). Updating the dependency to the latest version, 2.12.7, fixes these vulnerabilities.
2023-08-09 12:22:16 +05:30
Karan Kumar cd817fc469
Fixing typo in `resultsTruncated` (#14779) 2023-08-08 20:51:44 -07:00
Clint Wylie e57f880020
document new filters and stuff (#14760) 2023-08-08 16:01:06 -07:00
Clint Wylie 667e4dab5e
document expression aggregator (#14497) 2023-08-08 15:49:29 -07:00
317brian 8a4dabc431
docs: remove experimental from schema auto-discovery (#14759) 2023-08-08 12:45:44 -07:00
zachjsh 660e6cfa01
Allow for task limit on kill tasks spawned by auto kill coordinator duty (#14769)
### Description

Previously, the `KillUnusedSegments` coordinator duty, which periodically deletes unused segments, could spawn an unlimited number of kill tasks. This change adds two new coordinator dynamic configs that limit the number of tasks spawned by this duty:

`killTaskSlotRatio`: Ratio of total available task slots, including autoscaling if applicable, that kill tasks may use. This limit applies only to kill tasks spawned automatically by the coordinator's auto kill duty. Default is 1, which allows all available task slots to be used; this is the existing behavior.

`maxKillTaskSlots`: Maximum number of task slots that kill tasks may use. This limit applies only to kill tasks spawned automatically by the coordinator's auto kill duty. Default is INT.MAX, which effectively allows an unbounded number of tasks; this is the existing behavior.

I realize that we could get away with just `killTaskSlotRatio`, but this follows the compaction config, which has similar properties; I thought it was good to have control over the upper limit regardless of the ratio provided.

#### Release note
NEW: Added the `killTaskSlotRatio` and `maxKillTaskSlots` coordinator dynamic config properties, which control the task resources used by kill tasks that the `KillUnusedSegments` coordinator duty (auto kill) spawns.
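
As a rough illustration of how the two limits described above might combine, the sketch below takes the smaller of the ratio-based limit and the absolute cap. The function name, the rounding down, and the argument validation are assumptions made for this example; it is not the coordinator's actual code.

```python
import math

# Stand-in for the "INT.MAX" default mentioned in the description (assumed 32-bit).
INT_MAX = 2**31 - 1


def kill_task_slot_limit(total_task_slots: int,
                         kill_task_slot_ratio: float = 1.0,
                         max_kill_task_slots: int = INT_MAX) -> int:
    """Return how many task slots auto kill may use under both limits."""
    if not 0.0 <= kill_task_slot_ratio <= 1.0:
        raise ValueError("kill_task_slot_ratio must be between 0 and 1")
    if max_kill_task_slots < 0:
        raise ValueError("max_kill_task_slots must be non-negative")
    # Assumption: a fractional slot count is rounded down.
    by_ratio = math.floor(total_task_slots * kill_task_slot_ratio)
    # Whichever limit is stricter wins.
    return min(by_ratio, max_kill_task_slots)
```

With the defaults, every available slot may be used, which matches the existing behavior described above; lowering either value tightens the limit.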
2023-08-08 08:40:55 -04:00
Clint Wylie 2845b6a424
add new filters to unnest filter pushdown (#14777) 2023-08-08 03:29:18 -07:00
Tejaswini Bandlamudi d0403f00fd
upgrade org.mozilla:rhino (#14765) 2023-08-08 12:17:59 +05:30
Suneet Saldanha 2af0ab2425
Metric to report time spent fetching and analyzing segments (#14752)
* Metric to report time spent fetching and analyzing segments

* fix test

* spell check

* fix tests

* checkstyle

* remove unused variable

* Update docs/operations/metrics.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/operations/metrics.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/operations/metrics.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

---------

Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>
2023-08-07 18:32:48 -07:00
Abhishek Radhakrishnan bff8f9e12e
Update kinesis docs (#14768)
Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>
2023-08-07 17:08:34 -07:00
Suneet Saldanha b624a4ec4a
Rolling Supervisor restarts at taskDuration (#14396)
* Rolling supervisor task publishing

* add an option for number of task groups to roll over

* better

* remove docs

* oops

* checkstyle

* wip test

* undo partial test change

* remove incomplete test
2023-08-07 16:24:32 -07:00
George Shiqi Wu 14940dc3ed
Add pod name to TaskLocation for easier observability and debugging. (#14758)
* Add pod name to location

* Add log

* fix style

* Update extensions-contrib/kubernetes-overlord-extensions/src/main/java/org/apache/druid/k8s/overlord/KubernetesPeonLifecycle.java

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Fix unit tests

---------

Co-authored-by: Suneet Saldanha <suneet@apache.org>
2023-08-07 12:33:35 -07:00
Victoria Lim 7d7813372a
Docs: Include EARLIEST_BY and LATEST_BY as supported aggregation functions (#14280) 2023-08-07 09:59:12 -07:00
Adarsh Sanjeev 56ab81f381
Add support for different result formats to MSQ SqlStatementResource (#14571)
* Add support for different result format

* Add tests

* Add tests

* Fix checkstyle

* Remove changes to destination

* Removed some unwanted code

* Address review comments

* Rename parameter

* Fix tests
2023-08-07 20:48:59 +05:30
Kashif Faraz 2d8e0f28f3
Refactor: Cleanup coordinator duties for metadata cleanup (#14631)
Changes
- Add abstract class `MetadataCleanupDuty`
- Make `KillAuditLogs`, `KillCompactionConfig`, etc. extend `MetadataCleanupDuty`
- Improve log and error messages
- Cleanup tests
- No functional change
2023-08-05 13:08:23 +05:30
Suneet Saldanha 62ddeaf16f
Additional dimensions for service/heartbeat (#14743)
* Additional dimensions for service/heartbeat

* docs

* review

* review
2023-08-04 11:01:07 -07:00
Suneet Saldanha 590734b5eb
Update tutorial-kafka.md (#14749) 2023-08-04 10:56:33 -07:00
Clint Wylie e5661a394c
refactor front-coded into static classes instead of using functional interfaces (#14572)
* refactor front-coded into static classes instead of using functional interfaces

* shared v0 static method instead of copy
2023-08-04 10:52:36 -07:00
Laksh Singla d6c73ca6e5
Cleanup the documentation for deep storage 2023-08-04 10:20:01 +00:00
Abhishek Agarwal 6ced208391
Improve the backport missing script (#14723) 2023-08-04 15:21:55 +05:30
Soumyava 0d73480c8f
Latest aggregator factories should accept time as VectorValueSelecto… (#14753)
Fix queries that use the latest aggregator with an expression as the time column
2023-08-04 13:04:25 +05:30
317brian 3b5b6c6a41
docs: query from deep storage (#14609)
* cold tier wip

* wip

* copyedits

* wip

* copyedits

* copyedits

* wip

* wip

* update rules page

* typo

* typo

* update sidebar

* moves durable storage info to its own page in operations

* update screenshots

* add apache license

* Apply suggestions from code review

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* add query from deep storage tutorial stub

* address some of the feedback

* revert screenshot update. handled in separate pr

* load rule update

* wip tutorial

* reformat deep storage endpoints

* rest of tutorial

* typo

* cleanup

* screenshot and sidebar for tutorial

* add license

* typos

* Apply suggestions from code review

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* rest of review comments

* clarify where results are stored

* update api reference for durablestorage context param

* Apply suggestions from code review

Co-authored-by: Karan Kumar <karankumar1100@gmail.com>

* comments

* incorporate #14720

* address rest of comments

* missed one

* Update docs/api-reference/sql-api.md

* Update docs/api-reference/sql-api.md

---------

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: demo-kratia <56242907+demo-kratia@users.noreply.github.com>
Co-authored-by: Karan Kumar <karankumar1100@gmail.com>
2023-08-04 11:10:08 +05:30
Pranav d31c04c4c6
Fix the bug in getIndexInfo for mysql (#14750) 2023-08-03 21:45:01 -07:00
YongGang 3335040b22
Report task/pending/time metrics for k8s based ingestion (#14698)
Changes:
* Add and invoke `StateListener` when state changes in `KubernetesPeonLifecycle`
* Report `task/pending/time` metric in `KubernetesTaskRunner` when state moves to RUNNING
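
The listener pattern in the two changes above can be sketched roughly as follows. All class, method, and state names here are hypothetical stand-ins rather than the extension's real API, and the metric is assumed to be the time between submission and the move to RUNNING, in milliseconds.

```python
import time
from enum import Enum


class State(Enum):
    # Assumed lifecycle states, for illustration only.
    NOT_STARTED = "NOT_STARTED"
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    STOPPED = "STOPPED"


class PeonLifecycle:
    """Hypothetical lifecycle object that notifies a listener on state change."""

    def __init__(self, task_id, listener):
        self.task_id = task_id
        self.state = State.NOT_STARTED
        self._listener = listener

    def set_state(self, new_state):
        self.state = new_state
        self._listener(new_state, self.task_id)


class TaskRunner:
    """Hypothetical runner that reports how long each task stayed pending."""

    def __init__(self, emit, clock=time.monotonic):
        self._emit = emit          # callable(metric_name, task_id, value)
        self._clock = clock        # injectable so the example is testable
        self._submitted_at = {}

    def submit(self, task_id):
        self._submitted_at[task_id] = self._clock()
        return PeonLifecycle(task_id, self.on_state_change)

    def on_state_change(self, state, task_id):
        # Emit once, when the task first moves to RUNNING.
        if state is State.RUNNING and task_id in self._submitted_at:
            started = self._submitted_at.pop(task_id)
            pending_ms = (self._clock() - started) * 1000
            self._emit("task/pending/time", task_id, pending_ms)
```

Passing the listener into the lifecycle keeps metric reporting in the runner, so the lifecycle object needs no knowledge of the emitter.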
2023-08-04 09:07:11 +05:30
zachjsh ba957a9b97
Add ability to limit the number of segments killed in kill task (#14662)
### Description

Previously, the `maxSegments` configured for auto kill could be ignored if an interval of data for a given datasource had more than this number of unused segments. The kill task spawned to delete the unused segments in that interval would then delete more than the configured `maxSegments`. Now each kill task spawned by the auto kill coordinator duty kills at most `limit` segments. This is done by adding a new config property to the `KillUnusedSegmentsTask` that lets users specify this limit.
2023-08-03 22:17:04 -04:00
imply-cheddar 748874405c
Minimize PostAggregator computations (#14708)
* Minimize PostAggregator computations

Since a change back in 2014, the topN query has been computing
all PostAggregators on all intermediate responses from leaf nodes
to brokers.  This generates significant slowdowns for queries
with relatively expensive PostAggregators.  This change rewrites
the query that is pushed down to only have the minimal set of
PostAggregators such that it is impossible for downstream
processing to do too much work.  The final PostAggregators are
applied at the very end.
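
One way to picture a "minimal set" is sketched below. It assumes the minimal set means the post-aggregators that a single required metric (for example, the one used for ordering) depends on, directly or transitively. The dict-based representation and the function name are invented for this example and are not Druid's implementation.

```python
from typing import List


def prune_post_aggregators(post_aggs: List[dict], needed_metric: str) -> List[dict]:
    """Keep only the post-aggregators that `needed_metric` depends on.

    Each post-aggregator is a dict with a "name" and a list of "fields" it
    reads. Assumption: a post-aggregator only reads aggregators or
    post-aggregators defined before it in the list.
    """
    needed = {needed_metric}
    kept = []
    # Walk backwards so that the dependencies of a kept post-aggregator
    # are visited after the post-aggregator itself.
    for post_agg in reversed(post_aggs):
        if post_agg["name"] in needed:
            kept.append(post_agg)
            needed.update(post_agg["fields"])
    kept.reverse()  # restore the original evaluation order
    return kept
```

Anything not returned here would be left out of the pushed-down query and computed only once, at the very end.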
2023-08-04 00:04:31 +05:30
YongGang 20c48b6a3d
Retry S3 task log fetch in case of transient S3 exceptions (#14714) 2023-08-03 19:46:10 +05:30
Kashif Faraz b27d281b11
Remove unused param in MetadataResource (#14747) 2023-08-03 19:18:01 +05:30
Suneet Saldanha 00f1f8cef5
Enable ServiceStatusMonitor in the examples (#14744) 2023-08-03 06:07:01 -07:00
AmatyaAvadhanula 5a52f7a457
Fix IT failure due to query interval (#14738) 2023-08-02 11:29:35 -07:00
Adarsh Sanjeev 6837a7be19
Add logging for downsampling sketches in MSQ (#14580)
* Add more logs for downsampling sketches

* Fix builds

* Lower log level

* Add new log message
2023-08-02 20:07:54 +05:30
Abhishek Agarwal 955734ba8d
Fix exempt labels in stale.yml (#14733) 2023-08-02 17:12:18 +05:30
Clint Wylie 94fb41a4df
fix nested field virtual column array column element vector object selector (#14729)
Fixes a case I missed in #14688 when the return type is STRING but it's coming from a top-level array-typed column instead of a nested array column while making a vector object selector.

Also, while here, I noticed that the internal JSON_VALUE functions for array types were named inconsistently with the non-array functions, so I renamed them. These are not documented, so the rename should not be disruptive; they are only used internally for rewrites while planning to make the correct virtual column.

JSON_VALUE_RETURNING_ARRAY_VARCHAR -> JSON_VALUE_ARRAY_VARCHAR
JSON_VALUE_RETURNING_ARRAY_BIGINT -> JSON_VALUE_ARRAY_BIGINT
JSON_VALUE_RETURNING_ARRAY_DOUBLE -> JSON_VALUE_ARRAY_DOUBLE
The internal non-array functions are JSON_VALUE_VARCHAR, JSON_VALUE_BIGINT, and JSON_VALUE_DOUBLE.
2023-08-02 17:08:24 +05:30
Xavier Léauté c1c2435aee
upgrade core Apache Kafka dependencies to 3.5.1 (#14721)
Release notes: https://downloads.apache.org/kafka/3.5.1/RELEASE_NOTES.html
Announcement: https://lists.apache.org/thread/p7jyv3ys7b6jowcb6lys7821qcbcpb07

Release notes: https://downloads.apache.org/kafka/3.5.0/RELEASE_NOTES.html
Announcement: https://lists.apache.org/thread/s6x3zvkrv32v5y8yb6hh31h57spdbylk
2023-08-02 01:08:40 -07:00
Gian Merlino 72c151a192
MSQ WorkerImpl: Ignore ServiceClosedException on postCounters. (#14707)
* MSQ WorkerImpl: Ignore ServiceClosedException on postCounters.

A race can happen where postCounters is in flight while the controller
goes offline. When this happens, we should ignore the ServiceClosedException
and continue without posting counters.

* Fix style and logic.
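
The handling described above amounts to catching one specific exception around the counter update and carrying on. The sketch below is a minimal illustration: `ServiceClosedException` is redefined locally as a stand-in, and `post_counters_safely` and its arguments are hypothetical names.

```python
import logging

log = logging.getLogger("worker")


class ServiceClosedException(Exception):
    """Local stand-in: raised when the remote service has gone away."""


def post_counters_safely(post_counters, snapshot) -> bool:
    """Post counters; return False instead of failing if the controller is gone."""
    try:
        post_counters(snapshot)
        return True
    except ServiceClosedException:
        # The controller went offline while the call was in flight.
        # Losing one counter update is harmless, so continue.
        log.debug("Controller is offline; skipping counter update.")
        return False
```

Only the one exception type is swallowed; any other failure still propagates to the caller.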
2023-08-02 07:00:10 +05:30
Vadim Ogievetsky 4a31ae26f4
Web console: Page downloader, and fix JSON error resetting (#14712)
* fix error reset

* add page dialog logic

* add to detail archive

* update tests

* fix plurals

* use jsonl ext

* fix regex issue
2023-08-01 14:25:41 -07:00
George Shiqi Wu 174053f4fd
Add readme for kubernetes-overlord-extensions and update docs (#14674)
* Add readme for kubernetes task scheduler

* clean up unneeded stuff

* Update extensions-contrib/kubernetes-overlord-extensions/README.md

Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>

* Move documentation into main page

* indentation

* cleanup spellcheck errors

* Update docs/development/extensions-contrib/k8s-jobs.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Update extensions-contrib/kubernetes-overlord-extensions/README.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Update docs/development/extensions-contrib/k8s-jobs.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* PR comments

* Update docs/development/extensions-contrib/k8s-jobs.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Update docs/development/extensions-contrib/k8s-jobs.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Update docs/development/extensions-contrib/k8s-jobs.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

---------

Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
Co-authored-by: Suneet Saldanha <suneet@apache.org>
2023-08-01 13:29:44 -07:00
Kashif Faraz ee4e0c93b4
Improve alert message for segment assignments (#14696)
Changes:
- Add interface `SegmentDeleteHandler` for marking segments as unused
- In `StrategicSegmentAssigner`, collect all segments on which a drop rule applies in a list
- Process the list above as a batch delete rather than individual deletes
- Improve alert messages when an invalid tier is specified in a load rule
- Improve alert message when no rule applies on a segment
2023-08-01 23:33:05 +05:30
Vadim Ogievetsky 153948198c
Web console: fix grouped filtering and add complex menu (#14668)
* fix filtering when grouped

* add complex menu

* complex aggs

* use ResizeObserverEntry

* add quantile and test

* fix style

* update snapshots
2023-08-01 10:41:44 -07:00
Clint Wylie 2e456d25ae
fix issue with ExprEval.bestEffortOf and mixed type arrays containing ARRAY<COMPLEX<json>> and other complicated casts (#14710) 2023-08-01 09:25:14 -07:00
Pranav 8a10b46dd8
Adding the PropertyNamingStrategies from jackson for fixing hadoop ingestion (#14671) 2023-08-01 20:02:43 +05:30
Kashif Faraz 10328c0743
Rename metadatacache and serverview metrics (#14716) 2023-08-01 14:18:20 +05:30
Kashif Faraz d04521d58f
Improve description field when emitting metric for broadcast failure (#14703)
Changes:
- Emit descriptions such as `Load queue is full`, `No disk space` etc. instead of `Unknown error`
- Rewrite `BroadcastDistributionRuleTest`
2023-08-01 10:13:55 +05:30
Abhishek Agarwal 5c96b60162
Increase heap size for router (#14699) 2023-08-01 08:58:48 +05:30
Gian Merlino 5387f1bac0
Remove chatAsync parameter, so chat is always async. (#14692)
* Remove chatAsync parameter, so chat is always async.

chatAsync has been made default in Druid 26. I have seen good
battle-testing of it in production, and am comfortable removing the
older sync client.

This was the last remaining usage of IndexTaskClient, so this patch
deletes all that stuff too.

* Remove unthrown exception.

* Remove unthrown exception.

* No more TimeoutException.
2023-07-31 19:42:51 -07:00
Jason Koch 44d5c1a15f
split KillUnusedSegmentsTask to processing in smaller chunks (#14642)
split KillUnusedSegmentsTask to smaller batches

Processing in smaller chunks allows the task execution to yield the TaskLockbox lock,
which allows the overlord to continue being responsive to other tasks and users while
this particular kill task is executing.

* introduce KillUnusedSegmentsTask batchSize parameter to control size of batching

* provide an explanation for kill task batchSize parameter

* add logging details for kill batch progress
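
The batching idea can be sketched as below. A plain `threading.Lock` stands in for the TaskLockbox lock, and `delete_batch` is a hypothetical callback; the real task's locking and deletion logic are more involved than this.

```python
import threading
from typing import Callable, Iterator, List, Sequence


def batches(items: Sequence, batch_size: int) -> Iterator[List]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def kill_in_batches(segments: Sequence,
                    batch_size: int,
                    lock: threading.Lock,
                    delete_batch: Callable[[List], None]) -> int:
    """Delete segments batch by batch, releasing the lock between batches."""
    killed = 0
    for batch in batches(segments, batch_size):
        # Hold the lock only for one batch, so that other callers
        # can acquire it between batches.
        with lock:
            delete_batch(batch)
        killed += len(batch)
    return killed
```

A smaller `batch_size` gives other work more chances to take the lock, at the cost of more lock acquisitions overall.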
2023-07-31 12:56:27 -07:00
Adarsh Sanjeev 339b8d959f
Change the default format from OBJECT to OBJECTLINES (#14700) 2023-07-31 18:39:58 +00:00
Adarsh Sanjeev 21d023b62b
Handle taskIds which are not found in the overlord correctly (#14706)
This PR fixes a bug in the SqlStatementAPI where the response status was 500 if the task was not found on the overlord.
This changes the response to invalid input, since the queryID passed is not valid.
2023-07-31 21:38:14 +05:30
Kashif Faraz 844a9c7ffb
Cancel loads of unused segments (#14644) 2023-07-31 18:01:50 +05:30
Kashif Faraz e9b4f1e95c
Fix reported replication factor of segment with zero required replicas (#14701) 2023-07-31 14:51:01 +05:30