597 Commits

Author SHA1 Message Date
AmatyaAvadhanula
a8febd457c
A Replacing task must read segments created before it acquired its lock (#15085)
* Replacing tasks must read segments created before they acquired their locks
2023-10-19 11:13:07 +05:30
George Shiqi Wu
dc0b163e19
Separate task lifecycle from kubernetes/location lifecycle (#15133)
* Separate k8s and druid task lifecycles

* Remove extra log lines

* Fix unit tests

* fix unit tests

* Fix unit tests

* notify listeners on task completion

* Fix unit test

* unused var

* PR changes

* Fix unit tests

* Fix checkstyle

* PR changes
2023-10-17 08:17:43 -07:00
Clint Wylie
d0f64608eb
sql compatible three-valued logic native filters (#15058)
* sql compatible tri-state native logical filters when druid.expressions.useStrictBooleans=true and druid.generic.useDefaultValueForNull=false, and new druid.generic.useThreeValueLogicForNativeFilters=true
* log.warn if non-default configurations are used to guide operators towards SQL complaint behavior
2023-10-12 00:06:23 -07:00
Zoltan Haindrich
ae88f2c0b6
Fix non-sqlcompat validation in CalciteWindowQueryTest (#15086)
* fixes

* check for latest rewrite place

* Revert "check for latest rewrite place"

This reverts commit 5cf1e2c1ca1f70ef84e8f0a8d96d4563b18230ff.

* some stuff

(cherry picked from commit ab346d4373ea888eb8ef6115e018e7fb0d27407f)

* update test output

* updates to test ouptuts

* some stuff

* move validator

* cleanup

* fix

* change test slightly

* add apidoc cleanup warnings

* cleanup/etc

* instead of telling the story; add a fail with some reason whats the issue

* lead-lag fix

* add test

* remove unnecessary throw

* druidexception-trial

* Revert "druidexception-trial"

This reverts commit 8fa06644bc94e45149238053eadec5fcebeb7b28.

* undo changes to no_grouping; add no_grouping2

* add missing assert on resultcount

* rename method; update

* introduce enum/etc

* make resultmatchmode accessible from TestBuilder#expectedResults

* fix dump results to use log

* fix

* handle null correctly

* disable feature type based things for MSQ

* fix varianssqlaggtest

* use eps in other test

* fix intellij error

* add final

* addrss review

* update test/string/etc

* write concat in 3 lines :D
2023-10-11 12:34:31 -07:00
Laksh Singla
5f86072456
Prepare master for Druid 29 (#15121)
Prepare master for Druid 29
2023-10-11 10:33:45 +05:30
Zoltan Haindrich
b5a87fd89b
Support constant args in window functions (#15071)
Instead of passing the constants around in a new parameter; InputAccessor was introduced to take care of transparently handling the constants - this new class started picking up some copy-paste debris around field accesses; and made them a little bit more readble.
2023-10-08 12:14:25 +05:30
Xavier Léauté
adef2069b1
Make unit tests pass with Java 21 (#15014)
This change updates dependencies as needed and fixes tests to remove code incompatible with Java 21
As a result all unit tests now pass with Java 21.

* update maven-shade-plugin to 3.5.0 and follow-up to #15042
  * explain why we need to override configuration when specifying outputFile
  * remove configuration from dependency management in favor of explicit overrides in each module.
* update to mockito to 5.5.0 for Java 21 support when running with Java 11+
  * continue using latest mockito 4.x (4.11.0) when running with Java 8  
  * remove need to mock private fields
* exclude incorrectly declared mockito dependency from pac4j-oidc
* remove mocking of ByteBuffer, since sealed classes can no longer be mocked in Java 21
* add JVM options workaround for system-rules junit plugin not supporting Java 18+
* exclude older versions of byte-buddy from assertj-core
* fix for Java 19 changes in floating point string representation
* fix missing InitializedNullHandlingTest
* update easymock to 5.2.0 for Java 21 compatibility
* update animal-sniffer-plugin to 1.23
* update nl.jqno.equalsverifier to 3.15.1
* update exec-maven-plugin to 3.1.0
2023-10-03 22:41:21 -07:00
Soumyava
cb050282a0
Intervals are updated properly for Unnest queries (#15020)
Fixes a bug where the unnest queries were not updated with the correct intervals.
2023-10-04 02:52:10 +05:30
George Shiqi Wu
64754b6799
Allow users to pass task payload via deep storage instead of environment variable (#14887)
This change is meant to fix a issue where passing too large of a task payload to the mm-less task runner will cause the peon to fail to startup because the payload is passed (compressed) as a environment variable (TASK_JSON). In linux systems the limit for a environment variable is commonly 128KB, for windows systems less than this. Setting a env variable longer than this results in a bunch of "Argument list too long" errors.
2023-10-03 14:08:59 +05:30
Karan Kumar
2f1bcd6717
Adding `"segment/scan/active" metric for processing thread pool. (#15060) 2023-09-29 12:34:28 -07:00
Zoltan Haindrich
5f3b310115
Build reliablity fixes (#15048)
* disable parallel builds; enable batch mode to get rid of transfer progress

* restore .m2 from setup-java if not found

* some change to sql

* add ws

* fix quote

* fix quote

* undo querytest change

* nullhandling in mvtest

* init more

* skip commitid plugin

* add-back 1.0C to build ; remove redundant skip-s from copy-resources; add comment
2023-09-28 12:27:52 -07:00
George Shiqi Wu
8e22a178cc
Support getTaskLocation for mixed task runner (#15033)
The KubernetesAndWorkerTaskRunner currently doesn't implement getTaskLocation, so tasks run by it will show a unknown TaskLocation in the druid console after a task has completed.

Fix bug in KubernetesAndWorkerTaskRunner that manifests as missing information in the druid Web Console.
2023-09-27 08:57:36 +05:30
YongGang
7301e60a9c
Add metrics for number of segments generated per task in MSQ (#14980)
Add ingest/tombstones/count and ingest/segments/count metrics in MSQ.
2023-09-26 02:46:33 +05:30
Tejaswini Bandlamudi
48b6d2abf9
skip org.owasp:dependency-check on extensions-contrib modules and suppress false-positive gRPC CVEs (#15026) 2023-09-25 12:14:42 +05:30
YongGang
be3f93e3cf
Restore tasks when lifecycle start (#14909)
* K8s tasks restore should be from lifecycle start

* add test

* add more tests

* fix test

* wait tasks restore finish when start

* fix style

* revert previous change and add comment
2023-09-22 12:03:34 -07:00
Kashif Faraz
409bffe7f2
Rename IMSC.announceHistoricalSegments to commitSegments (#15021)
This commit pulls out some changes from #14407 to simplify that PR.

Changes:
- Rename `IndexerMetadataStorageCoordinator.announceHistoricalSegments` to `commitSegments`
- Rename the overloaded method to `commitSegmentsAndMetadata`
- Fix some typos
2023-09-21 16:19:03 +05:30
Laksh Singla
82e809c8d0
fix (#15017) 2023-09-20 15:48:26 -07:00
George Shiqi Wu
d459df8d6e
Fix log syntax (#15004) 2023-09-18 10:40:02 -07:00
George Shiqi Wu
f773d83914
Mixed task runner for migration to mm-less ingestion (#14918)
* save work

* Working

* Fix runner constructor

* Working runner

* extra log lines

* try using lifecycle for everything

* clean up configs

* cleanup /workers call

* Use a single config

* Allow selecting runner

* debug changes

* Work on composite task runner

* Unit tests running

* Add documentation

* Add some javadocs

* Fix spelling

* Use standard libraries

* code review

* fix

* fix

* use taskRunner as string

* checkstyl

---------

Co-authored-by: Suneet Saldanha <suneet@apache.org>
2023-09-11 18:09:46 -07:00
Laksh Singla
6ee0b06e38
Auto configuration for maxSubqueryBytes (#14808)
A new monitor SubqueryCountStatsMonitor which emits the metrics corresponding to the subqueries and their execution is now introduced. Moreover, the user can now also use the auto mode to automatically set the number of bytes available per query for the inlining of its subquery's results.
2023-09-06 05:47:19 +00:00
Abhishek Radhakrishnan
9d6ca61ac1
Verify statsd mock client interaction in unit test (#14939) 2023-09-05 07:34:22 -07:00
Kashif Faraz
289ee1e011
Refactor: Cleanup NoopTask (#14938)
Changes:
- Simplify static `create` methods for `NoopTask`
- Remove `FirehoseFactory`, `IsReadyResult`, `readyTime` from `NoopTask`
as these fields were not being used anywhere
- Update tests
2023-09-05 09:15:41 +05:30
Kashif Faraz
7f26b80e21
Simplify ServiceMetricEvent.Builder (#14933)
Changes:
- Make ServiceMetricEvent.Builder extend ServiceEventBuilder<ServiceMetricEvent>
and thus convert it to a plain builder rather than a builder of builder.
- Add methods setCreatedTime , setMetricAndValue to the builder
2023-09-01 11:30:45 +05:30
John Gerassimou
d201ea0ece
prometheus-emitter: add extraLabels parameter (#14728)
* prometheus-emitter: add extraLabels parameter

* prometheus-emitter: update readme to include the extraLabels parameter

* prometheus-emitter: remove nullable and surface label name issues

* remove import to make linter happy
2023-08-29 12:02:22 -07:00
George Shiqi Wu
95b0de61d1
Move some lifecycle management from doTask -> shutdown for the mm-less task runner (#14895)
* save work

* Add syncronized

* Don't shutdown in run

* Adding unit tests

* Cleanup lifecycle

* Fix tests

* remove newline
2023-08-25 10:50:38 -06:00
George Shiqi Wu
ad32f84586
Fix capacity response in mm-less ingestion (#14888)
Changes:
- Fix capacity response in mm-less ingestion.
- Add field usedClusterCapacity to the GET /totalWorkerCapacity response.
This API should be used to get the total ingestion capacity on the overlord.
- Remove method `isK8sTaskRunner` from interface `TaskRunner`
2023-08-25 08:17:38 +05:30
Tejaswini Bandlamudi
388d5ecf78
Fix reported CVEs (#14882)
Suppress CVEs from dependencies with no available fix or false positives
hadoop-annotations: CVE-2022-25168, CVE-2021-33036
hadoop-client-runtime: CVE-2023-1370, CVE-2023-37475
okio: CVE-2023-3635
Upgrade grpc version to fix CVE-2023-33953
2023-08-24 19:28:55 +05:30
Clint Wylie
36e659a501
remove group-by v1 (#14866)
* remove group-by v1

* docs

* remove unused configs, fix test

* fix test

* adjustments

* why not

* adjust

* review stuff
2023-08-23 12:44:06 -07:00
Tejaswini Bandlamudi
d87056e708
Upgrade guava version to 31.1-jre (#14767)
Currently, Druid is using Guava 16.0.1 version. This upgrade to 31.1-jre fixes the following issues.

CVE-2018-10237 (Unbounded memory allocation in Google Guava 11.0 through 24.x before 24.1.1 allows remote attackers to conduct denial of service attacks against servers that depend on this library and deserialize attacker-provided data because the AtomicDoubleArray class (when serialized with Java serialization) and the CompoundOrdering class (when serialized with GWT serialization) perform eager allocation without appropriate checks on what a client has sent and whether the data size is reasonable). We don't use Java or GWT serializations. Despite being false positive they're causing red security scans on Druid distribution.
Latest version of google-client-api is incompatible with the existing Guava version. This PR unblocks Update google client apis to latest version #14414
2023-08-22 12:09:53 +05:30
Abhishek Radhakrishnan
37db5d9b81
Reset offsets supervisor API (#14772)
* Add supervisor /resetOffsets API.

- Add a new endpoint /druid/indexer/v1/supervisor/<supervisorId>/resetOffsets
which accepts DataSourceMetadata as a body parameter.
- Update logs, unit tests and docs.

* Add a new interface method for backwards compatibility.

* Rename

* Adjust tests and javadocs.

* Use CoreInjectorBuilder instead of deprecated makeInjectorWithModules

* UT fix

* Doc updates.

* remove extraneous debugging logs.

* Remove the boolean setting; only ResetHandle() and resetInternal()

* Relax constraints and add a new ResetOffsetsNotice; cleanup old logic.

* A separate ResetOffsetsNotice and some cleanup.

* Minor cleanup

* Add a check & test to verify that sequence numbers are only of type SeekableStreamEndSequenceNumbers

* Add unit tests for the no op implementations for test coverage

* CodeQL fix

* checkstyle from merge conflict

* Doc changes

* DOCUSAURUS code tabs fix. Thanks, Brian!
2023-08-17 14:13:10 -07:00
Clint Wylie
6b14dde50e
deprecate config-magic in favor of json configuration stuff (#14695)
* json config based processing and broker merge configs to deprecate config-magic
2023-08-16 18:23:57 -07:00
dependabot[bot]
faf79470ae
Bump io.dropwizard.metrics:metrics-graphite from 3.1.2 to 4.2.19 (#14842)
* Bump io.dropwizard.metrics:metrics-graphite from 3.1.2 to 4.2.19

Bumps [io.dropwizard.metrics:metrics-graphite](https://github.com/dropwizard/metrics) from 3.1.2 to 4.2.19.
- [Release notes](https://github.com/dropwizard/metrics/releases)
- [Commits](https://github.com/dropwizard/metrics/compare/v3.1.2...v4.2.19)

---
updated-dependencies:
- dependency-name: io.dropwizard.metrics:metrics-graphite
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* align graphite-emitter dropwizard version with core

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xavier Léauté <xvrl@apache.org>
2023-08-16 13:58:35 -07:00
YongGang
3954685aae
Report more metrics to monitor K8s task runner (#14771)
* Report pod running metrics to monitor K8s task runner

* refine method definition

* fix checkstyle

* implement task metrics

* more comment

* address comments

* update doc for the new metrics reported

* fix checkstyle

* refine method definition

* minor refine
2023-08-16 14:03:53 -04:00
Rishabh Singh
0dc305f9e4
Upgrade hibernate validator version to fix CVE-2019-10219 (#14757) 2023-08-14 11:50:51 +05:30
George Shiqi Wu
c8a11702db
Support broadcast segmetns (#14789) 2023-08-11 11:14:05 -07:00
hqx871
a0234c4e13
Add sampling factor for DeterminePartitionsJob (#13840)
There are two type of DeterminePartitionsJob:
-  When the input data is not assume grouped, there may be duplicate rows.
In this case, two MR jobs are launched. The first one do group job to remove duplicate rows.
And a second one to perform global sorting to find lower and upper bound for target segments.
- When the input data is assume grouped, we only need to launch the global sorting
MR job to find lower and upper bound for segments.

Sampling strategy:
- If the input data is assume grouped, sample by random at the mapper side of the global sort mr job.
- If the input data is not assume grouped, sample at the mapper of the group job. Use hash on time
and all dimensions and mod by sampling factor to sample, don't use random method because there
may be duplicate rows.
2023-08-11 10:42:25 +05:30
zachjsh
82d82dfbd6
Add stats to KillUnusedSegments coordinator duty (#14782)
### Description

Added the following metrics, which are calculated from the `KillUnusedSegments` coordinatorDuty

`"killTask/availableSlot/count"`: calculates the number remaining task slots available for auto kill
`"killTask/maxSlot/count"`: calculates the maximum number of tasks available for auto kill
`"killTask/task/count"`: calculates the number of tasks submitted by auto kill. 

#### Release note
NEW: metrics added for auto kill

`"killTask/availableSlot/count"`: calculates the number remaining task slots available for auto kill
`"killTask/maxSlot/count"`: calculates the maximum number of tasks available for auto kill
`"killTask/task/count"`: calculates the number of tasks submitted by auto kill.
2023-08-10 18:36:53 -04:00
Xavier Léauté
37ed0f4a17
Bump jclouds.version from 1.9.1 to 2.0.3 (#14746)
* Updates `org.apache.jclouds:*` from 1.9.1 to 2.0.3
* Pin jclouds to 2.0.x since 2.1.x requires Guava 18+
* replace easymock with mockito

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-10 06:24:01 -07:00
George Shiqi Wu
c8537dbeaf
Add lifecycle hooks to KubernetesTaskRunner (#14790) 2023-08-09 21:16:44 -07:00
Tejaswini Bandlamudi
a45b25fa1d
Removes support for Hadoop 2 (#14763)
Removing Hadoop 2 support as discussed in https://lists.apache.org/list?dev@druid.apache.org:lte=1M:hadoop
2023-08-09 17:47:52 +05:30
Tejaswini Bandlamudi
550a66d71e
Upgrade jackson-databind to 2.12.7 (#14770)
The current version of jackson-databind is flagged for vulnerabilities CVE-2020-28491 (Although cbor format is not used in druid), CVE-2020-36518 (Seems genuine as deeply nested json in can cause resource exhaustion). Updating the dependency to the latest version 2.12.7 to fix these vulnerabilities.
2023-08-09 12:22:16 +05:30
George Shiqi Wu
14940dc3ed
Add pod name to TaskLocation for easier observability and debugging. (#14758)
* Add pod name to location

* Add log

* fix style

* Update extensions-contrib/kubernetes-overlord-extensions/src/main/java/org/apache/druid/k8s/overlord/KubernetesPeonLifecycle.java

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Fix unit tests

---------

Co-authored-by: Suneet Saldanha <suneet@apache.org>
2023-08-07 12:33:35 -07:00
Suneet Saldanha
62ddeaf16f
Additional dimensions for service/heartbeat (#14743)
* Additional dimensions for service/heartbeat

* docs

* review

* review
2023-08-04 11:01:07 -07:00
YongGang
3335040b22
Report task/pending/time metrics for k8s based ingestion (#14698)
Changes:
* Add and invoke `StateListener` when state changes in `KubernetesPeonLifecycle`
* Report `task/pending/time` metric in `KubernetesTaskRunner` when state moves to RUNNING
2023-08-04 09:07:11 +05:30
imply-cheddar
748874405c
Minimize PostAggregator computations (#14708)
* Minimize PostAggregator computations

Since a change back in 2014, the topN query has been computing
all PostAggregators on all intermediate responses from leaf nodes
to brokers.  This generates significant slow downs for queries
with relatively expensive PostAggregators.  This change rewrites
the query that is pushed down to only have the minimal set of
PostAggregators such that it is impossible for downstream
processing to do too much work.  The final PostAggregators are
applied at the very end.
2023-08-04 00:04:31 +05:30
George Shiqi Wu
174053f4fd
Add readme for kubernetes-overlord-extensions and update docs (#14674)
* Add readme for kubernetes task scheduler

* clean up uneeded stuff

* Update extensions-contrib/kubernetes-overlord-extensions/README.md

Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>

* Move documentation into main page

* indentation

* cleanup spellcheck errors

* Update docs/development/extensions-contrib/k8s-jobs.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Update extensions-contrib/kubernetes-overlord-extensions/README.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Update docs/development/extensions-contrib/k8s-jobs.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* PR comments

* Update docs/development/extensions-contrib/k8s-jobs.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Update docs/development/extensions-contrib/k8s-jobs.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Update docs/development/extensions-contrib/k8s-jobs.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

---------

Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
Co-authored-by: Suneet Saldanha <suneet@apache.org>
2023-08-01 13:29:44 -07:00
YongGang
9b88b78ba4
Fix race condition in KubernetesTaskRunner when task is added to the map (#14643)
Changes:
- Fix race condition in KubernetesTaskRunner introduced by #14435 
- Perform addition and removal from map inside a synchronized block
- Update tests
2023-07-27 12:34:36 +05:30
George Shiqi Wu
f742bb7376
Get task location should be stored on the lifecycle object (#14649)
* Fix issue with long data source names

* Use the regular library

* Save location and tls enabled

* Null out before running

* add another comment
2023-07-24 18:36:19 -07:00
George Shiqi Wu
28914bbab8
Fix issue with long data source names (#14620)
* Fix issue with long data source names

* Use the regular library

* fix overlord utils test
2023-07-24 08:45:10 -07:00
Abhishek Agarwal
efb32810c4
Clean up the core API required for Iceberg extension (#14614)
Changes:
- Replace `AbstractInputSourceBuilder` with `InputSourceFactory`
- Move iceberg specific logic to `IcebergInputSource`
2023-07-21 13:01:33 +05:30