12249 Commits

Author SHA1 Message Date
Frank Chen d30cf8c308
Dependency cleanup (#13194)
* Clean up dependency in extensions

* Bump protobuf/aws.sdk

* Bump aws-sdk to 1.12.317

* Fix CI

* Fix CI

* Update license

* Update license
2022-10-10 20:34:38 +08:00
Gian Merlino 5b519f3689
Fix null message handling in AllowedRegexErrorResponseTransformStrategy. (#13177)
Error messages can be null. If the incoming error message is null, then
return null.
2022-10-09 07:42:41 -07:00
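The null handling described in #13177 above can be illustrated with a minimal sketch. The class name `NullSafeErrorTransform` and its constructor are hypothetical stand-ins, not Druid's actual `AllowedRegexErrorResponseTransformStrategy`, and returning null for messages that match no pattern is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for an allow-list based error message transform.
final class NullSafeErrorTransform {
    private final List<Pattern> allowed = new ArrayList<>();

    NullSafeErrorTransform(List<String> allowedRegexes) {
        for (String regex : allowedRegexes) {
            allowed.add(Pattern.compile(regex));
        }
    }

    // Returns the message unchanged if it matches an allowed pattern.
    String transform(String message) {
        // Error messages can be null: pass null through instead of matching on it.
        if (message == null) {
            return null;
        }
        for (Pattern pattern : allowed) {
            if (pattern.matcher(message).matches()) {
                return message;
            }
        }
        // Assumption for this sketch: messages that match nothing are suppressed.
        return null;
    }
}
```

Without the early return, `pattern.matcher(null)` would throw a `NullPointerException`, which is the kind of failure the guard avoids.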
Vadim Ogievetsky 573e12c75f
Web console: making the cell filter menu more functional, removing the old query view, and updating d3 (#13169)
* remove old query view

* update tests

* add filter

* fix test

* bump d3 things to latest versions

* went too far into the future with d3

* make config dialogs load

* goodies

* update snapshots

* only compute duration when running or pending
2022-10-07 12:44:40 -07:00
Charles Smith 25c1d55dd6
Clarify behavior when decommissioningMaxPercentOfMaxSegmentsToMove = 0 (#13157)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2022-10-07 09:01:32 -07:00
Sam Rash f89496ccac
Revert Accidental Change to Druid.xml (#13190)
See commit 54a2eb for accidental commit
2022-10-06 14:42:35 -07:00
317brian 0edceead80
msq: update known issue about GROUPING SETS and COUNT DISTINCT (#13185)
* msq: update known issue about GROUPING SETS and COUNT DISTINCT

* address feedback from Gian
2022-10-05 19:47:03 -07:00
AmatyaAvadhanula 41e51b21c3
Make http options the default configurations (#13092)
Druid currently uses Zookeeper dependent options as the default.
This commit updates the following to use HTTP as the default instead.
- task runner. `druid.indexer.runner.type=remote -> httpRemote`
- load queue peon. `druid.coordinator.loadqueuepeon.type=curator -> http`
- server inventory view. `druid.serverview.type=curator -> http`
2022-10-05 05:35:17 +05:30
Xavier Léauté eff7edb603
update core Apache Kafka dependencies to 3.3.1 (#13176)
Announcement:
- https://blogs.apache.org/kafka/entry/what-rsquo-s-new-in

Release notes:
- https://archive.apache.org/dist/kafka/3.3.0/RELEASE_NOTES.html
- https://downloads.apache.org/kafka/3.3.1/RELEASE_NOTES.html
2022-10-04 12:52:16 -07:00
Abhishek Agarwal e3f9a0ed44
Lazy initialization of segment killers, movers and archivers (#13170)
* Lazy initialization of segment killers, movers and archivers

* Add test for lazy killer

* Add more tests

* Intellij fixes
2022-10-04 15:55:46 +05:30
Kashif Faraz b07f01d645
Set useMaxMemoryEstimates=false by default (#13178)
A value of `false` denotes that the new flow with improved estimates will be used.
2022-10-04 15:04:23 +05:30
Abhishek Agarwal 7fa53ff4b3
Exclude calcite from dependabot (#13160)
* Exclude calcite from dependabot

* Update .github/dependabot.yml

Co-authored-by: Liam Newman <96086065+liam-verta@users.noreply.github.com>

* Update dependabot.yml

Co-authored-by: Liam Newman <96086065+liam-verta@users.noreply.github.com>
2022-10-04 10:21:11 +08:00
Vadim Ogievetsky 4bfae1deee
Docs: fix doc search (#13164)
* fix doc search

* upgrade website node to 16

* change website travis script

* move spellcheck notification

* explicit path to npm bin

* cd to the correct place
2022-10-03 16:48:13 -07:00
Adarsh Sanjeev 92d2633ae6
Update ClusterByStatisticsCollectorImpl to use bytes instead of keys (#12998)
* Update clusterByStatistics to use bytes instead of keys

* Address review comments

* Resolve checkstyle

* Increase test coverage

* Update test

* Update thresholds

* Update retained keys function

* Update docs

* Fix spelling
2022-10-03 12:08:23 +05:30
Vadim Ogievetsky ebfe1c0c90
Web console: fix DQT import (#13159)
* fix dqt import

* update licenses

* update tests
2022-09-30 09:31:06 -07:00
Kashif Faraz ce5f55e5ce
Fix over-replication caused by balancing when inventory is not updated yet (#13114)
* Add coordinator test framework

* Remove outdated changes

* Add more tests

* Add option to auto-sync inventory

* Minor cleanup

* Fix inspections

* Add README for simulations, add SegmentLoadingNegativeTest

* Fix over-replication from balancing

* Fix README

* Cleanup unnecessary fields from DruidCoordinator

* Add a test

* Fix DruidCoordinatorTest

* Remove unused import

* Fix CuratorDruidCoordinatorTest

* Remove test log4j2.xml
2022-09-29 12:06:23 +05:30
Abhishek Agarwal 61b34950e7
Fix assertion error in sql planning for latest aggregators (#13151)
* Fix sql planning bug for latest aggregators

* change test name

* Fix error messages

* fix error message again
2022-09-28 21:01:32 +05:30
AmatyaAvadhanula acafd0d1e0
Upgrade kafka version to 3.2.3 to fix CVE (#13142)
Upgrade to 3.2.3 to fix CVE: https://nvd.nist.gov/vuln/detail/CVE-2022-34917
2022-09-28 10:47:09 +05:30
Jill Osborne 548d810baa
Correct nested columns example (#13150) 2022-09-28 10:39:56 +05:30
David Palmer 0d7bf66578
Add a note to the documentation about pre-built HLLSketches (#13088)
* add a note to the documentation about pre-built HLLSketches

Druid actually supports ingesting a pre-generated sketch column by using
the HLLSketchMerge aggregator. However, this functionality was
previously not made clear in the documentation.

* copyedit from the King's English to American English

* add suggested style changes

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2022-09-27 10:29:39 +08:00
Apoorv Gupta c8f4d72fb1
Fix documentation bug about injective lookups (#13147)
replace mapping to `unique keys` with mapping to `unique values`.
2022-09-27 10:16:48 +08:00
Sam Rash 28b9edc2a8
Add BIG_SUM SQL function (#13102)
This adds a sql function, "BIG_SUM", that uses
CompressedBigDecimal to do a sum. Other misc changes:

1. handle NumberFormatExceptions when parsing a string (by default the value
   is set to 0; configurable in the agg factory to be strict and throw on error)
2. format pom file (whitespace) + add dependency
3. scaleUp -> scale and always require scale as a parameter
2022-09-26 18:02:25 -07:00
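The lenient-versus-strict parsing behaviour mentioned in #13102 above might look roughly like the sketch below. It uses `java.math.BigDecimal` rather than Druid's `CompressedBigDecimal`, and the class `LenientDecimalSum`, its `strict` flag, and the rounding mode are assumptions made for illustration only.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.List;

// Hypothetical helper: sums decimal strings at a caller-supplied scale.
final class LenientDecimalSum {
    // Parses one value; unparseable input becomes 0 unless strict is set.
    static BigDecimal parse(String value, boolean strict) {
        try {
            return new BigDecimal(value);
        } catch (NumberFormatException e) {
            if (strict) {
                throw e; // strict mode: surface the parse error
            }
            return BigDecimal.ZERO; // lenient default: treat as 0
        }
    }

    // The scale is always required, mirroring item 3 of the commit message.
    static BigDecimal sum(List<String> values, int scale, boolean strict) {
        BigDecimal total = BigDecimal.ZERO.setScale(scale);
        for (String value : values) {
            // Rounding mode is an assumption of this sketch.
            total = total.add(parse(value, strict).setScale(scale, RoundingMode.HALF_UP));
        }
        return total;
    }
}
```

For example, `LenientDecimalSum.sum(Arrays.asList("1.5", "abc", "2.25"), 2, false)` skips the bad value by counting it as 0, while the same call with `strict` set to `true` throws a `NumberFormatException`.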
Jonathan Wei 1f1fced6d4
Add JsonInputFormat option to assume newline delimited JSON, improve parse exception handling for multiline JSON (#13089)
* Add JsonInputFormat option to assume newline delimited JSON, improve handling for non-NDJSON

* Fix serde and docs

* Add PR comment check
2022-09-26 19:51:04 -05:00
imply-cheddar e839660b6a
Grab the thread name in a poisoned pool (#13143) 2022-09-26 17:09:10 -07:00
Laksh Singla 0bfa81b7df
Fix the Injector creation in HadoopTask (#13138)
* Injector fix in HadoopTask

* Log the ExtensionsConfig while instantiating the HadoopTask

* Log the config in the run() method instead of the ctor
2022-09-24 10:38:25 +05:30
Adarsh Sanjeev 306f612f86
Suppress Calcite CVE (#13119)
* Suppress Calcite CVE

* Update comment
2022-09-23 16:23:26 +05:30
Vadim Ogievetsky a910764e41
better spec conversion with issues (#13136) 2022-09-22 10:46:57 -07:00
Vadim Ogievetsky 6c1dc6589e
initialize all counters for stages with input (#13137) 2022-09-22 08:10:50 -07:00
Laksh Singla 728745a1d3
Add IT for MSQ task engine using the new IT framework (#12992)
* first test, serde causing problems

* serde working

* insert and select check

* Add cluster annotations for MSQ test cases

* Add cluster config for MSQ

* Add MSQ config to the pom.xml

* cleanup unnecessary changes

* Remove model classes

* Comments, checkstyle, check queries from file

* fixup test case name

* build failure fix

* review changes

* build failure fix

* Trigger Build

* Log the mismatch in QueryResultsVerifier

* Trigger Build

* Change the signature of the results verifier

* review changes

* LGTM fix

* build, change pom

* Trigger Build

* Trigger Build

* trigger build with minimal pom changes

* guice fix in tests

* travis.yml
2022-09-22 16:09:47 +05:30
Sam Rash 044cab5094
Optimize CompressedBigDecimal compareTo() (#13086)
Optimizes the compareTo() function in
CompressedBigDecimal. It directly compares the int[] rather than
creating BigDecimal objects and using its compareTo.

It handles unequal sized CBDs, but does require
the scales to match.
2022-09-21 20:31:02 -07:00
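The idea in #13086 above, comparing the backing `int[]` directly instead of building `BigDecimal` objects, can be sketched as follows. This assumes a little-endian two's-complement word layout and equal scales on both sides; the class `IntArrayCompare` is hypothetical and is not the actual `CompressedBigDecimal` code.

```java
// Hypothetical sketch: compares two little-endian two's-complement int arrays.
final class IntArrayCompare {
    // Word i of the number, sign-extended past the end of a shorter array.
    private static int word(int[] words, int i) {
        if (i < words.length) {
            return words[i];
        }
        return words[words.length - 1] < 0 ? -1 : 0;
    }

    // Arrays must be non-empty and are assumed to share the same scale.
    static int compare(int[] lhs, int[] rhs) {
        int n = Math.max(lhs.length, rhs.length);
        // The most significant word carries the sign, so compare it as signed.
        int result = Integer.compare(word(lhs, n - 1), word(rhs, n - 1));
        if (result != 0) {
            return result;
        }
        // Lower words are plain magnitude bits: compare as unsigned, high to low.
        for (int i = n - 2; i >= 0; i--) {
            result = Integer.compareUnsigned(word(lhs, i), word(rhs, i));
            if (result != 0) {
                return result;
            }
        }
        return 0;
    }
}
```

Sign extension is what lets arrays of unequal length be compared without copying either one, which matches the commit's note that unequal sizes are handled while scales must match.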
Vadim Ogievetsky f1d3728371
append to existing callout (#13130) 2022-09-21 19:39:28 -07:00
Charles Smith eb760c3d1d
update log4j example (#13095)
* update log4j example

* fix some style issues

* Update docs/configuration/logging.md

Co-authored-by: Frank Chen <frankchen@apache.org>
2022-09-22 09:46:49 +08:00
317brian 12f12a13a9
fix: fix broken postgres link (#13135) 2022-09-22 09:46:20 +08:00
317brian 7fa35839c0
fix: follow naming convention for msq task engine (#13127)
* fix: follow naming convention for msq task engine

* more fixes

* add back in experimental

* fix anchor
2022-09-21 18:46:06 -07:00
Gian Merlino 2f731f356e
Update pull-deps docs with correct repo list. (#13134)
There is only one default remote repo at this time.
2022-09-21 12:16:57 -07:00
Jonathan Wei 331e6d707b
Add KafkaConfigOverrides extension point (#13122)
* Add KafkaConfigOverrides extension point

* X
2022-09-21 11:47:19 +05:30
Katya Macedo 90d14f629a
spatial-filters (#13124) 2022-09-20 22:48:36 -07:00
Kashif Faraz 0039409817
Add test framework to simulate segment loading and balancing (#13074)
Fixes #12822 

The framework added here makes it easy to write tests that verify the behaviour and interactions
of the following entities under various conditions:
- `DruidCoordinator`
- `HttpLoadQueuePeon`, `LoadQueueTaskMaster`
- coordinator duties: `BalanceSegments`, `RunRules`, `UnloadUnusedSegments`, etc.
- datasource retention rules: `LoadRule`, `DropRule`

Changes:
Add the following main classes:
- `CoordinatorSimulation` and related interfaces to dictate behaviour of simulation
- `CoordinatorSimulationBuilder` to build a simulation.
- `BlockingExecutorService` to keep submitted tasks in queue and execute them
  only when explicitly invoked.

Add tests:
- `CoordinatorSimulationBaseTest`, `SegmentLoadingTest`, `SegmentBalancingTest`
- `SegmentLoadingNegativeTest` to contain tests which assert the existing erroneous behaviour
of segment loading. Once the behaviour is fixed, these tests will be moved to the regular
`SegmentLoadingTest`.

Please refer to the README.md in `org.apache.druid.server.coordinator.simulate` for more details.
2022-09-21 09:51:58 +05:30
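The `BlockingExecutorService` described in #13074 above keeps submitted tasks queued until a test explicitly runs them. A minimal sketch of that idea follows; `ManualExecutor` and its method names are hypothetical and much simpler than the class added in the commit.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executor;

// Hypothetical sketch: an executor that never runs tasks on its own.
final class ManualExecutor implements Executor {
    private final Queue<Runnable> pending = new ArrayDeque<>();

    // Submitted tasks are only queued, not executed.
    @Override
    public void execute(Runnable task) {
        pending.add(task);
    }

    int pendingCount() {
        return pending.size();
    }

    // Runs every queued task on the calling thread and returns how many ran.
    int runPending() {
        int count = 0;
        Runnable task;
        while ((task = pending.poll()) != null) {
            task.run();
            count++;
        }
        return count;
    }
}
```

Because nothing runs until `runPending()` is called, a test can inspect intermediate state (for example, a segment queued for loading but not yet loaded) deterministically.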
hosswald 5ed5c83aab
Clarified the behaviour of SQL COUNT(DISTINCT dim) on multi-value dimensions (#13128)
* Clarified the behaviour of COUNT(DISTINCT column) on multi-value columns

* Update docs/querying/sql-aggregations.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

Co-authored-by: Vadim Ogievetsky <vadimon@gmail.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2022-09-20 18:03:34 -07:00
Vadim Ogievetsky edc444a4bc
fix quickstart (#13126) 2022-09-20 17:44:21 -07:00
Abhishek Agarwal 455b074b36
Move JDK11 ITs to cron stage (#13075)
* Move JDK11 ITs to cron stage

* Make cron run on release branches

* Review comments

* fix spelling
2022-09-20 09:18:52 -07:00
Vadim Ogievetsky b9edfe34a4
be consistent about referring to the web console by its name (#13118) 2022-09-19 15:02:17 -07:00
Frank Chen a3391693eb
Improve a MSQ planning error message (#13113) 2022-09-19 23:11:54 +08:00
abhagraw 48638a5438
Getting extension list from pom (#13073)
* Getting extension list from pom

* Trigger Build
2022-09-19 15:14:21 +05:30
Clint Wylie a0e0fbe1b3
nested column serializer performance improvement for sparse columns (#13101) 2022-09-19 14:07:48 +05:30
Paul Rogers 8ce03eb094
Convert the Druid planner to use statement handlers (#12905)
* Converted Druid planner to use statement handlers

Converts the large collection of if-statements for statement
types into a set of classes: one per supported statement type.
Cleans up a few error messages.

* Revisions from review comments

* Build fix

* Build fix

* Resolve merge conflict.

* More merges with QueryResponse PR

* More parameterized type cleanup

Forces a rebuild due to a flaky test
2022-09-19 11:58:45 +05:30
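The refactoring described in #12905 above replaces a chain of if-statements with one class per statement type. The sketch below shows the general pattern only; `StatementDispatch`, the handler classes, and the keyword-based lookup are hypothetical and do not reflect the Druid planner's real types.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of per-statement-type handlers replacing if-chains.
final class StatementDispatch {
    interface StatementHandler {
        String handle(String sql);
    }

    static final class SelectHandler implements StatementHandler {
        public String handle(String sql) {
            return "query";
        }
    }

    static final class InsertHandler implements StatementHandler {
        public String handle(String sql) {
            return "ingest";
        }
    }

    static final class ExplainHandler implements StatementHandler {
        public String handle(String sql) {
            return "explain";
        }
    }

    // One registered handler per supported statement type.
    private static final Map<String, StatementHandler> HANDLERS = new HashMap<>();

    static {
        HANDLERS.put("SELECT", new SelectHandler());
        HANDLERS.put("INSERT", new InsertHandler());
        HANDLERS.put("EXPLAIN", new ExplainHandler());
    }

    // Picks a handler from the leading keyword (a simplification for the sketch).
    static String plan(String sql) {
        String keyword = sql.trim().split("\\s+", 2)[0].toUpperCase(Locale.ROOT);
        StatementHandler handler = HANDLERS.get(keyword);
        if (handler == null) {
            throw new IllegalArgumentException("Unsupported statement type: " + keyword);
        }
        return handler.handle(sql);
    }
}
```

Adding a statement type then means adding one class and one registration, rather than another branch in a long conditional.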
Vadim Ogievetsky bb0b810b1d
fix html tags in docs (#13117)
* fix html tags in docs

* revert not null
2022-09-18 19:40:33 -07:00
Gian Merlino 2e729170cc
Kill task: Don't include markAsUnused unless set. (#13104)
Cleans up the serialized JSON.
2022-09-17 14:03:34 -07:00
Vadim Ogievetsky de8f229bed
Web console: correctly escape path based flatten specs (#13105)
* fix path generation

* do escape

* fix replace

* fix replace for good
2022-09-17 14:02:42 -07:00
Gian Merlino d9b2968edb
Docs: Clarify the situation with SELECT. (#13109) 2022-09-17 10:47:57 -07:00
Charles Smith b366a6c5a4
Add clarification around docker environment #8926 (#13084)
* Add clarification around docker environment #8926

* fix spelling

* Update docs/tutorials/docker.md

Co-authored-by: Frank Chen <frankchen@apache.org>

* Update docs/tutorials/docker.md

Co-authored-by: Frank Chen <frankchen@apache.org>

* fix nano quickstart

Co-authored-by: Frank Chen <frankchen@apache.org>
2022-09-17 20:44:24 +08:00