14036 Commits

Author SHA1 Message Date
Zoltan Haindrich
28ea884e19 almost ready? 2024-05-16 10:01:22 +00:00
Zoltan Haindrich
27735f2621 move disco 2024-05-16 09:50:10 +00:00
Zoltan Haindrich
cab3d945be up 2024-05-16 09:48:18 +00:00
Zoltan Haindrich
c9638b7836 update 2024-05-16 09:44:16 +00:00
Zoltan Haindrich
7e10df1ffa ... 2024-05-16 09:33:51 +00:00
Zoltan Haindrich
4a47b0229e no roles 2024-05-16 09:31:21 +00:00
Zoltan Haindrich
5f552a2997 c 2024-05-16 09:30:41 +00:00
Zoltan Haindrich
074161dfde add some service crap 2024-05-16 05:53:42 +00:00
Zoltan Haindrich
55b2051f9d working stuff 2024-05-15 16:23:11 +00:00
Zoltan Haindrich
8ee41f58d0 it does work 2024-05-15 15:14:43 +00:00
Zoltan Haindrich
d4b052a579 stuff 2024-05-15 11:57:13 +00:00
Zoltan Haindrich
73011267af triaks 2024-05-15 10:34:48 +00:00
Zoltan Haindrich
a16f982699 remove crap 2024-05-14 16:04:19 +00:00
Zoltan Haindrich
43fd8af63c Revert "add"
This reverts commit 3fbb3cb853456bebccfbf8fc16ba7f30a810c26c.
2024-05-14 09:39:04 +00:00
Zoltan Haindrich
3fbb3cb853 add 2024-05-14 09:39:02 +00:00
Zoltan Haindrich
b7b73fa7fe fix context key order 2024-05-14 08:54:53 +00:00
Zoltan Haindrich
9578953678 Merge remote-tracking branch 'apache/master' into quidem-runner-extension-submit 2024-05-14 07:36:48 +00:00
Zoltan Haindrich
3132c12781 remove unnecessary \\ 2024-05-14 07:36:07 +00:00
Adarsh Sanjeev
18a4722d11
Resolve a bug where datasketches would not downsample sketches sufficiently (#16119)
* Fix sketch memory issue

* Rename function

* Add unit test

* Revert downsampling change
2024-05-14 10:23:57 +05:30
Sree Charan Manamala
b8dd7478d0
Custom Calcite Rule to remove redundant references (#16402)
Custom Calcite rule mimicking AggregateProjectMergeRule to extend support to expressions.
The current Calcite rule returns null in such cases.
In addition, this removes the redundant references.
2024-05-14 06:38:05 +02:00
Vadim Ogievetsky
760e449875
Web console: Fix order-by-delta in explore view table (#16417)
* change to using measure name

* Implement order by delta

* less parsing, stricter types

* safeDivide0

* fix no query

* new DTQ allows parsing JSON_VALUE(...RETURNING...)
2024-05-13 19:03:46 -07:00
Akshat Jain
d1100a6f63
Add retries for building S3 client (#16438)
* Add retries for building S3 client

* Use S3Utils instead of RetryUtils

* Add test
2024-05-13 16:32:06 -07:00
Zoltan Haindrich
e36c46a85a fix import
style fixes

cleanup
2024-05-13 15:52:03 +00:00
Laksh Singla
4bfc186153
Support sorting on complex columns in MSQ (#16322)
MSQ sorts columns in a highly specialized manner, by byte comparison, so the values are serialized in a byte-comparable form. This works well for primitive types and primitive arrays; however, complex types cannot be serialized this way.

This PR adds support for sorting complex columns by deserializing the value from the field and comparing it via the type strategy. This is a lot slower than byte comparison; however, it is the only way to support sorting on complex columns, which can have arbitrary serialization that is not optimized for MSQ.

The primitives and the arrays are still compared via the byte comparison, therefore this doesn't affect the performance of the queries supported before the patch. If there's a sorting key with mixed complex and primitive/primitive array types, for example: longCol1 ASC, longCol2 ASC, complexCol1 DESC, complexCol2 DESC, stringCol1 DESC, longCol3 DESC, longCol4 ASC, the comparison will happen like:

    longCol1, longCol2 (ASC) - Compared together via byte-comparison, since both are byte comparable and need to be sorted in ascending order
    complexCol1 (DESC) - Compared via deserialization, cannot be clubbed with any other field
    complexCol2 (DESC) - Compared via deserialization, cannot be clubbed with any other field, even though the prior field was a complex column with the same order
    stringCol1, longCol3 (DESC) - Compared together via byte-comparison, since both are byte comparable and need to be sorted in descending order
    longCol4 (ASC) - Compared via byte-comparison, couldn't be coalesced with the previous fields as the direction was different

This way, a field is deserialized only where required.
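
The grouping described above can be illustrated with a minimal sketch. It is not the MSQ implementation: `KeyField`, `Run`, and `groupIntoRuns` are hypothetical names invented for this example, and the only rule modelled is the one stated in the message (adjacent byte-comparable fields with the same direction are compared together, and every complex field stands alone).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SortKeyRuns {
    // Hypothetical model of one sort-key field; not a Druid class.
    static final class KeyField {
        final String name;
        final boolean ascending;
        final boolean byteComparable; // false for complex types

        KeyField(String name, boolean ascending, boolean byteComparable) {
            this.name = name;
            this.ascending = ascending;
            this.byteComparable = byteComparable;
        }
    }

    // A run of adjacent fields that can be compared in a single step.
    static final class Run {
        final List<String> fields = new ArrayList<>();
        final boolean ascending;
        final boolean byteComparable;

        Run(KeyField first) {
            this.fields.add(first.name);
            this.ascending = first.ascending;
            this.byteComparable = first.byteComparable;
        }

        @Override
        public String toString() {
            return fields + (ascending ? " ASC" : " DESC")
                + (byteComparable ? " via byte comparison" : " via deserialization");
        }
    }

    // Extend the previous run only if both it and the new field are
    // byte comparable and share a direction; otherwise start a new run.
    static List<Run> groupIntoRuns(List<KeyField> key) {
        List<Run> runs = new ArrayList<>();
        for (KeyField field : key) {
            Run last = runs.isEmpty() ? null : runs.get(runs.size() - 1);
            boolean canExtend = last != null
                && last.byteComparable
                && field.byteComparable
                && last.ascending == field.ascending;
            if (canExtend) {
                last.fields.add(field.name);
            } else {
                runs.add(new Run(field));
            }
        }
        return runs;
    }

    // The sort key used as the example in the commit message.
    static List<KeyField> exampleKey() {
        return Arrays.asList(
            new KeyField("longCol1", true, true),
            new KeyField("longCol2", true, true),
            new KeyField("complexCol1", false, false),
            new KeyField("complexCol2", false, false),
            new KeyField("stringCol1", false, true),
            new KeyField("longCol3", false, true),
            new KeyField("longCol4", true, true));
    }

    public static void main(String[] args) {
        for (Run run : groupIntoRuns(exampleKey())) {
            System.out.println(run);
        }
    }
}
```

Running `main` on the example key prints one line per run, which should line up with the five comparison steps listed above.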
2024-05-13 15:07:05 +05:30
Akshat Jain
bacdb4c48d
Update integration tests related documentation for better clarity (#16313) 2024-05-13 11:27:21 +05:30
Sensor
1601a0f8f8
add ignore path (#16429) 2024-05-11 17:54:52 +08:00
aho135
9459722ebf
Use canonical hostname instead of ip by default (#16386)
Co-authored-by: Andrew Ho <a.ho@salesforce.com>
2024-05-11 17:53:22 +08:00
Alberic Liu
811dcd1726
update protobuf.md (#16434) 2024-05-11 17:52:54 +08:00
Zoltan Haindrich
e13d560b6e Enable quidem shadowing for decoupled testcases
* Altered `QueryTestBuilder` to be able to switch to a backing quidem test
* added a small crc to ensure that the shadow testcase does not deviate from the original one
* Packaged all decoupled related things into a single `DecoupledExtension` to reduce copy-paste
* `DecoupledTestConfig#quidemReason` must describe why it is being used
* `DecoupledTestConfig#separateDefaultModeTest` can be used to make multiple case files based on `NullHandling` state
* fixed a cosmetic bug during decoupled join translation
* enhanced `!druidPlan` to report the final logical plan in non-decoupled mode as well
* add check to ensure that only supported params are present in a druidtest uri
* enabled shadow testcases for previously disabled testcases
2024-05-10 13:38:54 +00:00
Benedict Jin
cb7c2c1e37
Downgrade the version of Apache Curator from 5.5.0 to 5.3.0 to avoid a bug in the new version (#16425) 2024-05-10 15:08:33 +05:30
Kashif Faraz
3b84751233
Remove unused task action SegmentLockReleaseAction (#16422)
Changes:
- Remove `SegmentLockReleaseAction` as it is not used anywhere.
It is not even registered as a known sub-type of `TaskAction`.
- Minor refactor in `TaskLockbox`. No functional change.
- Remove `ExpectedException` from `TaskLockboxTest`
2024-05-10 06:38:29 +05:30
Igor Berman
d0f3fdab37
Allow using different lock types for kill task, remove markAsUnused parameter (#16362)
Changes:
- Remove deprecated `markAsUnused` parameter from `KillUnusedSegmentsTask`
- Allow `kill` task to use `REPLACE` lock when `useConcurrentLocks` is true
- Use `EXCLUSIVE` lock by default
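
A minimal sketch of the lock selection listed above, assuming the choice depends only on the `useConcurrentLocks` flag. The `LockType` enum and `chooseLockType` method are hypothetical stand-ins written for this example, not the code in `KillUnusedSegmentsTask`.

```java
public class KillTaskLockChoice {
    // Hypothetical enum holding only the two lock names mentioned in the message.
    enum LockType { EXCLUSIVE, REPLACE }

    // EXCLUSIVE is the default; REPLACE is used when concurrent locks are enabled.
    static LockType chooseLockType(boolean useConcurrentLocks) {
        return useConcurrentLocks ? LockType.REPLACE : LockType.EXCLUSIVE;
    }

    public static void main(String[] args) {
        System.out.println("useConcurrentLocks=false: " + chooseLockType(false));
        System.out.println("useConcurrentLocks=true: " + chooseLockType(true));
    }
}
```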
2024-05-10 06:37:36 +05:30
Charles Smith
2d0b4e5f1e
Update sidebar to organize tutorials + other minor improvements (#16184)
Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2024-05-09 08:57:43 -07:00
Adarsh Sanjeev
30f3cf5017
Add more info in MSQ export log message (#16363) 2024-05-09 13:02:19 +05:30
Zoltan Haindrich
1811674753
Enable quidem tests to use different suppliers (#16382)
* enable quidem uri support for `druidtest:///?ComponentSupplier=Nested` and similar
* changes the way `SqlTestFrameworkConfig` is being applied; all options will have their own annotation (it is effectively impossible to detect whether an annotation has an explicitly set value or the default)
* enables hierarchical processing of config annotation (was needed to enable class level supplier annotation)
* moves uri processing related string2config stuff into `SqlTestFrameworkConfig`
2024-05-09 09:21:02 +02:00
Akshat Jain
775d654a6c
Load only the required lookups for MSQ tasks (#16358)
With this PR, MSQ tasks (MSQControllerTask and MSQWorkerTask) load only the required lookups during querying and ingestion, based on the value of the CTX_LOOKUPS_TO_LOAD key in the query context.
2024-05-09 11:21:54 +05:30
Rishabh Singh
a6ebb963c7
Fix NPE in SegmentSchemaCache (#16404)
Verify that schema backfill count metric is emitted for each datasource.
Fix potential NPE in SegmentSchemaCache#markMetadataQueryResultPublished.
2024-05-09 11:13:53 +05:30
Rushikesh Bankar
eb4e957db1
Remove software.amazon.ion:ion-java from the licenses (#16413)
Remove software.amazon.ion:ion-java from the licenses as it is no longer a transitive dependency of aws-java-sdk-core.
Verified that aws-java-sdk-core no longer has ion-java as a dependency after version 1.12.638.
2024-05-08 13:51:51 -07:00
Laksh Singla
dded473ac0
Fix another deadlock which can occur while acquiring merge buffers (#16372)
Fixes a deadlock while acquiring merge buffers
2024-05-08 14:33:15 +05:30
Adarsh Sanjeev
03566b0115
Fix script and improve documentation (#16401)
Fixes a few minor issues with scripts.

- Add additional information, since it was confusing and not clear that the number was the ID from GitHub and not just the major version number.
- Fix an issue where the milestone displayed in an output message was the milestone supplied as an argument, instead of the number of the milestone the PR is already tagged against in GitHub, as returned by the request.
2024-05-08 14:09:14 +05:30
Adarsh Sanjeev
f82cc34e5b
Maintain a connection while exporting results with MSQ (#16381)
* Maintain a connection while exporting results with MSQ

* Fix checkstyle

* Fix checkstyle

* Move initialization from constructor

* Add null check

* Address review comments
2024-05-08 11:34:20 +05:30
Adarsh Sanjeev
269e035e76
Add validation for reindex with realtime sources (#16390)
Add validation for reindex with realtime sources.

With the addition of concurrent compaction, it is possible to ingest data while querying from realtime sources with MSQ into the same datasource. This could potentially lead to issues if the interval that is ingested into is replaced by an MSQ job, which has queried only some of the data from the realtime task.

This PR adds validation to check that the datasource being ingested into is not being queried from, if the query includes realtime sources.
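
The check described above can be sketched as follows. This is an illustration only: the class, the `validate` method, and its parameters are hypothetical, and the real validation in MSQ may take different inputs and throw a different error type.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ReindexRealtimeValidation {
    // Reject a query that reads from the datasource it ingests into
    // when realtime sources are included. All names here are hypothetical.
    static void validate(
        String targetDataSource,
        Set<String> queriedDataSources,
        boolean includesRealtimeSources
    ) {
        if (includesRealtimeSources && queriedDataSources.contains(targetDataSource)) {
            throw new IllegalArgumentException(
                "Cannot ingest into [" + targetDataSource
                + "] while also querying it with realtime sources included");
        }
    }

    public static void main(String[] args) {
        Set<String> sources = new HashSet<>(Arrays.asList("wiki", "events"));
        validate("wiki_rollup", sources, true); // different target: accepted
        validate("wiki", sources, false);       // no realtime sources: accepted
        try {
            validate("wiki", sources, true);    // reindex with realtime: rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```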
2024-05-07 10:32:15 +05:30
Misha
b5958b6b07
Feature configurable calcite bloat (#16248)
* Configurable bloat for calcite ProjectMergeRule implemented

* Comment added

* Default bloat value increased to 1000

* Implemented bloat configuration from QueryContext

* Code refactored, docs updated

---------

Co-authored-by: sviatahorau <mikhail.sviatahorau@deep.bi>
2024-05-06 20:43:39 +05:30
Sensor
ac42737242
Specify node type so that the log filename can get resolved (#16282)
* specify node type so that the log filename can get resolved

* Update distribution/docker/druid.sh

Co-authored-by: Benedict Jin <asdf2014@apache.org>

---------

Co-authored-by: Benedict Jin <asdf2014@apache.org>
2024-05-06 22:12:11 +08:00
dependabot[bot]
a2223ce821
Bump org.scala-lang:scala-library from 2.13.11 to 2.13.14 (#16364)
Bumps [org.scala-lang:scala-library](https://github.com/scala/scala) from 2.13.11 to 2.13.14.
- [Release notes](https://github.com/scala/scala/releases)
- [Commits](https://github.com/scala/scala/compare/v2.13.11...v2.13.14)

---
updated-dependencies:
- dependency-name: org.scala-lang:scala-library
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-06 22:06:23 +08:00
Alberic Liu
92fb0ff718
upgrade mysql:mysql-connector-java to 8.2.0 (#16024)
* upgrade mysql:mysql-connector-java to 8.2.0

* fix the check errors

* remove unused comment
2024-05-06 21:58:37 +08:00
Pranav
b713a517f1
Fix the bug in Immutable RTree object strategy (#16389)
* Fix the bug in Immutable Node object strategy

* Adding comments in code
2024-05-06 14:37:29 +05:30
Abhishek Radhakrishnan
2a638d77d9
Remove stale references to coordinator dynamic config killAllDataSources. (#16387)
This parameter has been removed for a while now, as of Druid 0.23.0:
https://github.com/apache/druid/pull/12187.

The code was only used in tests to verify that serialization works.
Now remove all references to avoid any confusion.
2024-05-05 08:48:56 +05:30
Gian Merlino
588d442422
Add native filter conversion for SCALAR_IN_ARRAY. (#16312)
* Add native filter conversion for SCALAR_IN_ARRAY.

Main changes:

1) Add an implementation of "toDruidFilter" in ScalarInArrayOperatorConversion.

2) Split up Expressions.literalToDruidExpression into two functions, so the first
   half (literalToExprEval) can be used by ScalarInArrayOperatorConversion to more
   efficiently create the list of match values.

* Fix type in time arithmetic conversion.

* Test updates.

* Update test cases to use null instead of '' in default-value mode.

* Switch test from msqIncompatible to compatible with a different result.

* Update one more test.

* Fix test.

* Update tests.

* Use ExprEvalWrapper to differentiate between empty string and null.

* Fix tests some more.

* Fix test.

* Additional comment.

* Style adjustment.

* Fix tests.

* trueValue -> actualValue.

* Use different approach, DruidLiteral instead of ExprEvalWrapper.

* Revert changes in ArrayOfDoublesSketchSqlAggregatorTest.
2024-05-03 13:00:33 -07:00
Gian Merlino
1b107ff695
QueryableIndex: Close columns after failed vector cursor setup. (#16365)
* QueryableIndex: Close columns after failed vector cursor setup.

If anything fails while setting up a vector cursor, the prior code in
QueryableIndex would not close its ColumnCache and would therefore leak
columns. Columns often contain references to buffers that must be closed.

* Fix style.
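
The kind of leak fixed in this commit can be illustrated with a generic close-on-failure sketch. It assumes nothing about QueryableIndex internals: `TrackingResource`, `Cursor`, and `makeCursor` are hypothetical stand-ins for a column cache and the cursor built on top of it.

```java
import java.io.Closeable;

public class CloseOnFailedSetup {
    // Stand-in for a resource such as a column cache that must be closed.
    static final class TrackingResource implements Closeable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    // Stand-in for a cursor that takes ownership of the resource once built.
    static final class Cursor implements Closeable {
        private final TrackingResource resource;

        Cursor(TrackingResource resource) {
            this.resource = resource;
        }

        @Override
        public void close() {
            resource.close();
        }
    }

    // If setup fails, close the resource before rethrowing so it is not leaked.
    static Cursor makeCursor(TrackingResource resource, boolean failSetup) {
        try {
            if (failSetup) {
                throw new IllegalStateException("cursor setup failed");
            }
            return new Cursor(resource);
        } catch (RuntimeException | Error e) {
            try {
                resource.close();
            } catch (RuntimeException closeFailure) {
                e.addSuppressed(closeFailure);
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        TrackingResource resource = new TrackingResource();
        try {
            makeCursor(resource, true);
        } catch (IllegalStateException e) {
            System.out.println("setup failed, resource closed: " + resource.closed);
        }
    }
}
```

On the success path the resource stays open and is closed later through the cursor, so ownership passes to the caller only when setup completes.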
2024-05-03 12:58:40 -07:00