Commit Graph

14052 Commits

Author SHA1 Message Date
Zoltan Haindrich 1811674753
Enable quidem tests to use different suppliers (#16382)
* enable quidem uri support for `druidtest:///?ComponentSupplier=Nested` and similar
* changes the way `SqlTestFrameworkConfig` is applied; each option will have its own annotation (it's kinda impossible to detect whether an annotation has an explicitly set value or just the default)
* enables hierarchical processing of config annotation (was needed to enable class level supplier annotation)
* moves uri processing related string2config stuff into `SqlTestFrameworkConfig`
2024-05-09 09:21:02 +02:00
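As a rough illustration of the URI form mentioned in the commit above (`druidtest:///?ComponentSupplier=Nested`), the sketch below pulls key/value options out of the query string. The class and method names (`DruidTestUriOptions`, `parseOptions`) are hypothetical and are not taken from `SqlTestFrameworkConfig`; only `java.net.URI` from the JDK is used.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Druid: splits the query part of a
// druidtest:/// URI into key/value options.
final class DruidTestUriOptions
{
  static Map<String, String> parseOptions(String uri)
  {
    Map<String, String> options = new LinkedHashMap<>();
    String query = URI.create(uri).getQuery();
    if (query == null || query.isEmpty()) {
      return options;
    }
    for (String pair : query.split("&")) {
      int eq = pair.indexOf('=');
      if (eq < 0) {
        // Option given without a value
        options.put(pair, "");
      } else {
        options.put(pair.substring(0, eq), pair.substring(eq + 1));
      }
    }
    return options;
  }
}
```

With this sketch, `parseOptions("druidtest:///?ComponentSupplier=Nested")` yields a single entry mapping `ComponentSupplier` to `Nested`.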
Akshat Jain 775d654a6c
Load only the required lookups for MSQ tasks (#16358)
With this PR changes, MSQ tasks (MSQControllerTask and MSQWorkerTask) only load the required lookups during querying and ingestion, based on the value of CTX_LOOKUPS_TO_LOAD key in the query context.
2024-05-09 11:21:54 +05:30
Rishabh Singh a6ebb963c7
Fix NPE in SegmentSchemaCache (#16404)
Verify that schema backfill count metric is emitted for each datasource.
    Fix potential NPE in SegmentSchemaCache#markMetadataQueryResultPublished.
2024-05-09 11:13:53 +05:30
Rushikesh Bankar eb4e957db1
Remove software.amazon.ion:ion-java from the licenses (#16413)
Remove software.amazon.ion:ion-java from the licenses, as it is no longer a transitive dependency of aws-java-sdk-core.
Verified that aws-java-sdk-core no longer has ion-java as a dependency after version 1.12.638.
2024-05-08 13:51:51 -07:00
Laksh Singla dded473ac0
Fix another deadlock which can occur while acquiring merge buffers (#16372)
Fixes a deadlock while acquiring merge buffers
2024-05-08 14:33:15 +05:30
Adarsh Sanjeev 03566b0115
Fix script and improve documentation (#16401)
Fixes a few minor issues with scripts.

- Add additional information around since it was confusing, and not clear that the number was the ID from GitHub and not just the major version number.
- Fix an issue where the milestone displayed in an output message was the milestone supplied as an argument, instead of the number of the milestone the PR is already tagged against in GitHub, from the sent request.
2024-05-08 14:09:14 +05:30
Adarsh Sanjeev f82cc34e5b
Maintain a connection while exporting results with MSQ (#16381)
* Maintain a connection while exporting results with MSQ

* Fix checkstyle

* Fix checkstyle

* Move initialization from constructor

* Add null check

* Address review comments
2024-05-08 11:34:20 +05:30
Adarsh Sanjeev 269e035e76
Add validation for reindex with realtime sources (#16390)
Add validation for reindex with realtime sources.

With the addition of concurrent compaction, it is possible to ingest data while querying from realtime sources with MSQ into the same datasource. This could potentially lead to issues if the interval that is ingested into is replaced by an MSQ job, which has queried only some of the data from the realtime task.

This PR adds validation to check that the datasource being ingested into is not being queried from, if the query includes realtime sources.
2024-05-07 10:32:15 +05:30
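The validation described in the commit above can be pictured with a minimal sketch. It assumes the planner already knows the target datasource, the set of datasources the query reads from, and whether realtime sources are included; the class name, method name, and error message are hypothetical and do not reflect Druid's actual MSQ code.

```java
import java.util.Set;

// Hypothetical sketch, not Druid's MSQ validation code.
final class ReindexValidationSketch
{
  // Rejects a query that ingests into a datasource it also reads from,
  // but only when realtime sources are part of the query.
  static void validate(String targetDatasource, Set<String> queriedDatasources, boolean includesRealtimeSources)
  {
    if (includesRealtimeSources && queriedDatasources.contains(targetDatasource)) {
      throw new IllegalArgumentException(
          "Cannot ingest into [" + targetDatasource + "] while querying it with realtime sources included"
      );
    }
  }
}
```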
Misha b5958b6b07
Feature configurable calcite bloat (#16248)
* Configurable bloat for calcite ProjectMergeRule implemented

* Comment added

* Default bloat value increased to 1000

* Implemented bloat configuration from QueryContext

* Code refactored, docs updated

---------

Co-authored-by: sviatahorau <mikhail.sviatahorau@deep.bi>
2024-05-06 20:43:39 +05:30
Sensor ac42737242
Specify node type so that the log filename can get resolved (#16282)
* specify node type so that the log filename can get resolved

* Update distribution/docker/druid.sh

Co-authored-by: Benedict Jin <asdf2014@apache.org>

---------

Co-authored-by: Benedict Jin <asdf2014@apache.org>
2024-05-06 22:12:11 +08:00
dependabot[bot] a2223ce821
Bump org.scala-lang:scala-library from 2.13.11 to 2.13.14 (#16364)
Bumps [org.scala-lang:scala-library](https://github.com/scala/scala) from 2.13.11 to 2.13.14.
- [Release notes](https://github.com/scala/scala/releases)
- [Commits](https://github.com/scala/scala/compare/v2.13.11...v2.13.14)

---
updated-dependencies:
- dependency-name: org.scala-lang:scala-library
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-06 22:06:23 +08:00
Alberic Liu 92fb0ff718
upgrade mysql:mysql-connector-java to 8.2.0 (#16024)
* upgrade mysql:mysql-connector-java to 8.2.0

* fix the check errors

* remove unused comment
2024-05-06 21:58:37 +08:00
Pranav b713a517f1
Fix the bug in Immutable RTree object strategy (#16389)
* Fix the bug in Immutable Node object strategy

* Adding comments in code
2024-05-06 14:37:29 +05:30
Abhishek Radhakrishnan 2a638d77d9
Remove stale references to coordinator dynamic config killAllDataSources. (#16387)
This parameter was removed a while ago, in Druid 0.23.0:
https://github.com/apache/druid/pull/12187.

The code was only used in tests to verify that serialization works.
Now remove all references to avoid any confusion.
2024-05-05 08:48:56 +05:30
Gian Merlino 588d442422
Add native filter conversion for SCALAR_IN_ARRAY. (#16312)
* Add native filter conversion for SCALAR_IN_ARRAY.

Main changes:

1) Add an implementation of "toDruidFilter" in ScalarInArrayOperatorConversion.

2) Split up Expressions.literalToDruidExpression into two functions, so the first
   half (literalToExprEval) can be used by ScalarInArrayOperatorConversion to more
   efficiently create the list of match values.

* Fix type in time arithmetic conversion.

* Test updates.

* Update test cases to use null instead of '' in default-value mode.

* Switch test from msqIncompatible to compatible with a different result.

* Update one more test.

* Fix test.

* Update tests.

* Use ExprEvalWrapper to differentiate between empty string and null.

* Fix tests some more.

* Fix test.

* Additional comment.

* Style adjustment.

* Fix tests.

* trueValue -> actualValue.

* Use different approach, DruidLiteral instead of ExprEvalWrapper.

* Revert changes in ArrayOfDoublesSketchSqlAggregatorTest.
2024-05-03 13:00:33 -07:00
Gian Merlino 1b107ff695
QueryableIndex: Close columns after failed vector cursor setup. (#16365)
* QueryableIndex: Close columns after failed vector cursor setup.

If anything fails while setting up a vector cursor, the prior code in
QueryableIndex would not close its ColumnCache and would therefore leak
columns. Columns often contain references to buffers that must be closed.

* Fix style.
2024-05-03 12:58:40 -07:00
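The leak described in the commit above follows a common pattern: a closeable resource is created, later setup fails, and nothing closes the resource. A minimal sketch of the usual remedy is shown below. `CloseOnFailureSketch` and `setUpOrClose` are hypothetical names, and the code does not reproduce the actual `QueryableIndex` change.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.function.Function;

// Hypothetical sketch: close a resource if the setup that depends on it fails.
final class CloseOnFailureSketch
{
  static <R extends Closeable, T> T setUpOrClose(R resource, Function<R, T> setup)
  {
    try {
      // On success the caller owns the resource and closes it later.
      return setup.apply(resource);
    }
    catch (RuntimeException | Error e) {
      try {
        resource.close();
      }
      catch (IOException | RuntimeException closeFailure) {
        // Keep the original failure as the primary exception.
        e.addSuppressed(closeFailure);
      }
      throw e;
    }
  }
}
```

On success the resource stays open for the caller; on failure it is closed before the original exception propagates.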
zachjsh fb7c84fb5d
Catalog clustering keys fixes (#16351)
* * add another catalog clustering columns unit test

* * disallow clusterKeys with descending order

* * make more clear that clustering is re-written into ingest node
whether a catalog table or not

* * when partitionedBy is stored in catalog, user shouldn't need to specify
it in order to specify clustering

* * fix intellij inspection failure
2024-05-03 14:02:56 -04:00
Vadim Ogievetsky 4d62c4a917
Web console: concat data when doing a durable storage download (#16375)
* concat data

* fix silly console.error
2024-05-03 08:00:32 -07:00
Rishabh Singh c61c3785a0
Followup changes to 15817 (Segment schema publishing and polling) (#16368)
* Fix build

* Nit changes in KillUnreferencedSegmentSchema

* Replace reference to the abbreviation SMQ with Metadata Query, rename inTransit maps in schema cache

* nitpicks

* Remove reference to smq abbreviation from integration-tests

* Remove reference to smq abbreviation from integration-tests

* minor change

* Update index.md

* Add delimiter while computing schema fingerprint hash
2024-05-03 19:13:52 +05:30
AmatyaAvadhanula 5fae20d287
Do not allocate ids conflicting with existing segment ids (#16380)
* Do not allocate ids conflicting with existing segment ids

* Parameterized tests

* Add doc and retain test for coverage
2024-05-03 19:09:48 +05:30
Jan Werner b16401323b
update dependencies to address CVEs (#16374)
update dependencies to address new batch of CVEs:
- Azure POM from 1.2.19 to 1.2.23 to update transitive dependency nimbus-jose-jwt to address:  CVE-2023-52428
- commons-configuration2 from 2.8.0 to 2.10.1 to address: CVE-2024-29131 CVE-2024-29133
- bcpkix-jdk18on from 1.76 to 1.78.1 to address: CVE-2024-30172 CVE-2024-30171 CVE-2024-29857
2024-05-02 21:35:21 -07:00
Abhishek Radhakrishnan 3717554e16
Web console changes for https://github.com/apache/druid/pull/16288 (#16379)
Adds a text box for delta filter that can accept an optional json
object.
2024-05-02 15:50:17 -07:00
Vadim Ogievetsky 39ada8b9ad
Web console: surface more info on the supervisor view (#16318)
* add rate and stats

* better tabs

* detail

* add recent errors

* update tests

* don't let people hide the actions column because why

* don't sort on actions

* better way to agg

* add timeouts

* show error only once

* fix tests and Explain showing up

* only consider active tasks

* refresh

* fix tests

* better formatting
2024-05-02 08:50:27 -07:00
AmatyaAvadhanula b7ae78296a
Allow different timechunk lock types to coexist in a task group (#16369)
Description:
All the streaming ingestion tasks for a given datasource share the same lock for a given interval.
Changing lock types in the supervisor can lead to segment allocation errors due to lock conflicts
for the new tasks while the older tasks are still running.

Fix:
Allow locks of different types (EXCLUSIVE, SHARED, APPEND, REPLACE) to co-exist if they have
the same interval and the same task group.
2024-05-02 19:54:43 +05:30
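The rule stated in the fix above can be sketched as a small predicate. This is a simplified model only: the `Lock` class, its string-typed interval, and the method name are hypothetical, and the sketch covers just the new exception (same interval and same task group), not the rest of Druid's lock conflict logic.

```java
// Hypothetical model, not Druid's TaskLockbox.
final class LockCoexistenceSketch
{
  enum LockType { EXCLUSIVE, SHARED, APPEND, REPLACE }

  static final class Lock
  {
    final LockType type;
    final String interval; // e.g. "2024-05-01/2024-05-02", kept as a string for brevity
    final String groupId;

    Lock(LockType type, String interval, String groupId)
    {
      this.type = type;
      this.interval = interval;
      this.groupId = groupId;
    }
  }

  // True when two locks may coexist regardless of their types,
  // because they cover the same interval and belong to the same task group.
  static boolean allowedDespiteTypeMismatch(Lock existing, Lock requested)
  {
    return existing.interval.equals(requested.interval)
           && existing.groupId.equals(requested.groupId);
  }
}
```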
Kashif Faraz e5b40b0b8c
Miscellaneous cleanup of load queue references (#16367)
Changes:
- Rename `DataSegmentChangeRequestAndStatus` to `DataSegmentChangeResponse`
- Rename `SegmentLoadDropHandler.Status` to `SegmentChangeStatus`
- Remove method `CoordinatorRunStats.getSnapshotAndReset()` as it was used only in
load queue peon implementations. Using an atomic reference is much simpler.
- Remove `ServerTestHelper.MAPPER`. Use existing `TestHelper.makeJsonMapper()` instead.
2024-05-02 15:59:50 +05:30
Zoltan Haindrich 2d0e86cbdc
Use quidem to run tests (#16249)
* test scoped jdbc driver for druidtest:/// backed DruidAvaticaTestDriver
** DecoupledTestConfig is used inside the URI - this will make it possible to attach to existing things more easily
* DruidQuidemTestBase can be used to create module level set of quidem tests
* added quidem commands: !convertedPlan, !logicalPlan, !druidPlan, !nativePlan
** for these I've used some values of the Hook which was there in calcite
* there are some shortcuts with proxies (they are only used during testing) - we can probably remove those later
2024-05-02 02:12:42 -04:00
Gian Merlino 5d1950d451
MSQ controller: Support in-memory shuffles; towards JVM reuse. (#16168)
* MSQ controller: Support in-memory shuffles; towards JVM reuse.

This patch contains two controller changes that make progress towards a
lower-latency MSQ.

First, support for in-memory shuffles. The main feature of in-memory shuffles,
as far as the controller is concerned, is that they are not fully buffered. That
means that whenever a producer stage uses in-memory output, its consumer must run
concurrently. The controller determines which stages run concurrently, and when
they start and stop.

"Leapfrogging" allows any chain of sort-based stages to use in-memory shuffles
even if we can only run two stages at once. For example, in a linear chain of
stages 0 -> 1 -> 2 where all do sort-based shuffles, we can use in-memory shuffling
for each one while only running two at once. (When stage 1 is done reading input
and about to start writing its output, we can stop 0 and start 2.)

1) New OutputChannelMode enum attached to WorkOrders that tells workers
   whether stage output should be in memory (MEMORY), or use local or durable
   storage.

2) New logic in the ControllerQueryKernel to determine which stages can use
   in-memory shuffling (ControllerUtils#computeStageGroups) and to launch them
   at the appropriate time (ControllerQueryKernel#createNewKernels).

3) New "doneReadingInput" method on Controller (passed down to the stage kernels)
   which allows stages to transition to POST_READING even if they are not
   gathering statistics. This is important because it enables "leapfrogging"
   for HASH_LOCAL_SORT shuffles, and for GLOBAL_SORT shuffles with 1 partition.

4) Moved result-reading from ControllerContext#writeReports to new QueryListener
   interface, which ControllerImpl feeds results to row-by-row while the query
   is still running. Important so we can read query results from the final
   stage using an in-memory channel.

5) New class ControllerQueryKernelConfig holds configs that control kernel
   behavior (such as whether to pipeline, maximum number of concurrent stages,
   etc). Generated by the ControllerContext.

Second, a refactor towards running workers in persistent JVMs that are able to
cache data across queries. This is helpful because I believe we'll want to reuse
JVMs and cached data for latency reasons.

1) Move creation of WorkerManager and TableInputSpecSlicer to the
   ControllerContext, rather than ControllerImpl. This allows managing workers and
   work assignment differently when JVMs are reusable.

2) Lift the Controller Jersey resource out from ControllerChatHandler to a
   reusable resource.

3) Move memory introspection to a MemoryIntrospector interface, and introduce
   ControllerMemoryParameters that uses it. This makes it easier to run MSQ in
   process types other than Indexer and Peon.

Both of these areas will have follow-ups that make similar changes on the
worker side.

* Address static checks.

* Address static checks.

* Fixes.

* Report writer tests.

* Adjustments.

* Fix reports.

* Review updates.

* Adjust name.

* Small changes.
2024-04-30 21:30:27 -07:00
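The "leapfrogging" idea in the MSQ controller commit above (#16168) can be illustrated with a toy schedule for a linear chain of sort-based stages when at most two stages run at once. This is a deliberately simplified model with hypothetical names; it does not reflect how `ControllerQueryKernel` actually computes stage groups.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of leapfrogging over a linear chain 0 -> 1 -> ... -> n-1.
final class LeapfrogSketch
{
  // Returns the successive {producer, consumer} pairs that run concurrently.
  // Moving from one pair to the next corresponds to stopping the older stage
  // and starting the next one once the middle stage is done reading its input.
  static List<int[]> schedule(int stageCount)
  {
    List<int[]> windows = new ArrayList<>();
    for (int producer = 0; producer + 1 < stageCount; producer++) {
      windows.add(new int[]{producer, producer + 1});
    }
    return windows;
  }
}
```

For the three-stage example in the commit message, the schedule is the pair (0, 1) followed by the pair (1, 2), so only two stages are ever active at the same time.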
Kashif Faraz 51104e8bb3
Docs: Remove references to Zk-based segment loading (#16360)
Follow up to #15705

Changes:
- Remove references to ZK-based segment loading in the docs
- Fix doc for existing config `druid.coordinator.loadqueuepeon.http.repeatDelay`
2024-05-01 08:06:00 +05:30
John Gozde 834b0eddeb
web-console: ACE editor refactoring (#16359)
* Move druid-sql completions to dsql mode

* Use font-size 12

* Convert ace-modes to typescript

* Move aceCompleters to class member

* Use namespace imports
2024-04-30 11:53:39 -07:00
AmatyaAvadhanula 42e99bf912
Add new index on datasource and task_allocator_id for pending segments (#16355)
* Add pending segments index on datasource and task_allocator_id

* Use both datasource and task_allocator_id in queries
2024-04-30 15:48:16 +05:30
Laksh Singla e695e52d3f
Improve code flow in the First/Last vector aggregators and unify the numeric aggregators with the String implementations (#16230)
This PR fixes the first and last vector aggregators and improves their readability. The following changes are introduced:

    The folding is broken in the vectorized versions. We consider time before checking the folded object.

    If the numerical aggregator gets passed any other object type for some other reason (like String), then the aggregator considers it to be folded, even though it shouldn’t be. We should convert these objects to the desired type, and aggregate them properly.

    The aggregators must properly use generics. This would minimize the ClassCastException issues that can happen with mixed segment types. We are unifying the string first/last aggregators with numeric versions as well.

    The aggregators must aggregate null values (https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/query/aggregation/first/StringFirstLastUtils.java#L55-L56 ). The aggregator should only ignore pairs with time == null, and not value == null

    Time nullity is ignored when trying to vectorize the data.

    String versions are initialized with DateTimes.MIN, which is equal to Long.MIN / 2. This can cause incorrect results in case the user enters a custom time column. NOTE: This is still present because it would require a larger refactor in all of the versions.

    There is a difference in what users might expect from the results because the code flow is changed (for example, the direction of the for loops, etc), however, this will only change the results, and not the contract set by first/last aggregators, which is that if multiple values have the same timestamp, then any of them can get picked.

    If the column is non-existent, the users might expect a change in the timestamp from DateTime.MAX to Long.MAX, because the code incorrectly used DateTime.MAX to initialize the aggregator, however, in case of a custom timestamp column, this might not be the case. The SQL query might be prohibited from using any Long since it requires a cast to the timestamp function that can fail, but AFAICT native queries don't have such limitations.
2024-04-30 15:13:14 +05:30
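Two points from the commit above lend themselves to a small sketch: pairs with a null time are ignored while null values are still aggregated, and ties on the timestamp may resolve to any of the tied values. The class below is a hypothetical, non-vectorized "last" aggregator written only to show that contract; it is not Druid's implementation.

```java
// Hypothetical sketch of a "last" aggregator over (time, value) pairs.
final class LastValueSketch
{
  private boolean seen = false;
  private long lastTime = Long.MIN_VALUE;
  private Object lastValue = null;

  void aggregate(Long time, Object value)
  {
    if (time == null) {
      // Only pairs without a timestamp are skipped.
      return;
    }
    // A null value is a legitimate result and is kept.
    // On equal timestamps either value may win; this sketch keeps the later one.
    if (!seen || time >= lastTime) {
      seen = true;
      lastTime = time;
      lastValue = value;
    }
  }

  boolean hasValue()
  {
    return seen;
  }

  long getTime()
  {
    return lastTime;
  }

  Object getValue()
  {
    return lastValue;
  }
}
```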
Laksh Singla 26d63e7b65
Prevent joining on nested arrays and complex types (#16349)
#16068 modified DimensionHandlerUtils to accept complex types as dimensions. This had the unintended side effect of allowing joins on complex types (this was not guarded against explicitly, and it does not work).
This PR modifies the IndexedTable to reject building the index on the complex types to prevent joining on complex types. The PR adds back the check in the same place, explicitly.
2024-04-30 11:36:53 +05:30
Adarsh Sanjeev fb63520de9
Add tests for ProcessorManager (#16327)
* Add tests for ProcessorManager
2024-04-30 09:35:26 +05:30
Alberic Liu 736a2ab7c1
update code style for task type (#16343)
* update code style for task type

* address the comments
2024-04-29 14:42:55 -07:00
Kashif Faraz aa46314971
Remove usage of skife from DruidCoordinatorConfig (#15705)
* Remove usage of skife from DruidCoordinatorConfig

* Remove old config class

* Address static checks

* Fix tests

* Remove unnecessary mocks

* Fix config typos

* Fix config condition

* Fix test, spotbug check

* Move validation to DruidCoordinatorConfig

* Move DruidCoordinatorConfig to different package

* Fix validation of killunusedconfig

* Simplify and fix KillSupervisorsCustomDuty

* Address review comments

* Fix new tests

* Add KillUnusedSchemasConfig

* Remove KillUnusedSchemasConfig

* Minor renames
2024-04-29 11:37:13 -07:00
Abhishek Radhakrishnan 1d7595f3f7
Support for filters in the Druid Delta Lake connector (#16288)
* Delta Lake support for filters.

* Updates

* cleanup comments

* Docs

* Remove Enclosed runner

* Rename

* Cleanup test

* Serde test for the Delta input source and fix jackson annotation.

* Updates and docs.

* Update error messages to be clearer

* Fixes

* Handle NumberFormatException to provide a nicer error message.

* Apply suggestions from code review

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>

* Doc fixes based on feedback

* Yes -> yes in docs; reword slightly.

* Update docs/ingestion/input-sources.md

Co-authored-by: Laksh Singla <lakshsingla@gmail.com>

* Update docs/ingestion/input-sources.md

Co-authored-by: Laksh Singla <lakshsingla@gmail.com>

* Documentation, javadoc and more updates.

* Not with an or expression end-to-end test.

* Break up =, >, >=, <, <= into its own types instead of sub-classing.

---------

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
Co-authored-by: Laksh Singla <lakshsingla@gmail.com>
2024-04-29 11:31:36 -07:00
Adithya Chakilam f8015eb02a
Add config lagAggregate to LagBasedAutoScalerConfig (#16334)
Changes:
- Add new config `lagAggregate` to `LagBasedAutoScalerConfig`
- Add field `aggregateForScaling` to `LagStats`
- Use the new field/config to determine which aggregate to use to compute lag
- Remove method `Supervisor.computeLagForAutoScaler()`
2024-04-29 22:20:41 +05:30
Kashif Faraz 89ec0da5c5
Disable upload of coverage report to codecov.io (#16347) 2024-04-29 21:04:55 +05:30
Akshat Jain 9d2cae40c3
Add support for selective loading of lookups in the task layer (#16328)
Changes:
- Add `LookupLoadingSpec` to support 3 modes of lookup loading: ALL, NONE, ONLY_REQUIRED
- Add method `Task.getLookupLoadingSpec()`
- Do not load any lookups for `KillUnusedSegmentsTask`
2024-04-29 07:19:59 +05:30
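The three lookup loading modes named in the commit above (ALL, NONE, ONLY_REQUIRED) can be sketched as follows. Only the mode names come from the commit message; the class, the method, and the idea of intersecting the available lookups with a required set are assumptions made for illustration and may differ from the real `LookupLoadingSpec`.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not Druid's LookupLoadingSpec.
final class LookupLoadingSketch
{
  enum Mode { ALL, NONE, ONLY_REQUIRED }

  // Decides which of the available lookups a task should load.
  static Set<String> lookupsToLoad(Mode mode, Set<String> available, Set<String> required)
  {
    switch (mode) {
      case ALL:
        return new HashSet<>(available);
      case NONE:
        return Collections.emptySet();
      case ONLY_REQUIRED:
        // Assumption: load only lookups that are both available and required.
        Set<String> result = new HashSet<>(available);
        result.retainAll(required);
        return result;
      default:
        throw new IllegalStateException("Unknown mode: " + mode);
    }
  }
}
```

Under this model, a task such as the kill task mentioned above would simply use the NONE mode and load nothing.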
Bünyamin 9aef8e02ef
Expand coverage for default mapping (#16340) 2024-04-27 17:39:07 +05:30
Gian Merlino db82adcdfd
SCALAR_IN_ARRAY: Optimization and behavioral follow-ups. (#16311)
* Four changes to scalar_in_array as follow-ups to #16306:

1) Align behavior for `null` scalars to the behavior of the native `in` and `inType` filters: return `true` if the array itself contains null, else return `null`.

2) Rename the class to more closely match the function name.

3) Add a specialization for constant arrays, where we build a `HashSet`.

4) Use `castForEqualityComparison` to properly handle cross-type comparisons.
   Additional tests verify comparisons between LONG and DOUBLE are now
   handled properly.

* Fix spelling.

* Adjustments from review.
2024-04-26 16:01:17 -07:00
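Points 1 and 3 of the commit above (null-scalar behavior and the `HashSet` specialization for constant arrays) can be sketched together. The null-scalar branch follows the commit message; the result for a non-null scalar that is not found is an assumption, and cross-type comparison (point 4) is not modeled. The class name is hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of scalar_in_array over a constant array.
final class ScalarInArraySketch
{
  private final Set<Object> matchValues;

  ScalarInArraySketch(Object[] constantArray)
  {
    // Built once for a constant array, so each lookup is a hash probe.
    this.matchValues = new HashSet<>(Arrays.asList(constantArray));
  }

  // Returns TRUE, FALSE, or null (unknown).
  Boolean eval(Object scalar)
  {
    if (scalar == null) {
      // From the commit message: true if the array contains null, else null.
      return matchValues.contains(null) ? Boolean.TRUE : null;
    }
    // Assumption: a non-null scalar simply reports membership.
    return matchValues.contains(scalar);
  }
}
```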
Andreas Maechler 9cd1890855
Fix log count (#16341) 2024-04-26 14:04:19 -07:00
Charles Smith 4e3cb9c251
change ownership of /opt/shared to druid (#16253) 2024-04-26 21:16:00 +05:30
zachjsh 365cd7e8e7
INSERT/REPLACE can omit clustering when catalog has default (#16260)
* * fix

* * fix

* * address review comments

* * fix

* * simplify tests

* * fix complex type nullability issue

* * implement and add tests

* * address review comments

* * address test review comments

* * fix checkstyle

* * fix dependencies

* * all tests passing

* * cleanup

* * remove unneeded code

* * remove unused dependency

* * fix checkstyle
2024-04-26 10:19:45 -04:00
Gian Merlino 64a6fc8fc0
JSONFlattenerMaker: Speed up charsetFix. (#16212)
JSON parsing has this function "charsetFix" that fixes up strings
so they can round-trip through UTF-8 encoding without loss of
fidelity. It was originally introduced to fix a bug where strings
could be sorted, encoded, then decoded, and the resulting decoded
strings could end up no longer in sorted order (due to character
swaps during the encode operation).

The code has been in place for some time, and only applies to JSON.
I am not sure if it needs to apply to other formats; it's certainly
more difficult to get broken strings from other formats. It's easy
in JSON because you can write a JSON string like "foo\uD900".

At any rate, this patch does not revisit whether charsetFix should
be applied to all formats. It merely optimizes it for the JSON case.
The function works by using CharsetEncoder.canEncode, which is
a relatively slow method (just as expensive as actually encoding).
This patch adds a short-circuit to skip canEncode if all chars in
a string are in the basic multilingual plane (i.e. if no chars are
surrogates).
2024-04-26 10:46:07 +05:30
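The short-circuit described in the charsetFix commit above can be sketched as below: scan for surrogate chars first, and call `CharsetEncoder.canEncode` only when one is present. The class name and the repair step (a round trip through UTF-8 bytes, which replaces unencodable chars) are assumptions for illustration and are not copied from `JSONFlattenerMaker`.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a charsetFix with a fast path for BMP-only strings.
final class CharsetFixSketch
{
  static String charsetFix(String s)
  {
    if (s == null || !containsSurrogate(s)) {
      // Fast path: without surrogates the string always encodes to UTF-8.
      return s;
    }
    CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
    if (encoder.canEncode(s)) {
      // Surrogates are properly paired; nothing to fix.
      return s;
    }
    // Assumed repair: round-trip through UTF-8, replacing unencodable chars.
    return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
  }

  static boolean containsSurrogate(String s)
  {
    for (int i = 0; i < s.length(); i++) {
      if (Character.isSurrogate(s.charAt(i))) {
        return true;
      }
    }
    return false;
  }
}
```

A string such as "foo\uD900" from the commit message takes the slow path, while ordinary text never reaches `canEncode`.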
Adarsh Sanjeev 9a2d7c28bc
Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
Arun Ramani 126a0c219a
Surface lock revocation exceptions in task status (#16325) 2024-04-26 08:39:44 +05:30
Kashif Faraz 4b6748bdc9
Update default value of useMaxMemoryEstimates for Hadoop jobs (#16280) 2024-04-26 08:07:21 +05:30
Gian Merlino 68d6e682e8
Fix TimeBoundary planning when filters require virtual columns. (#16337)
The timeBoundary query does not support virtual columns, so we should
avoid it if the query requires virtual columns.
2024-04-25 16:49:40 -07:00
AmatyaAvadhanula 31eee7d51e
Check for handoff of upgraded segments (#16162)
Changes:
1) Check for handoff of upgraded realtime segments.
2) Drop sink only when all associated realtime segments have been abandoned.
3) Delete pending segments upon commit to prevent unnecessary upgrades and
partition space exhaustion when a concurrent replace happens. This also prevents
potential data duplication.
4) Register pending segment upgrade only on those tasks with which the segment is associated.
2024-04-25 22:03:38 +05:30