14569 Commits

Author SHA1 Message Date
Vadim Ogievetsky
26e2ca66d7
update to node 20 (#17363) 2024-10-16 13:15:10 -07:00
Vadim Ogievetsky
877784e5fd
Web console: add expectedLoadTimeMillis (#17359)
* add expectedLoadTimeMillis

* make spec cleaning less agro

* more cleanup
2024-10-16 13:14:27 -07:00
Vadim Ogievetsky
8ddb316e68
Web console: fix progress indication for table input (#17334)
* fix progress indication for table input

* fix snapshot
2024-10-16 13:14:11 -07:00
Suraj Goel
c1fe1ac898
Remove EOL file-loader dependency (#17346) 2024-10-16 11:11:06 -07:00
George Shiqi Wu
a664fc8be3
always set taskLocation (#17350) 2024-10-16 14:02:39 -04:00
Kashif Faraz
df3a307e83
Do not use cachingCost balancer strategy in Docker environment (#17349) 2024-10-16 20:59:46 +05:30
TessaIO
a9f582711e
Fix loading lookup extension (#17212)
We introduce the option to iterate over the data fetched by the dataFetcher for loadingLookups in the lookups-cached-single extension. We also handle the case where the data exists in Druid but not in the underlying data fetcher (in our case, the JDBC data fetcher), so that the value returned is null.

Signed-off-by: TessaIO <ahmedgrati1999@gmail.com>
2024-10-16 07:28:32 -07:00
Pranav
f80e2c229e
Fix pip installation after ubuntu upgrade (#17358) 2024-10-15 17:50:18 -07:00
Adithya Chakilam
c57bd3b438
supervisor/autoscaler: Skip scaling when partitions are less than minTaskCount (#17335) 2024-10-15 14:12:53 -07:00
Hardik Bajaj
32ce341a6c
Fix RejectedExecutionHandler of Blocking Single Threaded executor (#17146)
Throw RejectedExecutionException when submitting tasks to executor that has been shut down.
2024-10-15 22:02:34 +05:30
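The behavior described in #17146 can be sketched as follows. This is a minimal illustration and not Druid's actual code: the class `BlockingExecutorSketch` and the factory `newBlockingSingleThreaded` are hypothetical names. The handler blocks the submitter while the queue is full, but throws `RejectedExecutionException` once the executor has been shut down.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BlockingExecutorSketch {
  // Hypothetical factory: one worker thread, bounded queue, blocking on overflow.
  static ExecutorService newBlockingSingleThreaded(int capacity) {
    RejectedExecutionHandler handler = (task, executor) -> {
      // After shutdown, fail fast instead of queueing a task that would never run.
      if (executor.isShutdown()) {
        throw new RejectedExecutionException("Executor is shut down, rejecting task");
      }
      try {
        // The queue is full: block the submitting thread until there is room.
        executor.getQueue().put(task);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new RejectedExecutionException("Interrupted while waiting to enqueue", e);
      }
    };
    return new ThreadPoolExecutor(
        1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(capacity), handler);
  }
}
```

Without the `isShutdown()` check, a task submitted after shutdown would be placed on the queue and silently never executed.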
Clint Wylie
c2149d59a7
remove stale comment in QueryableIndexCursorHolder (#17333) 2024-10-11 16:23:59 -07:00
Gian Merlino
b287b219a8
MSQ: Include stageId, workerNumber in processing thread names. (#17324)
* MSQ: Include stageId, workerNumber in processing thread names.

Helps identify which query was running in a thread dump.

* s/dart/msq/
2024-10-11 08:37:15 -07:00
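To illustrate the thread-naming idea in #17324, here is a minimal sketch, not the MSQ implementation. The current thread is renamed while it processes work, so a thread dump shows which stage and worker it was serving. The class `ThreadNameSketch`, both method names, and the suffix format (including the query ID) are assumptions.

```java
import java.util.function.Supplier;

final class ThreadNameSketch {
  // Hypothetical helper: appends a suffix to the current thread's name for the
  // duration of the work, then restores the original name.
  static <T> T withThreadNameSuffix(String suffix, Supplier<T> work) {
    Thread thread = Thread.currentThread();
    String originalName = thread.getName();
    thread.setName(originalName + "[" + suffix + "]");
    try {
      return work.get();
    } finally {
      // Always restore, even if the work throws.
      thread.setName(originalName);
    }
  }

  // Hypothetical suffix format; the format actually used by MSQ may differ.
  static String suffixFor(String queryId, int stageId, int workerNumber) {
    return queryId + "_stage" + stageId + "_worker" + workerNumber;
  }
}
```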
Gian Merlino
a0c29f8bbb
MSQ WorkerResource: Fix timeout handler for httpGetChannelData. (#17328)
The timeout handler should fire if the response has not been handled yet
(i.e. if responseResolved was previously false). However, it erroneously
fires only if the response *was* handled. This causes HTTP 500 errors if
the timeout actually does fire. The timeout is 30 seconds, which can be
hit during pipelined queries, if an earlier stage of the query hasn't
produced its first frame within 30 seconds.

This fixes a regression introduced in #17140.
2024-10-11 16:29:04 +05:30
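The condition fixed in #17328 can be sketched with a compare-and-set on a resolved flag. This is a simplified illustration under assumptions: `TimeoutHandlerSketch`, `onData`, `onTimeout`, and `outcome` are hypothetical names, and only the `responseResolved` flag comes from the commit message.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class TimeoutHandlerSketch {
  // Flips from false to true exactly once, for whichever handler gets there first.
  private final AtomicBoolean responseResolved = new AtomicBoolean(false);
  private volatile String outcome = "pending";

  // Called when channel data is ready. Returns true if this call resolved the response.
  boolean onData() {
    if (responseResolved.compareAndSet(false, true)) {
      outcome = "data";
      return true;
    }
    return false;
  }

  // Called when the timeout elapses. Returns true if this call resolved the response.
  boolean onTimeout() {
    // Correct check: act only if the response was previously unresolved.
    // The regression corresponds to acting only when it was already resolved.
    if (responseResolved.compareAndSet(false, true)) {
      outcome = "timeout";
      return true;
    }
    return false;
  }

  String outcome() {
    return outcome;
  }
}
```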
Karan Kumar
034bb9dbea
Removing enable windowing from MSQ tests. (#17276) 2024-10-11 05:33:27 +02:00
Shivam Garg
6898a5a359
Removed Microsecond from Extract function (#17247) 2024-10-11 05:32:26 +02:00
Clint Wylie
a6236c3d15
add substituteCombiningFactory implementations for datasketches aggs (#17314)
Follow up to #17214, adds implementations for substituteCombiningFactory so that more
datasketches aggs can match projections, along with some projections tests for datasketches.
2024-10-10 16:14:06 +05:30
Suneet Saldanha
fb38e483cf
statsd-emitter: Add dutyGroup to coordinator global time metric (#17320)
The duty group is a low cardinality dimension and can be helpful in providing insight
into whether a particular duty group is not running fast enough on the coordinator.
2024-10-10 16:03:50 +05:30
Gian Merlino
1d95ef34f0
Logger: Log context of DruidExceptions. (#17316)
* Logger: Log context of DruidExceptions.

There is often interesting and unique information available in the
"context" of a DruidException. This information is additive to both
the message and the cause, and was missed when we log. This patch adds
the DruidException context to log messages whenever stack traces are
enabled.

* Only log nonempty contexts.
2024-10-10 01:44:50 -07:00
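As a rough sketch of the logging rule in #17316, assuming a simple string-based log line: the context is appended only when stack traces are enabled and the context is nonempty. The class `ExceptionContextLogSketch`, the method `logLine`, and the output format are hypothetical and do not reflect Druid's Logger API.

```java
import java.util.Map;

final class ExceptionContextLogSketch {
  // Hypothetical formatter: returns the text that would be written to the log.
  static String logLine(String message, Map<String, String> context, boolean stackTracesEnabled) {
    // Only log nonempty contexts, and only when stack traces are enabled.
    if (!stackTracesEnabled || context == null || context.isEmpty()) {
      return message;
    }
    return message + " (context: " + context + ")";
  }
}
```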
Gian Merlino
074944e02c
Dart: Only use historicals as workers. (#17319)
Only historicals load the Dart worker modules. Other types of servers in
the server view (such as realtime tasks) should not be included.
2024-10-10 13:47:58 +05:30
Gian Merlino
4092f3fe47
MSQ: Call "onQueryComplete" after the query is closed. (#17313)
This fixes a concurrency issue where, for failed queries, "onQueryComplete"
could be called concurrently with "onResultsStart" or "onResultRow". Fully
closing the controller ensures that the result reader is no longer active,
which eliminates the race.
2024-10-10 10:44:44 +05:30
Gian Merlino
b27712933e
MSQ: Use leaf worker count for stages that have any leaf inputs. (#17312)
Previously, the leaf worker count was used for stages that have *no*
stage inputs. It should actually be used for stages that have *any*
non-broadcast, non-stage inputs.

This fixes a bug with broadcast joins. In a broadcast join, the stage has
both a table and a broadcast stage as input. Previously, it would be planned
using the non-leaf worker count. It should actually be planned using the
leaf worker count.
2024-10-10 10:44:31 +05:30
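The planning rule from #17312 can be sketched as below. This is an illustrative model only: `WorkerCountSketch`, `Input`, and `workerCount` are hypothetical names, and the real planner works on richer input specs.

```java
import java.util.List;

final class WorkerCountSketch {
  // Hypothetical model of a stage input.
  static final class Input {
    final boolean fromStage;  // true if the input is the output of another stage
    final boolean broadcast;  // true if the input is broadcast to every worker

    Input(boolean fromStage, boolean broadcast) {
      this.fromStage = fromStage;
      this.broadcast = broadcast;
    }
  }

  // Use the leaf worker count if ANY input is both non-broadcast and non-stage.
  static int workerCount(List<Input> inputs, int leafWorkerCount, int nonLeafWorkerCount) {
    for (Input input : inputs) {
      if (!input.broadcast && !input.fromStage) {
        return leafWorkerCount;
      }
    }
    return nonLeafWorkerCount;
  }
}
```

For a broadcast join with a table input and a broadcast stage input, this returns the leaf worker count, matching the behavior described in the commit message.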
Kashif Faraz
3f797c52d0
Fix duplicate compaction task launched by OverlordCompactionScheduler (#17287)
Description
-----------
The `OverlordCompactionScheduler` may sometimes launch a duplicate compaction
task for an interval that has just been compacted.

This may happen as follows:
- Scheduler launches a compaction task for an uncompacted interval.
- While the compaction task is running, the `CompactionStatusTracker` does not consider
this interval as compactible and returns the `CompactionStatus` as `SKIPPED` for it.
- As soon as the compaction task finishes, the `CompactionStatusTracker` starts considering
the interval eligible for compaction again.
- This interval remains eligible for compaction until the newly published segments are polled
from the database.
- Once the new segments have been polled, the `CompactionStatus` of the interval changes
to `COMPLETE`.

Change
--------
- Keep track of the `snapshotTime` in `DataSourcesSnapshot`. This time represents the start of the poll.
- Use the `snapshotTime` to determine if a poll has happened after a compaction task completed.
- If not, then skip the interval to avoid launching duplicate tasks.
- For tests, use a future `snapshotTime` to ensure that compaction is always triggered.
2024-10-10 08:44:09 +05:30
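The check described under "Change" in #17287 can be sketched as a single comparison. This is a minimal illustration, assuming the tracker records when the last compaction task for an interval completed; `CompactionSkipSketch` and `shouldSkip` are hypothetical names, and only `snapshotTime` comes from the commit message.

```java
import java.time.Instant;

final class CompactionSkipSketch {
  // Returns true if the interval should be skipped for now.
  // lastTaskCompletedAt is null when no compaction task has completed for the interval.
  static boolean shouldSkip(Instant lastTaskCompletedAt, Instant snapshotTime) {
    if (lastTaskCompletedAt == null) {
      return false;
    }
    // snapshotTime is the START of the poll. A poll that started before the task
    // completed may not include the newly published segments, so skip the interval.
    return !snapshotTime.isAfter(lastTaskCompletedAt);
  }
}
```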
Karan Kumar
4fdb38118a
CVE suppression for various dependencies. (#17307) 2024-10-09 18:07:09 +05:30
AmatyaAvadhanula
88d26e4541
Fix queries for updated segments on SinkQuerySegmentWalker (#17157)
Fix how SinkQuerySegmentWalker uses segment descriptors from queries when segments have been upgraded as a result of a concurrent replace.

Concurrent append and replace:
With the introduction of concurrent append and replace, for a given interval:
- The same sink can correspond to a base segment V0_x0, and have multiple mappings to higher versions with distinct partition numbers, such as V1_x1, ..., Vn_xn.
- The initial segment allocation can happen on version V0, but several allocations with different versions, spanning V0 to Vn, can happen during the lifecycle of a task.

Changes:
- Maintain a new timeline of overshadowable objects, each holding a SegmentDescriptor.
- Every segment allocation or version upgrade adds the latest segment descriptor to this timeline.
- Iterate over this timeline instead of the sinkTimeline to get the segment descriptors in getQueryRunnerForIntervals.
- Also maintain a mapping of each upgraded segment to its base segment.
- When a sink is needed to process the query, find the base segment corresponding to the given descriptor, and then use the sinkTimeline to find its chunk.
2024-10-09 14:43:17 +05:30
Vadim Ogievetsky
a395368622
run npm audit fix (#17290) 2024-10-08 16:44:09 -07:00
Vadim Ogievetsky
4570809b4a
better timing bar styling (#17295) 2024-10-08 16:30:58 -07:00
anny-imply
dca69c5761
update line in architecture md (#17289) 2024-10-08 11:51:47 -07:00
Gian Merlino
baa16f30f6
DartWorkerContext: Return the correct workerId(). (#17280)
Prior to this patch, the workerId() method did not actually return
the worker ID. It returned some other string that had similar information,
but was different.

This caused the /druid/dart-worker/workers API to return an internal
server error. The API is useful for debugging, although it is not used
during actual queries.
2024-10-08 09:52:55 -07:00
Charles Smith
5ed68622c3
[Docs] Update known issues for window functions (#17097)
* draft update to known issues

* Update known issues

Remove addressed known issues. Clarify the issue with SELECT * queries.
2024-10-08 08:47:13 -07:00
Gian Merlino
152330c5a8
WorkerManager: Correct javadoc for "stop". (#17279)
The javadoc had a factual error: Dart's implementation does not in
fact always return immediately.
2024-10-08 15:49:43 +05:30
Gian Merlino
0a279e634a
DartSqlResource: Return HTTP 202 on cancellation even if no such query. (#17278)
Return HTTP 202 (Accepted) on cancellation, even if the requested query
ID was not found.

The main reason for this is that when the Router broadcasts DELETE requests
to all Brokers, it returns the response from one of them randomly. If we
return 404 when a query ID isn't found, then the Router randomly returns 404s
even when the query really was found and canceled.

This is also arguably still correct behavior. The cancellation request
*was* accepted, it just won't do anything because the query was not in
fact running.
2024-10-08 15:49:34 +05:30
Gian Merlino
01baf99148
DartWorkerModule: Replace en dash with regular dash. (#17281)
Due to a typo, the thread name of the worker executor used an en dash (–)
rather than a regular hyphen (-). This was unintentional, and makes it
difficult to search for in thread dumps.
2024-10-08 15:48:10 +05:30
Gian Merlino
2309aa7bdf
DartSqlResource: Add controllerHost to GetQueriesResponse. (#17283)
This helps find the specific Broker that is executing a query.
2024-10-08 15:47:32 +05:30
Gian Merlino
9921ac1b19
DartSqlResource: Sort queries by start time. (#17282)
* DartSqlResource: Sort queries by start time.

This keeps the list of queries returned by the API in a consistent order.

* Fix test.
2024-10-08 15:47:21 +05:30
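A consistent ordering like the one in #17282 can be sketched with a comparator. The types and names here (`QuerySortSketch`, `QueryInfo`, `sortedByStart`) are hypothetical, and the tie-break on query ID is an assumption added so that equal start times still sort deterministically.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class QuerySortSketch {
  // Hypothetical stand-in for an entry in the list of running queries.
  static final class QueryInfo {
    final String id;
    final long startTimeMillis;

    QueryInfo(String id, long startTimeMillis) {
      this.id = id;
      this.startTimeMillis = startTimeMillis;
    }
  }

  // Returns a sorted copy: earliest start time first, then by ID (assumed tie-break).
  static List<QueryInfo> sortedByStart(List<QueryInfo> queries) {
    List<QueryInfo> copy = new ArrayList<>(queries);
    copy.sort(
        Comparator.comparingLong((QueryInfo q) -> q.startTimeMillis)
            .thenComparing(q -> q.id));
    return copy;
  }
}
```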
Gian Merlino
06bbdb38ce
MSQ: Allow for worker gaps. (#17277)
In a Dart query, all Historicals are given worker IDs, but not all of them
are going to actually be started or receive work orders. This can create gaps
in the set of workers. For example, workers 1 and 3 could have work assigned
while workers 0 and 2 do not.

This patch updates ControllerStageTracker and WorkerInputs to handle such
gaps, by using the set of actual worker numbers, rather than 0..workerCount,
in various places.
2024-10-08 15:07:57 +05:30
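The difference between iterating 0..workerCount and using the set of actual worker numbers (#17277) can be sketched as follows. This is an illustration only; `WorkerGapSketch` and both method names are hypothetical and do not mirror ControllerStageTracker or WorkerInputs.

```java
import java.util.Set;

final class WorkerGapSketch {
  // Range-based check: assumes every worker number in 0..workerCount-1 is active.
  static boolean allDoneByRange(int workerCount, Set<Integer> reported) {
    for (int worker = 0; worker < workerCount; worker++) {
      if (!reported.contains(worker)) {
        return false;
      }
    }
    return true;
  }

  // Set-based check: only the workers that were actually assigned work must report.
  static boolean allDoneBySet(Set<Integer> assigned, Set<Integer> reported) {
    return reported.containsAll(assigned);
  }
}
```

With workers 1 and 3 assigned out of four worker IDs, the range-based check never completes because workers 0 and 2 never report, while the set-based check completes as soon as 1 and 3 have reported.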
Gian Merlino
4fbb129027
Improve javadocs for SegmentDescriptor. (#17274)
The javadoc for SegmentDescriptor discusses differences between it and
SegmentId, but misses the most important difference: SegmentDescriptor
can have a narrower interval than the segment being referenced.
2024-10-08 00:59:55 -07:00
Clint Wylie
ab0d6eb620
Fix string array grouping comparator (#17183) 2024-10-08 09:47:28 +05:30
Edgar Melendrez
a67a3c8e0a
[docs] update tutorial for Theta sketches (#16953)
* from start to step 3 of Ingest data using Theta sketches

* updated up to "Query the Theta sketch column"

* fixed sentence

* another typo

* using sql ingestion instead of batch-sql

* waiting for explanations on DS_THETA

* Revert "using sql ingestion instead of batch-sql"

This reverts commit b95fcb9b32608dba55deee9910e295f2391d77e2.

* just copy and pasting to where I was

* updated tutorial

* fixing images, and removing unused

* slightly updating explanation

* Update docs/tutorials/tutorial-sketches-theta.md

* Apply suggestions from code review

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* addressing comments in review

* made filter clause consistent with other instances

* Apply suggestions from code review

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

---------

Co-authored-by: Benedict Jin <asdf2014@apache.org>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2024-10-08 10:44:37 +08:00
317brian
9932f2e70a
docs: concurrent append and replace is GA (#17269) 2024-10-08 07:55:55 +05:30
AmatyaAvadhanula
f42ecc9f25
Fail concurrent replace tasks with finer segment granularity than append (#17265) 2024-10-08 07:35:13 +05:30
George Shiqi Wu
5d7c7a87ec
Add maximumCapacity to taskRunner (#17107)
* Add maximumCapacity to taskRunner

* fix tests

* pr comments
2024-10-07 15:03:51 -04:00
AmatyaAvadhanula
ff97c67945
Fix batch segment allocation failure with replicas (#17262)
Fixes #16587

Streaming ingestion tasks operate by allocating segments before ingesting rows.
These allocations happen across replicas which may send different requests but
must get the same segment id for a given (datasource, interval, version, sequenceName)
across replicas.

This patch fixes the bug by ignoring the previousSegmentId when skipLineageCheck is true.
2024-10-07 19:52:38 +05:30
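One way to picture the fix in #17262 is as a lookup key for allocation requests. This is a hypothetical sketch, not the Overlord's allocation code: `AllocationKeySketch` and `allocationKey` are invented names, and only `previousSegmentId`, `skipLineageCheck`, and the (datasource, interval, version, sequenceName) tuple come from the commit message.

```java
import java.util.Objects;

final class AllocationKeySketch {
  // Builds the key under which an allocation request is matched to an existing segment ID.
  static String allocationKey(
      String dataSource,
      String interval,
      String version,
      String sequenceName,
      String previousSegmentId,
      boolean skipLineageCheck) {
    // When skipLineageCheck is true, previousSegmentId must not influence the result,
    // so replicas that send different previous IDs still map to the same key.
    String lineage = skipLineageCheck ? "" : Objects.toString(previousSegmentId, "");
    return String.join("|", dataSource, interval, version, sequenceName, lineage);
  }
}
```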
Karan Kumar
6a4352f466
When removeNullBytes is set, length calculations did not take into account null bytes. (#17232)
* When removeNullBytes is set, length calculations did not take into account null bytes.
2024-10-07 18:02:52 +05:30
Adarsh Sanjeev
c9201ad658
Minor refactors to processing module (#17136)
Refactors a few things.

- Adds SemanticUtils maps to columns.
- Add some addAll functions to reduce duplication, and for future reuse.
- Refactor VariantColumnAndIndexSupplier to only take a SmooshedFileMapper instead.
- Refactor LongColumnSerializerV2 to have separate functions for serializing a value and null.
2024-10-07 13:18:35 +05:30
Vishesh Garg
7e35e50052
Fix issues with MSQ Compaction (#17250)
The patch makes the following changes:
1. Fixes a bug causing compaction to fail on array, complex, and other non-primitive-type columns
2. Updates compaction status check to be conscious of partition dimensions when comparing dimension ordering.
3. Ensures only string columns are specified as partition dimensions
4. Ensures `rollup` is true if and only if metricsSpec is non-empty
5. Ensures disjoint intervals aren't submitted for compaction
6. Adds `compactionReason` to compaction task context.
2024-10-06 21:48:26 +05:30
Shivam Garg
7d9e6d36fd
Upgraded Protobuf to 3.25.5 (#17249)
* Bump com.google.protobuf:protobuf-java from 3.24.0 to 3.25.5

Bumps [com.google.protobuf:protobuf-java](https://github.com/protocolbuffers/protobuf) from 3.24.0 to 3.25.5.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/protobuf_release.bzl)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v3.24.0...v3.25.5)

---
updated-dependencies:
- dependency-name: com.google.protobuf:protobuf-java
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updated the license

* Updated licenses.yaml

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-06 12:34:02 +05:30
Clint Wylie
0bd13bcd51
Projections prototype (#17214) 2024-10-05 04:38:57 -07:00
Clint Wylie
04fe56835d
add druid.expressions.allowVectorizeFallback and default to false (#17248)
Changes:

- Adds ExpressionProcessing.allowVectorizeFallback() and ExpressionProcessingConfig.allowVectorizeFallback(), defaulting to false until the few remaining bugs can be fixed (mostly complex types and some odd interactions with mixed types).
- Adds cannotVectorizeUnlessFallback functions to make it easy to toggle the default of this config, and easy to know what to delete when it is removed in the future.
2024-10-05 12:42:42 +05:30
Gian Merlino
d1709a329f
Dart: Skip final getCounters, postFinish to idle historicals. (#17255)
In a Dart query, all Historicals are given worker IDs, but not all of them
are going to actually be started or receive work orders.

Attempting to send a getCounters or postFinish command to a worker that
never received a work order is not only wasteful, but it causes errors due
to the workers not knowing about that query ID.
2024-10-04 23:05:21 -07:00
Vadim Ogievetsky
babf7f2ef6
Web console: don't assume that activeTasks is an array (#17254) 2024-10-04 16:01:13 -07:00