Commit Graph

14573 Commits

Author SHA1 Message Date
Abhishek Radhakrishnan 83299e9882
Miscellaneous cleanup in the supervisor API flow. (#17144)
Extracting a few miscellaneous non-functional changes from the batch supervisor branch:

- Replace anonymous inner classes with lambda expressions in the SQL supervisor manager layer
- Add explicit @Nullable annotations in DynamicConfigProviderUtils to make IDE happy
- Small variable renames (copy-paste error perhaps) and fix typos
- Add table name for this exception message: Delete the supervisor from the table[%s] in the database...
- Prefer CollectionUtils.isEmptyOrNull() over list == null || list.size() > 0. We can change the Precondition checks to throwing DruidException separately for a batch of APIs at a time.
2024-09-24 13:06:23 -07:00
George Shiqi Wu d1bfabbf4d
inter-Extension dependency support (#16973)
* update docs for kafka lookup extension to specify correct extension ordering

* fix first line

* test with extension dependencies

* save work on dependency management

* working dependency graph

* working pull

* fix style

* fix style

* remove name

* load extension dependencies recursively

* generate depenencies on classloader creation

* add check for circular dependencies

* fix style

* revert style changes

* remove mutable class loader

* clean up class heirarchy

* extensions loader test working

* add unit tests

* pr comments

* fix unit tests
2024-09-24 14:17:33 -04:00
Abhishek Radhakrishnan 5c862f6ed9
Refactor: Move streaming supervisor methods to `SeekableStreamSupervisor` (#17137)
The current Supervisor interface is primarily focused on streaming use cases. However, as we introduce supervisors for non-streaming use cases, such as the recently added CompactionSupervisor (and the upcoming BatchSupervisor), certain operations like resetting offsets, checkpointing, task group handoff, etc., are not really applicable to non-streaming use cases.

So the methods are split between:

1. Supervisor: common methods that are applicable to both streaming and non-streaming use cases
2. SeekableStreamSupervisor: Supervisor + streaming-only operations. The existing streaming-only overrides exist along with the new abstract method public abstract LagStats computeLagStats(), for which custom implementations already exist in the concrete types

This PR is primarily a refactoring change with minimal functional adjustments (e.g., throwing an exception in a few places in SupervisorManager when the supervisor isn't the expected SeekableStreamSupervisor type).
2024-09-24 10:46:37 -07:00
Vadim Ogievetsky 4872f0457a
switch to using arrays by default (#17133) 2024-09-24 09:20:04 -07:00
Vadim Ogievetsky 1a13bd5485
Make the tooltip better and improve structure (#17132) 2024-09-24 09:19:52 -07:00
Kashif Faraz 9670305669
Cleanup Coordinator logs, add duty status API (#16959)
Description
-----------
Coordinator logs are fairly noisy and don't give much useful information (see example below).
Even when the Coordinator misbehaves, these logs are not very useful.

Main changes
------------
- Add API `GET /druid/coordinator/v1/duties` that returns a status list of all duty groups currently running on the Coordinator
- Emit metrics `segment/poll/time`, `segment/pollWithSchema/time`, `segment/buildSnapshot/time`
- Remove redundant logs that indicate normal operation of well-tested aspects of the Coordinator

Refactors
---------
- Move some logic from `DutiesRunnable` to `CoordinatorDutyGroup`
- Move stats collection from `CollectSegmentAndServerStats` to `PrepareBalancerAndLoadQueues`
- Minor cleanup of class `DruidCoordinator`
- Clean up class `DruidCoordinatorRuntimeParams`
  - Remove field `coordinatorStartTime`. Maintain start time in `MarkOvershadowedSegmentsAsUnused` instead.
  - Remove field `MetadataRuleManager`. Pass supplier to constructor of applicable duties instead.
  - Make `usedSegmentsNewestFirst` and `datasourcesSnapshot` as non-nullable as they are always required.
2024-09-24 19:46:22 +05:30
Vishesh Garg f576e299db
Allow MSQ engine only for compaction supervisors (#17033)
#16768 added the functionality to run compaction as a supervisor on the overlord.
This patch builds on top of that to restrict MSQ engine to compaction in the supervisor-mode only.
With these changes, users can no longer add MSQ engine as part of datasource compaction config,
or as the default cluster-level compaction engine, on the Coordinator. 

The patch also adds an Overlord runtime property `druid.supervisor.compaction.engine=<msq/native>`
to specify the default engine for compaction supervisors.

Since these updates require major changes to existing MSQ compaction integration tests,
this patch disables MSQ-specific compaction integration tests -- they will be taken up in a follow-up PR.

Key changed/added classes in this patch:
* CompactionSupervisor
* CompactionSupervisorSpec
* CoordinatorCompactionConfigsResource
* OverlordCompactionScheduler
2024-09-24 17:19:16 +05:30
Clint Wylie 77a362c555
various fixes and improvements to vectorization fallback (#17098)
changes:
* add `ApplyFunction` support to vectorization fallback, allowing many of the remaining expressions to be vectorized
* add `CastToObjectVectorProcessor` so that vector engine can correctly cast any type
* add support for array and complex vector constants
* reduce number of cases which can block vectorization in expression planner to be unknown inputs (such as unknown multi-valuedness)
* fix array constructor expression, apply map expression to make actual evaluated type match the output type inference
* fix bug in array_contains where something like array_contains([null], 'hello') would return true if the array was a numeric array since the non-null string value would cast to a null numeric
* fix isNull/isNotNull to correctly handle any type of input argument
2024-09-24 04:29:08 -07:00
Atul Mohan c1f8ae25b5
Support Iceberg ingestion from REST based catalogs (#17124)
Adds support to the iceberg input source to read from Iceberg REST Catalogs.
2024-09-23 22:13:24 -07:00
Vadim Ogievetsky ed33dbb76d
Web console: add stage graph (#17135)
* add stage graph

* update snapshot
2024-09-23 21:19:13 -07:00
PANKAJ KUMAR 36dfff4b1a
Adding extra debug logs for the checkpoint logic (#16321)
Logging to understand checkpointing better in streaming ingestion
2024-09-24 09:38:46 +05:30
Adithya Chakilam 8eaac2c051
cgroup monitors: Add mem/disk/cpu usage metrics for V2 (#16905)
* cgroup monitors: Add mem/disk/cpu usage metrics for V2

* intellij inspection

* docs and checks

* fix-dos

* add comments

* comments
2024-09-23 20:32:01 -07:00
zachjsh ba8245f114
Properly url encode schema and table names when syncing catalog (#17131)
* SQL syntax error should target USER persona

* * revert change to queryHandler and related tests, based on review comments

* * add test

* Properly encode schema and table names when syncing catalog

Previously table names and schema names were not being properly url
encododed when requested. Now they are.
2024-09-23 23:23:26 -04:00
Abhishek Radhakrishnan 37a2a12d79
rerwrite node so dynamic parameter applies to ingest node as well. (#17126) 2024-09-23 12:49:46 -07:00
Sree Charan Manamala 67d361c9bf
Window Functions : Remove enable windowing flag (#17087) 2024-09-23 08:24:26 +02:00
Akshat Jain 40414cfe78
MSQ window functions: Reject MVDs during window processing (#17036)
* MSQ window functions: Reject MVDs during window processing

* MSQ window functions: Reject MVDs during window processing

* Remove parameterization from MSQWindowTest
2024-09-23 11:39:35 +05:30
Vivek Dhiman df680bab05
Introduced `includeTrailerHeader` to enable `TrailerHeaders` in response (#16672)
Introduced includeTrailerHeader to enable TrailerHeaders in response
If enabled, a header X-Error-Message will be added to indicate reasons for partial results.
2024-09-21 14:29:37 +05:30
Virushade a4c971c373
Fix Multiple GC declared error when running Druid Cluster (#17078)
* Remove UseG1GC in all jvm.config to prevent multiple GC declared error
2024-09-20 11:06:13 +02:00
Zoltan Haindrich 2eee470f6e
Support explain in decoupled planning and log native plan consistently with DruidHook (#17101)
* enables to use DruidHook for native plan logging
* qudiem tests doesn't necessarily need to run the query to get an explain - this helps during development as if there is a runtime issue it could still be explained in the test
2024-09-20 10:53:43 +02:00
Abhishek Radhakrishnan 635e418131
Support to parse numbers in text-based input formats (#17082)
Text-based input formats like csv and tsv currently parse inputs only as strings, following the RFC4180Parser spec).
To workaround this, the web-console and other tools need to further inspect the sample data returned to sample data returned by the Druid sampler API to parse them as numbers. 

This patch introduces a new optional config, tryParseNumbers, for the csv and tsv input formats. If enabled, any numbers present in the input will be parsed in the following manner -- long data type for integer types and double for floating-point numbers, and if parsing fails for whatever reason, the input is treated as a string. By default, this configuration is set to false, so numeric strings will be treated as strings.
2024-09-19 13:21:18 -07:00
Clint Wylie 4f137d2700
hard-code compaction tasks to use ARRAY for multi-value handling to preserve order (#17110) 2024-09-19 11:56:12 -07:00
Ben Smithgall 40572360c5
fix small README typo (#17114) 2024-09-19 10:01:42 -07:00
Gian Merlino 3d45f9829c
Use the whole frame when writing rows. (#17094)
* Use the whole frame when writing rows.

This patch makes the following adjustments to enable writing larger
single rows to frames:

1) RowBasedFrameWriter: Max out allocation size on the final doubling.
   i.e., if the final allocation "naturally" would be 1 MiB but the
   max frame size is 900 KiB, use 900 KiB rather than failing the 1 MiB
   allocation.

2) AppendableMemory: In reserveAdditional, release the last block if it
   is empty. This eliminates waste when a frame writer uses a
   successive-doubling approach to find the right allocation size.

3) ArenaMemoryAllocator: Reclaim memory from the last allocation when
   the last allocation is closed.

Prior to these changes, a single row could be much smaller than the
frame size and still fail to be added to the frame.

* Style.

* Fix test.
2024-09-19 00:42:03 -07:00
Sree Charan Manamala b9a4c73e52
Window Functions : Improve performance by comparing Strings in frame bytes without converting them (#17091) 2024-09-19 09:36:28 +02:00
Abhishek Agarwal 8d1e596740
PostJoinCursor should never advance without interruption (#17099) 2024-09-19 09:10:59 +02:00
Rishabh Singh 953fe11e31
gRPC query extension (#15982)
Revives #14024 and additionally supports,

Native queries
gRPC health check endpoint
This PR doesn't have the shaded module for packaging gRPC and Guava libraries since grpc-query module uses the same Guava version as that of Druid.

The response is gRPC-specific. It provides the result schema along with the results as a binary "blob". Results can be in CSV, JSON array lines or as an array of Protobuf objects. If using Protobuf, the corresponding class must be installed along with the gRPC query extension so it is available to the Broker at runtime.
2024-09-19 12:16:32 +05:30
Pranav d1bd6a8156
Update doc for allowedHeaders (#17045)
Update doc for allowedHeaders and make allowedHeaders more restrictive
2024-09-19 08:37:39 +05:30
Gian Merlino 2d2882cdfe
Add test for exceptions in FutureUtils.transformAsync. (#17106)
Adds an additional test case to FutureUtilsTest.
2024-09-18 16:08:47 -07:00
Gian Merlino ca0cb64ee8
MSQ: Fix calculation of suggested memory in WorkerMemoryParameters. (#17108)
The "suggested server memory" figure needs to take into account
maxConcurrentStages. The fix here does not affect the main memory
calculations, but it does affect the accuracy of error messages.
2024-09-18 16:08:35 -07:00
Abhishek Radhakrishnan 39723e5401
Update note about `sys.tasks` table (#17096)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2024-09-18 11:02:45 -07:00
Rishabh Singh 43d790fdb7
Normalize schema fingerprint for column permutations (#17044)
Parent issue: #14989

It is possible for the order of columns to vary across segments especially during realtime ingestion.
Since, the schema fingerprint is sensitive to column order this leads to creation of a large number of segment schema in the metadata database for essentially the same set of columns.

This is wasteful, this patch fixes this problem by computing schema fingerprint on lexicographically sorted columns. This would result in creation of a single schema in the metadata database with the first observed column order for a given signature.
2024-09-18 11:37:06 +05:30
Adarsh Sanjeev 2f50138af9
Modify DataSegmentProvider to also return DataSegment (#17021)
Currently, TaskDataSegmentProvider fetches the DataSegment from the Coordinator while loading the segment, but just discards it later. This PR refactors this to also return the DataSegment so that it can be used by workers without a separate fetch.
2024-09-18 11:20:20 +05:30
Zoltan Haindrich d84d53c017
Decoupled planning: improve join support (#17039)
There were some problematic cases

join branches are run with finalize=false instead of finalize=true like normal subqueries
this inconsistency is not good - but fixing it is a bigger thing
ensure that right hand sides of joins are always subqueries - or accessible globally
To achieve the above:

operand indexes were needed for the upstream reltree nodes in the generator
source unwrapping now takes the join situation into account as well
2024-09-18 08:56:36 +05:30
Rishabh Singh dd8c7de144
Reduce the number of ITs in CDS groups (#17059)
* Reduce the number of ITs in CDS groups

The two CDS IT groups cds-task-schema-publish-disabled & cds-coordinator-metadata-query-disabled
runs a lot of ITs serially, leading to increased CI runtime.

Reducing the number of ITs running in the two groups to,
ITAppendBatchIndexTest (append-ingestion)
ITIndexerTest (batch-index)
ITOverwriteBatchIndexTest (batch-index)
ITCompactionSparseColumnIndexTest (compaction)
ITCompactionTaskTest (compaction)
ITKafkaIndexingServiceDataFormatTest (kafka-data-format)

Both these groups will be removed once CDS is enabled by default.
2024-09-17 17:28:03 -07:00
Cece Mei 88c3c20ab6
Create a FilterBundle.Builder class and use it to construct FilterBundle. (#17055) 2024-09-17 15:59:33 -07:00
Edgar Melendrez 64a4d115c5
[Docs] adding admonition for div (#17093)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2024-09-17 13:54:49 -07:00
Gian Merlino c5968aa463
MSQ: Add QueryKitSpec to encapsulate QueryKit params. (#17077)
* MSQ: Add QueryKitSpec to encapsulate QueryKit params.

This patch introduces QueryKitSpec, an object that encapsulates the
parameters to makeQueryDefinition that are consistent from call to
call. This simplifies things because we avoid passing around all the
components individually.

This patch also splits "maxWorkerCount" into "maxLeafWorkerCount" and
"maxNonLeafWorkerCount", which apply to leaf stages (no other stages as
inputs) and nonleaf stages respectively.

Finally, this patch also rovides a way for ControllerContext to supply a
QueryKitSpec to its liking. It is expected that this will be used by
controllers of quick interactive queries to set maxNonLeafWorkerCount = 1,
which will generate fanning-in query plans.

* Fix javadoc.
2024-09-17 13:37:14 -07:00
Clint Wylie a93546d493
add VirtualColumns.findEquivalent and VirtualColumn.EquivalenceKey (#17084) 2024-09-17 13:17:44 -07:00
Katya Macedo 490211f2b1
Docs - update streaming ingestion terminology for Kafka and Kinesis (#17003) 2024-09-17 09:49:24 -07:00
Gian Merlino 46cbb33428
FrameChannelMerger: Fix incorrect behavior of finished(). (#17088)
Previously, the processor used "remainingChannels" to track the number of
non-null entries of currentFrame. Now, "remainingChannels" tracks the
number of channels that are unfinished.

The difference is subtle. In the previous code, when an input channel
was blocked upon exiting nextFrame(), the "currentFrames" entry would be
null, and therefore the "remainingChannels" variable would be decremented.
After the next await and call to populateCurrentFramesAndTournamentTree(),
"remainingChannels" would be incremented if the channel had become
unblocked after awaiting.

This means that finished(), which returned true if remainingChannels was
zero, would not be reliable if called between nextFrame() and the
next await + populateCurrentFramesAndTournamentTree().

This patch changes things such that finished() is always reliable. This
fixes a regression introduced in PR #16911, which added a call to
finished() that was, at that time, unsafe.
2024-09-17 08:35:54 -07:00
Gian Merlino 50503fe0ef
MSQ: Properly report errors that occur when starting up RunWorkOrder. (#17069)
* MSQ: Properly report errors that occur when starting up RunWorkOrder.

In #17046, an exception thrown by RunWorkOrder#startAsync would be ignored
and replaced with a generic CanceledFault. This patch fixes it by retaining
the original error.
2024-09-17 20:32:02 +05:30
Lasse Mammen 307b8e3357
feat: json_merge expression and sql function (#17081) 2024-09-17 18:27:34 +05:30
Gian Merlino 8af9b4729f
TableInputSpecSlicer changes to support running on Brokers. (#17074)
* TableInputSpecSlicer changes to support running on Brokers.

Changes:

1) Rename TableInputSpecSlicer to IndexerTableInputSpecSlicer, in anticipation
   of a new implementation being added for controllers running on Brokers.

2) Allow the context to use the WorkerManager to build the TableInputSpecSlicer,
   in anticipation of Brokers wanting to use this to assign segments to servers
   that are already serving those segments.

3) Remove unused DataSegmentTimelineView interface.

4) Add additional javadoc to DataSegmentProvider.

* Style.
2024-09-17 03:51:18 -07:00
Gian Merlino c56e23ec37
Remove workerId parameter from postWorkerError. (#17072)
* Remove workerId parameter from postWorkerError.

It was redundant to MSQErrorReport#getTaskId.

* Fix javadoc.
2024-09-17 01:37:46 -07:00
Gian Merlino 2e4d596d82
MSQ: Include worker context maps in WorkOrders. (#17076)
* MSQ: Include worker context maps in WorkOrders.

This provides a mechanism to send contexts to workers in long-lived,
shared JVMs that are not part of the task system.

* Style, coverage.
2024-09-17 01:37:21 -07:00
Sree Charan Manamala bb1c3c1749
Add serde for ColumnBasedRowsAndColumns to fix window queries without group by (#16658)
Register a Ser-De for RowsAndColumns so that the window operator query running on leaf operators would be transferred properly on the wire. Would fix the empty response given by window queries without group by on the native engine.
2024-09-17 06:44:40 +02:00
Laksh Singla bb487a4193
Support maxSubqueryBytes for window functions (#16800)
Window queries now acknowledge maxSubqueryBytes.
2024-09-17 10:06:24 +05:30
Victoria Lim 2e2f3cf66a
docs: Refresh docs for SQL input source (#17031)
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2024-09-16 15:52:37 -07:00
Gian Merlino 9696f0b37c
Remove close method on MSQWarningReportPublisher. (#17071)
It didn't do anything and also wasn't called.
2024-09-16 18:38:36 +05:30
Gian Merlino 4d8015578d
Remove unused WorkerManagerClient interface. (#17073) 2024-09-16 18:00:47 +05:30