Commit Graph

3268 Commits

Author SHA1 Message Date
Gian Merlino b9634a8613
SuperSorter: Don't set allDone if it's already set. (#17238)
This fixes a race where, if there is no output at all, setAllDoneIfPossible
could be called twice (once when the output partitions future resolves, and
once when the batcher finishes). If the calls happen in that order, it would
try to create nil output channels both times, resulting in a "Channel already set"
error.
2024-10-04 06:41:16 +05:30
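The double-call race described above can be illustrated with a set-once guard. This is a minimal sketch, not the actual patch: the class `AllDoneGuardSketch` and its counter are hypothetical, and the real SuperSorter may well use a lock and a plain flag rather than an `AtomicBoolean`.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the sorter's completion step; not Druid's actual class.
class AllDoneGuardSketch {
  private final AtomicBoolean allDone = new AtomicBoolean(false);
  private int outputChannelSetups = 0;

  // Returns true only for the single call that performs the completion work.
  boolean setAllDoneIfPossible() {
    // compareAndSet succeeds for exactly one caller, so a second trigger
    // returns early instead of trying to set the output channels again.
    if (!allDone.compareAndSet(false, true)) {
      return false;
    }
    outputChannelSetups++;
    return true;
  }

  int outputChannelSetups() {
    return outputChannelSetups;
  }

  public static void main(String[] args) {
    AllDoneGuardSketch guard = new AllDoneGuardSketch();
    guard.setAllDoneIfPossible(); // e.g. the output partitions future resolves
    guard.setAllDoneIfPossible(); // e.g. the batcher finishes afterwards
    System.out.println(guard.outputChannelSetups());
  }
}
```

Whichever trigger arrives second becomes a no-op, which is the behavior the commit title describes.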
Gian Merlino db7cc4634c
Dart: Smoother handling of stage early-exit. (#17228)
Stages can be instructed to exit before they finish, especially when a
downstream stage includes a "LIMIT". This patch has improvements related
to early-exiting stages.

Bug fix:

- WorkerStageKernel: Don't allow fail() to set an exception if the stage is
  already in a terminal state (FINISHED or FAILED). If fail() is called while
  in a terminal state, log the exception, then throw it away. If it's a
  cancellation exception, don't even log it. This fixes a bug where a stage
  that exited early could transition to FINISHED and then to FAILED, causing
  the overall query to fail.

Performance:

- DartWorkerManager previously sent stopWorker commands to workers
  even when "interrupt" was false. Now it only sends those commands when
  "interrupt" is true. The method javadoc already claimed this is what the
  method did, but the implementation did not match the javadoc. This reduces
  the number of RPCs by 1 per worker per query.

Quieter logging:

- In ReadableByteChunksFrameChannel, skip logging exception from setError if
  the channel has been closed. Channels are closed when readers are done with
  them, so at that point, we wouldn't be interested in the errors.

- In RunWorkOrder, skip calling notifyListener on failure of the main work,
  in the case when stop() has already been called. The stop() method will
  set its own error using CanceledFault. This enables callers to detect
  when a stage was canceled vs. failed for some other reason.

- In WorkerStageKernel, skip logging cancellation errors in fail(). This is
  made possible by the previous change in RunWorkOrder.
2024-10-03 20:09:02 +05:30
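The fail() rule from the bug fix above can be sketched as a small state machine. The names here (`StageKernelSketch`, `Phase`) are hypothetical and the real WorkerStageKernel has more phases; the sketch only assumes that FINISHED and FAILED are terminal and that cancellation surfaces as a `CancellationException`.

```java
import java.util.concurrent.CancellationException;

// Hypothetical, simplified model of a stage kernel; not Druid's WorkerStageKernel.
class StageKernelSketch {
  enum Phase {
    RUNNING, FINISHED, FAILED;

    boolean isTerminal() {
      return this == FINISHED || this == FAILED;
    }
  }

  private Phase phase = Phase.RUNNING;
  private Throwable exception = null;

  void finish() {
    if (!phase.isTerminal()) {
      phase = Phase.FINISHED;
    }
  }

  void fail(Throwable t) {
    if (phase.isTerminal()) {
      // Already done: never overwrite the outcome. Log and drop the
      // exception, staying silent for cancellations.
      if (!(t instanceof CancellationException)) {
        System.err.println("Ignoring failure in phase " + phase + ": " + t);
      }
      return;
    }
    exception = t;
    phase = Phase.FAILED;
  }

  Phase phase() {
    return phase;
  }

  Throwable exception() {
    return exception;
  }
}
```

With this guard, a stage that exits early and reaches FINISHED stays FINISHED even if a late fail() arrives.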
Zoltan Haindrich 65277b17a9
Decoupled planning: add support for unnest (#17177)
* adds support for `UNNEST` expressions
* introduces `LogicalUnnestRule` to transform a `Correlate` doing UNNEST into a `LogicalUnnest`
* `UnnestInputCleanupRule` could move the final unnested expr into the `LogicalUnnest` itself (usually it's an `mv_to_array` expression)
* enhanced source unwrapping to utilize `FilteredDataSource` if it looks right
2024-10-02 08:54:56 +02:00
Gian Merlino 878adff9aa
MSQ profile for Brokers and Historicals. (#17140)
This patch adds a profile of MSQ named "Dart" that runs on Brokers and
Historicals, and which is compatible with the standard SQL query API.
For more high-level description, and notes on future work, refer to #17139.

This patch contains the following changes, grouped into packages.

Controller (org.apache.druid.msq.dart.controller):

The controller runs on Brokers. Main classes are,

- DartSqlResource, which serves /druid/v2/sql/dart/.
- DartSqlEngine and DartQueryMaker, the entry points from SQL that actually
  run the MSQ controller code.
- DartControllerContext, which configures the MSQ controller.
- DartMessageRelays, which sets up relays (see "message relays" below) to read
  messages from workers' DartControllerClients.
- DartTableInputSpecSlicer, which assigns work based on a TimelineServerView.

Worker (org.apache.druid.msq.dart.worker)

The worker runs on Historicals. Main classes are,

- DartWorkerResource, which supplies the regular MSQ WorkerResource, plus
  Dart-specific APIs.
- DartWorkerRunner, which runs MSQ worker code.
- DartWorkerContext, which configures the MSQ worker.
- DartProcessingBuffersProvider, which provides processing buffers from
  sliced-up merge buffers.
- DartDataSegmentProvider, which provides segments from the Historical's
  local cache.

Message relays (org.apache.druid.messages):

To avoid the need for Historicals to contact Brokers during a query, which
would create opportunities for queries to get stuck, all connections are
opened from Broker to Historical. This is made possible by a message relay
system, where the relay server (worker) has an outbox of messages.

The relay client (controller) connects to the outbox and retrieves messages.
Code for this system lives in the "server" package to keep it separate from
the MSQ extension and make it easier to maintain. The worker-to-controller
ControllerClient is implemented using message relays.

Other changes:

- Controller: Added the method "hasWorker". Used by the ControllerMessageListener
  to notify the appropriate controllers when a worker fails.
- WorkerResource: No longer tries to respond more than once in the
  "httpGetChannelData" API. This comes up when a response due to resolved future
  is ready at about the same time as a timeout occurs.
- MSQTaskQueryMaker: Refactor to separate out some useful functions for reuse
  in DartQueryMaker.
- SqlEngine: Add "queryContext" to "resultTypeForSelect" and "resultTypeForInsert".
  This allows the DartSqlEngine to modify result format based on whether a "fullReport"
  context parameter is set.
- LimitedOutputStream: New utility class. Used when in "fullReport" mode.
- TimelineServerView: Add getDruidServerMetadata as a performance optimization.
- CliHistorical: Add SegmentWrangler, so it can query inline data, lookups, etc.
- ServiceLocation: Add "fromUri" method, relocating some code from ServiceClientImpl.
- FixedServiceLocator: New locator for a fixed set of service locations. Useful for
  URI locations.
2024-10-01 14:38:55 -07:00
Shivam Garg ab361747a8
Migrated commons-lang usages to commons-lang3 (#17156) 2024-09-28 10:28:11 +02:00
Clint Wylie f8a72b987a
read metadata in SimpleQueryableIndex if available to compute segment ordering (#17181) 2024-09-27 19:39:03 -07:00
Clint Wylie 157fe1bc1f
fix a mistake in CursorGranularizer to check doneness after advance (#17175)
Fixes a mistake introduced in #16533 that could cause CursorGranularizer to incorrectly try to get values from a selector after calling cursor.advance, because of a missing check for cursor.isDone.
2024-09-27 09:36:05 +05:30
Clint Wylie d77637344d
log.warn anytime a column is relying on ArrayIngestMode.MVD (#17164)
* log.warn anytime a column is relying on ArrayIngestMode.MVD
2024-09-26 13:44:37 +05:30
Cece Mei a2b011cdcd
Incorporate `estimatedComputeCost` into all `BitmapColumnIndex` classes. (#17125)
changes:
* filter index processing is now automatically ordered based on estimated 'cost', which is approximated based on how many expected bitmap operations are required to construct the bitmap used for the 'offset'
* cursorAutoArrangeFilters context flag now defaults to true, but can be set to false to disable cost based filter index sorting
2024-09-25 23:11:26 -07:00
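Cost-based ordering of filter indexes can be pictured as a sort on the estimated cost. The sketch below is illustrative only: `CostedIndex`, `arrange`, and the cost numbers are hypothetical, and it assumes cheaper indexes are evaluated first, which the commit message implies but does not state.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; not the BitmapColumnIndex API.
class FilterOrderingSketch {
  // Pairs a filter name with a made-up estimated cost.
  static final class CostedIndex {
    final String name;
    final int estimatedComputeCost;

    CostedIndex(String name, int estimatedComputeCost) {
      this.name = name;
      this.estimatedComputeCost = estimatedComputeCost;
    }
  }

  // Returns filter names in evaluation order. When autoArrange is false,
  // the original order is preserved (mirroring the context flag).
  static List<String> arrange(List<CostedIndex> indexes, boolean autoArrange) {
    List<CostedIndex> ordered = new ArrayList<>(indexes);
    if (autoArrange) {
      // List.sort is stable, so equal-cost filters keep their relative order.
      ordered.sort(Comparator.comparingInt((CostedIndex i) -> i.estimatedComputeCost));
    }
    List<String> names = new ArrayList<>();
    for (CostedIndex i : ordered) {
      names.add(i.name);
    }
    return names;
  }
}
```

Passing `false` for `autoArrange` corresponds to setting cursorAutoArrangeFilters to false in the query context.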
Clint Wylie 1b5b61ef7f
add multi-value string object vector matcher and expression vector object selectors (#17162) 2024-09-25 22:57:29 -07:00
Clint Wylie 0ec7eb3ae5
use CastToObjectVectorProcessor for cast to string (#17148) 2024-09-24 15:21:33 -07:00
Abhishek Radhakrishnan 83299e9882
Miscellaneous cleanup in the supervisor API flow. (#17144)
Extracting a few miscellaneous non-functional changes from the batch supervisor branch:

- Replace anonymous inner classes with lambda expressions in the SQL supervisor manager layer
- Add explicit @Nullable annotations in DynamicConfigProviderUtils to make IDE happy
- Small variable renames (copy-paste error perhaps) and fix typos
- Add table name for this exception message: Delete the supervisor from the table[%s] in the database...
- Prefer CollectionUtils.isEmptyOrNull() over list == null || list.size() == 0. We can change the Precondition checks to throwing DruidException separately for a batch of APIs at a time.
2024-09-24 13:06:23 -07:00
George Shiqi Wu d1bfabbf4d
inter-Extension dependency support (#16973)
* update docs for kafka lookup extension to specify correct extension ordering

* fix first line

* test with extension dependencies

* save work on dependency management

* working dependency graph

* working pull

* fix style

* fix style

* remove name

* load extension dependencies recursively

* generate dependencies on classloader creation

* add check for circular dependencies

* fix style

* revert style changes

* remove mutable class loader

* clean up class hierarchy

* extensions loader test working

* add unit tests

* pr comments

* fix unit tests
2024-09-24 14:17:33 -04:00
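Recursive loading with a circular-dependency check, as listed in the steps above, is commonly done with a depth-first traversal. The following is a sketch under that assumption; `ExtensionDependencySketch` and the extension names in the usage note are hypothetical and do not reflect Druid's actual extension loader.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of dependency-ordered loading with cycle detection.
class ExtensionDependencySketch {
  // Returns extensions in an order where each one appears after its dependencies.
  static List<String> loadOrder(Map<String, List<String>> dependencies) {
    List<String> order = new ArrayList<>();
    Set<String> inProgress = new HashSet<>();
    Set<String> done = new HashSet<>();
    for (String extension : dependencies.keySet()) {
      visit(extension, dependencies, inProgress, done, order);
    }
    return order;
  }

  private static void visit(
      String extension,
      Map<String, List<String>> dependencies,
      Set<String> inProgress,
      Set<String> done,
      List<String> order
  ) {
    if (done.contains(extension)) {
      return;
    }
    // Seeing an extension that is still being visited means we looped back to it.
    if (!inProgress.add(extension)) {
      throw new IllegalStateException("Circular dependency involving " + extension);
    }
    for (String dep : dependencies.getOrDefault(extension, Collections.emptyList())) {
      visit(dep, dependencies, inProgress, done, order);
    }
    inProgress.remove(extension);
    done.add(extension);
    order.add(extension);
  }
}
```

For example, if a hypothetical "ext-a" depends on "ext-b", `loadOrder` places "ext-b" first; if "ext-b" also depended on "ext-a", the call would throw.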
Clint Wylie 77a362c555
various fixes and improvements to vectorization fallback (#17098)
changes:
* add `ApplyFunction` support to vectorization fallback, allowing many of the remaining expressions to be vectorized
* add `CastToObjectVectorProcessor` so that vector engine can correctly cast any type
* add support for array and complex vector constants
* reduce number of cases which can block vectorization in expression planner to be unknown inputs (such as unknown multi-valuedness)
* fix array constructor expression, apply map expression to make actual evaluated type match the output type inference
* fix bug in array_contains where something like array_contains([null], 'hello') would return true if the array was a numeric array since the non-null string value would cast to a null numeric
* fix isNull/isNotNull to correctly handle any type of input argument
2024-09-24 04:29:08 -07:00
Adithya Chakilam 8eaac2c051
cgroup monitors: Add mem/disk/cpu usage metrics for V2 (#16905)
* cgroup monitors: Add mem/disk/cpu usage metrics for V2

* intellij inspection

* docs and checks

* fix-dos

* add comments

* comments
2024-09-23 20:32:01 -07:00
Akshat Jain 40414cfe78
MSQ window functions: Reject MVDs during window processing (#17036)
* MSQ window functions: Reject MVDs during window processing

* Remove parameterization from MSQWindowTest
2024-09-23 11:39:35 +05:30
Abhishek Radhakrishnan 635e418131
Support to parse numbers in text-based input formats (#17082)
Text-based input formats like csv and tsv currently parse inputs only as strings, following the RFC4180Parser spec.
To work around this, the web-console and other tools need to further inspect the sample data returned by the Druid sampler API to parse them as numbers.

This patch introduces a new optional config, tryParseNumbers, for the csv and tsv input formats. If enabled, any numbers present in the input will be parsed in the following manner -- long data type for integer types and double for floating-point numbers, and if parsing fails for whatever reason, the input is treated as a string. By default, this configuration is set to false, so numeric strings will be treated as strings.
2024-09-19 13:21:18 -07:00
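The parsing order described for tryParseNumbers (long first, then double, else keep the string) can be sketched as follows. `NumberParseSketch.tryParse` is a hypothetical helper built on the JDK parsers; Druid's own implementation may accept or reject different inputs (for instance, `Double.parseDouble` also accepts values such as "NaN").

```java
// Hypothetical illustration of long-then-double-then-string parsing.
class NumberParseSketch {
  // Returns a Long for integer text, a Double for other numeric text,
  // and the original string when neither parse succeeds.
  static Object tryParse(String input) {
    if (input == null) {
      return null;
    }
    try {
      return Long.parseLong(input);
    } catch (NumberFormatException e) {
      // Not an integer; fall through to the floating-point attempt.
    }
    try {
      return Double.parseDouble(input);
    } catch (NumberFormatException e) {
      // Not a number at all; treat the input as a plain string.
    }
    return input;
  }

  public static void main(String[] args) {
    System.out.println(tryParse("42").getClass().getSimpleName());   // Long
    System.out.println(tryParse("3.5").getClass().getSimpleName());  // Double
    System.out.println(tryParse("abc").getClass().getSimpleName());  // String
  }
}
```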
Gian Merlino 3d45f9829c
Use the whole frame when writing rows. (#17094)
* Use the whole frame when writing rows.

This patch makes the following adjustments to enable writing larger
single rows to frames:

1) RowBasedFrameWriter: Max out allocation size on the final doubling.
   i.e., if the final allocation "naturally" would be 1 MiB but the
   max frame size is 900 KiB, use 900 KiB rather than failing the 1 MiB
   allocation.

2) AppendableMemory: In reserveAdditional, release the last block if it
   is empty. This eliminates waste when a frame writer uses a
   successive-doubling approach to find the right allocation size.

3) ArenaMemoryAllocator: Reclaim memory from the last allocation when
   the last allocation is closed.

Prior to these changes, a single row could be much smaller than the
frame size and still fail to be added to the frame.

* Style.

* Fix test.
2024-09-19 00:42:03 -07:00
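Point 1 above, maxing out the allocation on the final doubling, amounts to clamping the doubled size to the frame limit. A minimal sketch, assuming sizes in bytes and a hypothetical helper `nextAllocationSize` (not the RowBasedFrameWriter API):

```java
// Hypothetical illustration of successive doubling capped at a maximum size.
class AllocationSizeSketch {
  // Returns the next allocation size to try, or -1 if the previous attempt
  // already used the maximum and there is no room left to grow.
  static long nextAllocationSize(long previousSize, long maxSize) {
    if (previousSize >= maxSize) {
      return -1;
    }
    // Double as usual, but never ask for more than the frame can hold.
    return Math.min(previousSize * 2, maxSize);
  }

  public static void main(String[] args) {
    long kib = 1024;
    // Doubling 512 KiB would "naturally" give 1 MiB; with a 900 KiB cap,
    // the final attempt uses 900 KiB instead of failing.
    System.out.println(nextAllocationSize(512 * kib, 900 * kib) / kib); // 900
  }
}
```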
Sree Charan Manamala b9a4c73e52
Window Functions : Improve performance by comparing Strings in frame bytes without converting them (#17091) 2024-09-19 09:36:28 +02:00
Abhishek Agarwal 8d1e596740
PostJoinCursor should never advance without interruption (#17099) 2024-09-19 09:10:59 +02:00
Pranav d1bd6a8156
Update doc for allowedHeaders (#17045)
Update doc for allowedHeaders and make allowedHeaders more restrictive
2024-09-19 08:37:39 +05:30
Gian Merlino 2d2882cdfe
Add test for exceptions in FutureUtils.transformAsync. (#17106)
Adds an additional test case to FutureUtilsTest.
2024-09-18 16:08:47 -07:00
Adarsh Sanjeev 2f50138af9
Modify DataSegmentProvider to also return DataSegment (#17021)
Currently, TaskDataSegmentProvider fetches the DataSegment from the Coordinator while loading the segment, but just discards it later. This PR refactors this to also return the DataSegment so that it can be used by workers without a separate fetch.
2024-09-18 11:20:20 +05:30
Cece Mei 88c3c20ab6
Create a FilterBundle.Builder class and use it to construct FilterBundle. (#17055) 2024-09-17 15:59:33 -07:00
Clint Wylie a93546d493
add VirtualColumns.findEquivalent and VirtualColumn.EquivalenceKey (#17084) 2024-09-17 13:17:44 -07:00
Gian Merlino 46cbb33428
FrameChannelMerger: Fix incorrect behavior of finished(). (#17088)
Previously, the processor used "remainingChannels" to track the number of
non-null entries of currentFrame. Now, "remainingChannels" tracks the
number of channels that are unfinished.

The difference is subtle. In the previous code, when an input channel
was blocked upon exiting nextFrame(), the "currentFrames" entry would be
null, and therefore the "remainingChannels" variable would be decremented.
After the next await and call to populateCurrentFramesAndTournamentTree(),
"remainingChannels" would be incremented if the channel had become
unblocked after awaiting.

This means that finished(), which returned true if remainingChannels was
zero, would not be reliable if called between nextFrame() and the
next await + populateCurrentFramesAndTournamentTree().

This patch changes things such that finished() is always reliable. This
fixes a regression introduced in PR #16911, which added a call to
finished() that was, at that time, unsafe.
2024-09-17 08:35:54 -07:00
Lasse Mammen 307b8e3357
feat: json_merge expression and sql function (#17081) 2024-09-17 18:27:34 +05:30
Sree Charan Manamala bb1c3c1749
Add serde for ColumnBasedRowsAndColumns to fix window queries without group by (#16658)
Register a serde for RowsAndColumns so that the results of a window operator query running on leaf operators are transferred properly over the wire. This fixes the empty response returned by window queries without group by on the native engine.
2024-09-17 06:44:40 +02:00
Laksh Singla bb487a4193
Support maxSubqueryBytes for window functions (#16800)
Window queries now acknowledge maxSubqueryBytes.
2024-09-17 10:06:24 +05:30
Gian Merlino 5b7fb5fbca
Speed up FrameFileTest, SuperSorterTest. (#17068)
* Speed up FrameFileTest, SuperSorterTest.

These are two heavily parameterized tests that, together, account for
about 60% of runtime in the test suite.

FrameFileTest changes:

1) Cache frame files in a static, rather than building the frame file
   for each parameterization of the test.

2) Adjust TestArrayCursorFactory to cache the signature, rather than
   re-creating it on each call to getColumnCapabilities.

SuperSorterTest changes:

1) Dramatically reduce the number of tests that run with
   "maxRowsPerFrame" = 1. These are particularly slow due to writing so
   many small files. Some still run, since it's useful to test edge cases,
   but far fewer than before.

2) Reduce the "maxActiveProcessors" axis of the test from [1, 2, 4] to
   [1, 3]. The aim is to reduce the number of cases while still getting
   good coverage of the feature.

3) Reduce the "maxChannelsPerProcessor" axis of the test from [2, 3, 8]
   to [2, 7]. The aim is to reduce the number of cases while still getting
   good coverage of the feature.

4) Use in-memory input channels rather than file channels.

5) Defer formatting of assertion failure messages until they are needed.

6) Cache the cursor factory and its signature in a static.

7) Cache sorted test rows (used for verification) in a static.

* It helps to include the file.

* Style.
2024-09-15 17:03:18 -07:00
Clint Wylie 73a644258d
abstract `IncrementalIndex` cursor stuff to prepare for using different "views" of the data based on the cursor build spec (#17064)
* abstract `IncrementalIndex` cursor stuff to prepare to allow for possibility of using different "views" of the data based on the cursor build spec
changes:
* introduce `IncrementalIndexRowSelector` interface to capture how `IncrementalIndexCursor` and `IncrementalIndexColumnSelectorFactory` read data
* `IncrementalIndex` implements `IncrementalIndexRowSelector`
* move `FactsHolder` interface to separate file
* other minor refactorings
2024-09-15 16:45:51 -07:00
Gian Merlino 4dc5942dab
BaseWorkerClientImpl: Don't attempt to recover from a closed channel. (#17052)
* BaseWorkerClientImpl: Don't attempt to recover from a closed channel.

This patch introduces an exception type "ChannelClosedForWritesException",
which allows the BaseWorkerClientImpl to avoid retrying when the local
channel has been closed. This can happen in cases of cancellation.

* Add some test coverage.

* wip

* Add test coverage.

* Style.
2024-09-15 02:10:58 -07:00
Gian Merlino 6fac267f17
MSQ: Improved worker cancellation. (#17046)
* MSQ: Improved worker cancellation.

Four changes:

1) FrameProcessorExecutor now requires that cancellationIds be registered
   with "registerCancellationId" prior to being used in "runFully" or "runAllFully".

2) FrameProcessorExecutor gains an "asExecutor" method, which allows that
   executor to be used as an executor for future callbacks in such a way
   that respects cancellationId.

3) RunWorkOrder gains a "stop" method, which cancels the current
   cancellationId and closes the current FrameContext. It blocks until
   both operations are complete.

4) Fixes a bug in RunAllFullyWidget where "processorManager.result()" was
   called outside "runAllFullyLock", which could cause it to be called
   out-of-order with "cleanup()" in case of cancellation or other error.

Together, these changes help ensure cancellation does not have races.
Once "cancel" is called for a given cancellationId, all existing processors
and running callbacks are canceled and exit in an orderly manner. Future
processors and callbacks with the same cancellationId are rejected
before being executed.

* Fix test.

* Use execute, which doesn't return, to avoid errorprone complaints.

* Fix some style stuff.

* Further enhancements.

* Fix style.
2024-09-15 01:22:28 -07:00
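The register-then-run rule for cancellationIds can be sketched with a small registry. Everything here is hypothetical (`CancellationSketch`, `runIfActive`), and running the task while holding the lock is a simplification; the real FrameProcessorExecutor schedules work asynchronously.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration; not the FrameProcessorExecutor API.
class CancellationSketch {
  private final Set<String> activeIds = new HashSet<>();

  synchronized void registerCancellationId(String cancellationId) {
    activeIds.add(cancellationId);
  }

  // After cancel, later submissions under the same id are rejected.
  synchronized void cancel(String cancellationId) {
    activeIds.remove(cancellationId);
  }

  // Runs the task only if the id was registered and has not been canceled.
  // Returns false when the task is rejected before being executed.
  synchronized boolean runIfActive(String cancellationId, Runnable task) {
    if (!activeIds.contains(cancellationId)) {
      return false;
    }
    task.run();
    return true;
  }
}
```

Because registration, cancellation, and the run check share one lock, a task cannot slip in between a cancel call and its effect, which is the kind of race the commit aims to rule out.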
Gian Merlino fd6706cd6a
MSQ: Rework memory management. (#17057)
* MSQ: Rework memory management.

This patch reworks memory management to better support multi-threaded
workers running in shared JVMs. There are two main changes.

First, processing buffers and threads are moved from a per-JVM model to
a per-worker model. This enables queries to hold processing buffers
without blocking other concurrently-running queries. Changes:

- Introduce ProcessingBuffersSet and ProcessingBuffers to hold the
  per-worker and per-work-order processing buffers (respectively). On Peons,
  this is the JVM-wide processing pool. On Indexers, this is a per-worker
  pool of on-heap buffers. (This change fixes a bug on Indexers where
  excessive processing buffers could be used if MSQ tasks ran concurrently
  with realtime tasks.)

- Add "bufferPool" argument to GroupingEngine#process so a per-worker pool
  can be passed in.

- Add "druid.msq.task.memory.maxThreads" property, which controls the
  maximum number of processing threads to use per task. This allows usage of
  multiple processing buffers per task if admins desire.

- IndexerWorkerContext acquires processingBuffers when creating the FrameContext
  for a work order, and releases them when closing the FrameContext.

- Add "usesProcessingBuffers()" to FrameProcessorFactory so workers know
  how many sets of processing buffers are needed to run a given query.

Second, adjustments to how WorkerMemoryParameters slices up bundles, to
favor more memory for sorting and segment generation. Changes:

- Instead of using same-sized bundles for processing and for sorting,
  workers now use minimally-sized processing bundles (just enough to read
  inputs plus a little overhead). The rest is devoted to broadcast data
  buffering, sorting, and segment-building.

- Segment-building is now limited to 1 concurrent segment per work order.
  This allows each segment-building action to use more memory. Note that
  segment-building is internally multi-threaded to a degree. (Build and
  persist can run concurrently.)

- Simplify frame size calculations by removing the distinction between
  "standard" and "large" frames. The new default frame size is the same
  as the old "standard" frames, 1 MB. The original goal of the large
  frames was to reduce the number of temporary files during sorting, but
  I think we can achieve the same thing by simply merging a larger number
  of standard frames at once.

- Remove the small worker adjustment that was added in #14117 to account
  for an extra frame involved in writing to durable storage. Instead,
  account for the extra frame whenever we are actually using durable storage.

- Cap super-sorter parallelism using the number of output partitions, rather
  than using a hard coded cap at 4. Note that in practice, so far, this cap
  has not been relevant for tasks because they have only been using a single
  processing thread anyway.

* Remove unused import.

* Fix errorprone annotation.

* Fixes for javadocs and inspections.

* Additional test coverage.

* Fix test.
2024-09-14 15:35:21 -07:00
Clint Wylie 28ec962a06
add CursorHolder.isPreAggregated method to allow cursors on pre-aggregated data (#17058)
changes:
* CursorHolder.isPreAggregated method indicates that a cursor has pre-aggregated data for all AggregatorFactory specified in a CursorBuildSpec. If true, engines should rewrite the query to use AggregatorFactory.getCombiningAggreggator, and column selector factories will provide selectors with the aggregator intermediate type for the aggregator factory name
* Added groupby, timeseries, and topN support for CursorHolder.isPreAggregated
* Added synthetic test since no CursorHolder implementations support isPreAggregated at this point in time
2024-09-13 12:52:35 -07:00
Adithya Chakilam 6ef8d5d8e1
OshiSysMonitor: Add ability to skip emitting metrics (#16972)
* OshiSysMonitor: Add ability to skip emitting metrics

* comments

* static checks

* remove oshi
2024-09-12 11:32:31 -04:00
Laksh Singla d3392a23ce
Cancel the group by processing tasks if the merging runner gets scheduled post the query timeout (#17037)
If the GroupByMergingQueryRunner gets scheduled after the query timeout, it fails to clean up the processing tasks that have been scheduled. This can lead to unnecessary processing being done for the tasks whose results won't get consumed.
2024-09-12 15:10:27 +05:30
Pranav a95397e712
Allow request headers in HttpInputSource in native and MSQ Ingestion (#16974)
Adds support for request headers in the HTTP input source. Additional headers can now be passed as JSON in both native and MSQ ingestion.
2024-09-12 11:18:44 +05:30
Sree Charan Manamala c7c3307e61
Fix String Frame Readers to read String Arrays correctly (#16885)
While writing to a frame, String arrays are written by setting the multivalue byte.
But while reading, it was hardcoded to false.
2024-09-10 14:20:54 +05:30
Laksh Singla 72fbaf2e56
Non querying tasks shouldn't use processing buffers / merge buffers (#16887)
Tasks that do not support querying or query processing i.e. supportsQueries = false do not require processing threads, processing buffers, and merge buffers.
2024-09-10 11:36:36 +05:30
Abhishek Agarwal 78775ad398
Prepare master for 32.0.0 release (#17022) 2024-09-10 11:01:20 +05:30
Clint Wylie f57cd6f7af
transition away from StorageAdapter (#16985)
* transition away from StorageAdapter
changes:
* CursorHolderFactory has been renamed to CursorFactory and moved off of StorageAdapter, instead fetched directly from the segment via 'asCursorFactory'. The previous deprecated CursorFactory interface has been merged into StorageAdapter
* StorageAdapter is no longer used by any engines or tests and has been marked as deprecated with default implementations of all methods that throw exceptions indicating the new methods to call instead
* StorageAdapter methods not covered by CursorFactory (CursorHolderFactory prior to this change) have been moved into interfaces which are retrieved by Segment.as, the primary classes are the previously existing Metadata, as well as new interfaces PhysicalSegmentInspector and TopNOptimizationInspector
* added UnnestSegment and FilteredSegment that extend WrappedSegmentReference since their StorageAdapter implementations were previously provided by WrappedSegmentReference
* added PhysicalSegmentInspector which covers some of the previous StorageAdapter functionality which was primarily used for segment metadata queries and other metadata uses, and is implemented for QueryableIndexSegment and IncrementalIndexSegment
* added TopNOptimizationInspector to cover the oddly specific StorageAdapter.hasBuiltInFilters implementation, which is implemented for HashJoinSegment, UnnestSegment, and FilteredSegment
* Updated all engines and tests to no longer use StorageAdapter
2024-09-09 14:55:29 -07:00
Sree Charan Manamala 51fe3c08ab
Window Functions : Reject MVDs during window processing (#17002)
This commit aims to reject MVDs in window processing as we do not support them.
Prior to this commit, a query running a window aggregate partitioned by an MVD column would fail with a ClassCastException.
2024-09-09 12:07:54 +05:30
Clint Wylie b0f36c1b89
fix bug with CastOperatorConversion with types which cannot be mapped to native druid types (#17011) 2024-09-06 17:07:32 -07:00
Gian Merlino 175636b28f
Frame writers: Coerce numeric and array types in certain cases. (#16994)
This patch adds "TypeCastSelectors", which is used when writing frames to
perform two coercions:

- When a numeric type is desired and the underlying type is non-numeric or
  unknown, the underlying selector is wrapped, "getObject" is called and the
  result is coerced using "ExprEval.ofType". This differs from the prior
  behavior where the primitive methods like "getLong", "getDouble", etc, would
  be called directly. This fixes an issue where a column would be read as
  all-zeroes when its SQL type is numeric and its physical type is string, which
  can happen when evolving a column's type from string to number.

-  When an array type is desired, the underlying selector is wrapped,
   "getObject" is called, and the result is coerced to Object[]. This coercion
   replaces some earlier logic from #15917.
2024-09-05 17:20:00 -07:00
Kashif Faraz ba6f804f48
Fix compaction status API response (#17006)
Description:
#16768 introduces new compaction APIs on the Overlord `/compact/status` and `/compact/progress`.
But the corresponding `OverlordClient` methods do not return an object compatible with the actual
endpoints defined in `OverlordCompactionResource`.

This patch ensures that the objects are compatible.

Changes:
- Add `CompactionStatusResponse` and `CompactionProgressResponse`
- Use these as the return type in `OverlordClient` methods and as the response entity in `OverlordCompactionResource`
- Add `SupervisorCleanupModule` bound on the Coordinator to perform cleanup of supervisors.
Without this module, Coordinator cannot deserialize compaction supervisors.
2024-09-05 23:22:01 +05:30
Clint Wylie 57bf053dc9
remove compiler warnings about unqualified calls to yield() (#16995) 2024-09-03 20:04:30 -07:00
Gian Merlino 57c4b552d9
Fix logical merge conflict in SuperSorterTest. (#16993)
Logical merge conflict between #16911 and #16914.
2024-09-03 16:14:59 -04:00
Gian Merlino 786c959e9e
MSQ: Add limitHint to global-sort shuffles. (#16911)
* MSQ: Add limitHint to global-sort shuffles.

This allows pushing down limits into the SuperSorter.

* Test fixes.

* Add limitSpec to ScanQueryKit. Fix SuperSorter tracking.
2024-09-03 09:05:29 -07:00
Sree Charan Manamala 619d8ef964
Window Functions : Numeric Arrays Frame Column Writers - fix class cast exception (#16983)
Fix ClassCastException in ArrayFrameColumnWriters
2024-09-03 11:44:52 +05:30