Previously, the processor used "remainingChannels" to track the number of
non-null entries of "currentFrames". Now, "remainingChannels" tracks the
number of channels that are unfinished.
The difference is subtle. In the previous code, when an input channel
was blocked upon exiting nextFrame(), the "currentFrames" entry would be
null, and therefore the "remainingChannels" variable would be decremented.
After the next await and call to populateCurrentFramesAndTournamentTree(),
"remainingChannels" would be incremented if the channel had become
unblocked after awaiting.
This means that finished(), which returned true if remainingChannels was
zero, would not be reliable if called between nextFrame() and the
next await + populateCurrentFramesAndTournamentTree().
This patch changes things such that finished() is always reliable. This
fixes a regression introduced in PR #16911, which added a call to
finished() that was, at that time, unsafe.
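The distinction can be illustrated with a minimal sketch. The class and
method names below are hypothetical and are not the processor's actual code;
the sketch only contrasts counting non-null frames with counting unfinished
channels.

```java
/** Hypothetical illustration only; not the actual processor code. */
final class ChannelTrackingSketch
{
  // Entry is null when no frame is loaded (channel blocked or finished).
  private final Object[] currentFrames;
  // Entry is true once the channel has been fully read.
  private final boolean[] channelFinished;

  ChannelTrackingSketch(int numChannels)
  {
    this.currentFrames = new Object[numChannels];
    this.channelFinished = new boolean[numChannels];
  }

  void setFrame(int channel, Object frame)
  {
    currentFrames[channel] = frame;
  }

  void markFinished(int channel)
  {
    currentFrames[channel] = null;
    channelFinished[channel] = true;
  }

  /** Old approach: a blocked channel is indistinguishable from a finished one. */
  boolean finishedByNonNullFrames()
  {
    for (Object frame : currentFrames) {
      if (frame != null) {
        return false;
      }
    }
    return true;
  }

  /** New approach: only channels that are actually done count as finished. */
  boolean finishedByUnfinishedChannels()
  {
    for (boolean finished : channelFinished) {
      if (!finished) {
        return false;
      }
    }
    return true;
  }
}
```

With one channel finished and another merely blocked, the first check
reports completion prematurely while the second does not.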
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
These are two heavily parameterized tests that, together, account for
about 60% of runtime in the test suite.
FrameFileTest changes:
1) Cache frame files in a static, rather than building the frame file
for each parameterization of the test.
2) Adjust TestArrayCursorFactory to cache the signature, rather than
re-creating it on each call to getColumnCapabilities.
SuperSorterTest changes:
1) Dramatically reduce the number of tests that run with
"maxRowsPerFrame" = 1. These are particularly slow due to writing so
many small files. Some still run, since it's useful to test edge cases,
but far fewer than before.
2) Reduce the "maxActiveProcessors" axis of the test from [1, 2, 4] to
[1, 3]. The aim is to reduce the number of cases while still getting
good coverage of the feature.
3) Reduce the "maxChannelsPerProcessor" axis of the test from [2, 3, 8]
to [2, 7]. The aim is to reduce the number of cases while still getting
good coverage of the feature.
4) Use in-memory input channels rather than file channels.
5) Defer formatting of assertion failure messages until they are needed.
6) Cache the cursor factory and its signature in a static.
7) Cache sorted test rows (used for verification) in a static.
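Two of the techniques above (caching expensive fixtures in a static, and
deferring assertion message formatting) can be sketched as follows. This is a
generic illustration with hypothetical names, not the code from FrameFileTest
or SuperSorterTest.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical illustration only; names do not come from the test suite. */
final class TestFixtureCacheSketch
{
  // Shared across all parameterizations, so each fixture is built once.
  private static final Map<String, byte[]> CACHE = new ConcurrentHashMap<>();

  // Counts how often the expensive build actually runs.
  static final AtomicInteger BUILD_COUNT = new AtomicInteger();

  static byte[] getOrBuild(String key)
  {
    return CACHE.computeIfAbsent(key, k -> {
      BUILD_COUNT.incrementAndGet();
      return expensiveBuild(k);
    });
  }

  // Stand-in for something costly, such as writing a frame file.
  private static byte[] expensiveBuild(String key)
  {
    return key.getBytes(StandardCharsets.UTF_8);
  }

  // The message is only formatted when the assertion fails.
  static void assertTrue(boolean condition, Supplier<String> message)
  {
    if (!condition) {
      throw new AssertionError(message.get());
    }
  }
}
```

Repeated calls to getOrBuild with the same key return the same cached
instance, and the message supplier is never invoked for passing assertions.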
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
changes:
* filter index processing is now automatically ordered based on estimated 'cost', which is approximated based on how many expected bitmap operations are required to construct the bitmap used for the 'offset'
* cursorAutoArrangeFilters context flag now defaults to true, but can be set to false to disable cost based filter index sorting
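The ordering idea can be sketched as a sort on an estimated cost. The class,
field, and method names below are hypothetical, and the cost values are
supplied by the caller rather than computed the way Druid computes them.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration only; not Druid's filter planning code. */
final class FilterCostOrderingSketch
{
  static final class FilterEstimate
  {
    final String name;
    // Assumed stand-in for the expected number of bitmap operations.
    final int estimatedBitmapOps;

    FilterEstimate(String name, int estimatedBitmapOps)
    {
      this.name = name;
      this.estimatedBitmapOps = estimatedBitmapOps;
    }
  }

  /**
   * Returns the filters cheapest-first when autoArrange is true, or in
   * their original order when it is false (mirroring the context flag).
   */
  static List<FilterEstimate> arrange(List<FilterEstimate> filters, boolean autoArrange)
  {
    List<FilterEstimate> result = new ArrayList<>(filters);
    if (autoArrange) {
      result.sort(Comparator.comparingInt((FilterEstimate f) -> f.estimatedBitmapOps));
    }
    return result;
  }
}
```

Passing false for autoArrange corresponds to setting cursorAutoArrangeFilters
to false, which leaves the filters in the order they were written.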
Fixes a mistake introduced in #16533, where a missing check for cursor.isDone could cause CursorGranularizer to incorrectly try to get values from a selector after calling cursor.advance.
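The pattern being restored looks roughly like the sketch below. ArrayCursor is
a hypothetical minimal cursor, not Druid's Cursor interface; it only shows why
the isDone check must come between advancing and reading.

```java
/** Hypothetical illustration only; not CursorGranularizer itself. */
final class CursorAdvanceSketch
{
  static final class ArrayCursor
  {
    private final long[] values;
    private int position = 0;

    ArrayCursor(long... values)
    {
      this.values = values;
    }

    void advance()
    {
      position++;
    }

    boolean isDone()
    {
      return position >= values.length;
    }

    // Reading past the end throws ArrayIndexOutOfBoundsException.
    long getLong()
    {
      return values[position];
    }
  }

  /** Advances, then reads only if the cursor still points at a row. */
  static Long advanceAndPeek(ArrayCursor cursor)
  {
    cursor.advance();
    if (cursor.isDone()) {
      return null; // nothing left to read
    }
    return cursor.getLong();
  }
}
```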
The existing tests are moved into a "WithMaximalBuffering" subclass,
and a new "WithMinimalBuffering" subclass is added to test cases
where only a single frame is buffered.
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
Makes sketch encoding in MSQ configurable by the user, so that the sketch encoding method can be set for a specific query.
The default is octet stream encoding.
Description
-----------
Coordinator logs are fairly noisy and don't give much useful information (see example below).
Even when the Coordinator misbehaves, these logs are not very useful.
Main changes
------------
- Add API `GET /druid/coordinator/v1/duties` that returns a status list of all duty groups currently running on the Coordinator
- Emit metrics `segment/poll/time`, `segment/pollWithSchema/time`, `segment/buildSnapshot/time`
- Remove redundant logs that indicate normal operation of well-tested aspects of the Coordinator
Refactors
---------
- Move some logic from `DutiesRunnable` to `CoordinatorDutyGroup`
- Move stats collection from `CollectSegmentAndServerStats` to `PrepareBalancerAndLoadQueues`
- Minor cleanup of class `DruidCoordinator`
- Clean up class `DruidCoordinatorRuntimeParams`
- Remove field `coordinatorStartTime`. Maintain start time in `MarkOvershadowedSegmentsAsUnused` instead.
- Remove field `MetadataRuleManager`. Pass supplier to constructor of applicable duties instead.
  - Make `usedSegmentsNewestFirst` and `datasourcesSnapshot` non-nullable, as they are always required.
#16768 added the functionality to run compaction as a supervisor on the overlord.
This patch builds on that to restrict the MSQ engine to supervisor-mode compaction only.
With these changes, users can no longer add MSQ engine as part of datasource compaction config,
or as the default cluster-level compaction engine, on the Coordinator.
The patch also adds an Overlord runtime property `druid.supervisor.compaction.engine=<msq/native>`
to specify the default engine for compaction supervisors.
Since these updates require major changes to existing MSQ compaction integration tests,
this patch disables MSQ-specific compaction integration tests -- they will be taken up in a follow-up PR.
Key changed/added classes in this patch:
* CompactionSupervisor
* CompactionSupervisorSpec
* CoordinatorCompactionConfigsResource
* OverlordCompactionScheduler
Co-authored-by: Vishesh Garg <gargvishesh@gmail.com>
changes:
* add `ApplyFunction` support to vectorization fallback, allowing many of the remaining expressions to be vectorized
* add `CastToObjectVectorProcessor` so that vector engine can correctly cast any type
* add support for array and complex vector constants
* reduce the cases that can block vectorization in the expression planner to only unknown inputs (such as unknown multi-valuedness)
* fix array constructor expression, apply map expression to make actual evaluated type match the output type inference
* fix bug in array_contains where something like array_contains([null], 'hello') would return true if the array was a numeric array since the non-null string value would cast to a null numeric
* fix isNull/isNotNull to correctly handle any type of input argument
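The array_contains bug described above can be reproduced in miniature. The
sketch below uses hypothetical helper names and plain Java collections rather
than Druid's expression classes, and the guarded variant is only one possible
way to avoid the problem, not necessarily the fix applied in this change.

```java
import java.util.List;

/** Hypothetical illustration only; not Druid's expression code. */
final class ArrayContainsNullCastSketch
{
  // A non-numeric string cannot be cast to a long, so it becomes null.
  static Long castToLong(String s)
  {
    try {
      return Long.valueOf(s);
    }
    catch (NumberFormatException e) {
      return null;
    }
  }

  // Buggy: the needle is cast first, so 'hello' turns into null and
  // then "matches" the null element of a numeric array.
  static boolean naiveContains(List<Long> array, String needle)
  {
    return array.contains(castToLong(needle));
  }

  // Guarded: a non-null needle that casts to null cannot be present.
  static boolean guardedContains(List<Long> array, String needle)
  {
    Long cast = castToLong(needle);
    if (needle != null && cast == null) {
      return false;
    }
    return array.contains(cast);
  }
}
```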
Register a Ser-De for RowsAndColumns so that results of window operator queries running on leaf operators are transferred properly on the wire. This fixes the empty response returned by window queries without group by on the native engine.
(cherry picked from commit bb1c3c1749)
PR #16890 introduced a change to skip adding tombstone segments to the cache.
As a side effect, tombstone segments appear unavailable in the console. This happens because the availability of a segment on the Broker is determined from the metadata cache.
The fix is to keep tombstone segments in the metadata cache but skip refreshing them.
This doesn't affect any functionality: a metadata query for a tombstone returns empty, which would otherwise cause those segments to be refreshed continuously.
While writing to a frame, String arrays are written by setting the multivalue byte.
But while reading, the multivalue flag was hardcoded to false.
(cherry picked from commit c7c3307e61)
* transition away from StorageAdapter
changes:
* CursorHolderFactory has been renamed to CursorFactory and moved off of StorageAdapter, instead fetched directly from the segment via 'asCursorFactory'. The previous deprecated CursorFactory interface has been merged into StorageAdapter
* StorageAdapter is no longer used by any engines or tests and has been marked as deprecated with default implementations of all methods that throw exceptions indicating the new methods to call instead
* StorageAdapter methods not covered by CursorFactory (CursorHolderFactory prior to this change) have been moved into interfaces which are retrieved by Segment.as, the primary classes are the previously existing Metadata, as well as new interfaces PhysicalSegmentInspector and TopNOptimizationInspector
* added UnnestSegment and FilteredSegment that extend WrappedSegmentReference since their StorageAdapter implementations were previously provided by WrappedSegmentReference
* added PhysicalSegmentInspector which covers some of the previous StorageAdapter functionality which was primarily used for segment metadata queries and other metadata uses, and is implemented for QueryableIndexSegment and IncrementalIndexSegment
* added TopNOptimizationInspector to cover the oddly specific StorageAdapter.hasBuiltInFilters implementation, which is implemented for HashJoinSegment, UnnestSegment, and FilteredSegment
* Updated all engines and tests to no longer use StorageAdapter
Problem
Currently, the delta input source only supports reading from the latest snapshot of the given Delta Lake table. This is a known documented limitation.
Description
Add support for reading a specific Delta snapshot. By default, the Druid-Delta connector reads the latest snapshot of the Delta table in order to preserve compatibility. Users can specify a snapshotVersion to ingest change data events from Delta tables into Druid.
In the future, we can also add support for time-based snapshot reads. The Delta API to read time-based snapshots is not clear currently.
This commit aims to reject MVDs in window processing as we do not support them.
Prior to this commit, a query running a window aggregate partitioned by an MVD column would fail with a ClassCastException.
* Add framework for running MSQ tests with taskSpec instead of SQL
* Allow configurable datasegment for tests
* Add test
* Revert "Add test"
This reverts commit 79fb241545.
* Revert "Allow configurable datasegment for tests"
This reverts commit caf04ede2b.
In the compaction config, a range type partitionsSpec supports setting one of maxRowsPerSegment and targetRowsPerSegment. When compaction is run with the native engine, while maxRowsPerSegment = x results in segments of size x, targetRowsPerSegment = y results in segments of size 1.5 * y.
MSQ only supports rowsPerSegment = x as part of its tuning config, the resulting segment size being approx. x -- which is in line with maxRowsPerSegment behaviour in native compaction.
This PR makes the following changes:
- Use effective maxRowsPerSegment to pass as the rowsPerSegment parameter for MSQ.
- Persist rowsPerSegment as maxRowsPerSegment in lastCompactionState for MSQ.
- Use effective maxRowsPerSegment-based range spec in the CompactionStatus check for both Native and MSQ.
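A minimal sketch of what "effective maxRowsPerSegment" means, based only on
the 1.5x relationship described above. The class and method names are
hypothetical, and the exact rounding used by Druid is an assumption.

```java
/** Hypothetical illustration only; not the actual partitionsSpec code. */
final class EffectiveMaxRowsSketch
{
  /**
   * Resolves a single row limit from whichever of the two settings is
   * present. A target of y is treated as a maximum of 1.5 * y.
   */
  static int effectiveMaxRowsPerSegment(Integer maxRowsPerSegment, Integer targetRowsPerSegment)
  {
    if (maxRowsPerSegment != null) {
      return maxRowsPerSegment;
    }
    if (targetRowsPerSegment != null) {
      // Integer arithmetic for 1.5 * target (assumed rounding: truncation).
      return targetRowsPerSegment + targetRowsPerSegment / 2;
    }
    throw new IllegalArgumentException("Either maxRowsPerSegment or targetRowsPerSegment must be set");
  }
}
```

Under this sketch, a spec with targetRowsPerSegment = 3,000,000 resolves to
4,500,000, and that resolved value is what would be handed to MSQ as
rowsPerSegment.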
This reverts commit f1d24c868f.
Updating nimbus to version 9+ causes HTTP ERROR 500: java.lang.NoSuchMethodError: 'net.minidev.json.JSONObject com.nimbusds.jwt.JWTClaimsSet.toJSONObject()'
Refer to SAP/cloud-security-services-integration-library#429 (comment) for more details.
Updating nimbus.jose.jwt would require upgrading other libraries as well.
This patch adds "TypeCastSelectors", which is used when writing frames to
perform two coercions:
- When a numeric type is desired and the underlying type is non-numeric or
unknown, the underlying selector is wrapped, "getObject" is called and the
result is coerced using "ExprEval.ofType". This differs from the prior
behavior where the primitive methods like "getLong", "getDouble", etc, would
be called directly. This fixes an issue where a column would be read as
all-zeroes when its SQL type is numeric and its physical type is string, which
can happen when evolving a column's type from string to number.
- When an array type is desired, the underlying selector is wrapped,
"getObject" is called, and the result is coerced to Object[]. This coercion
replaces some earlier logic from #15917.
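The numeric coercion can be illustrated with a small sketch. The selector
interface and classes below are hypothetical simplifications, and
coerceToLong is a plain-Java stand-in for the coercion done by
"ExprEval.ofType", whose exact semantics are not reproduced here. The
assumption that a string-backed selector returns zero from "getLong" models
the all-zeroes symptom described above.

```java
/** Hypothetical illustration only; not the TypeCastSelectors code. */
final class TypeCastSketch
{
  interface ObjectSelector
  {
    Object getObject();

    long getLong();
  }

  /** A string-typed column: its primitive accessor has nothing numeric to return. */
  static final class StringSelector implements ObjectSelector
  {
    private final String value;

    StringSelector(String value)
    {
      this.value = value;
    }

    @Override
    public Object getObject()
    {
      return value;
    }

    @Override
    public long getLong()
    {
      return 0L; // assumed behavior that leads to the all-zeroes symptom
    }
  }

  // Stand-in for the real coercion: numbers pass through, strings are parsed.
  static Long coerceToLong(Object o)
  {
    if (o == null) {
      return null;
    }
    if (o instanceof Number) {
      return ((Number) o).longValue();
    }
    try {
      return Long.valueOf(o.toString().trim());
    }
    catch (NumberFormatException e) {
      return null;
    }
  }

  /** Reads through getObject and coerces, instead of calling getLong directly. */
  static Long readAsLong(ObjectSelector selector)
  {
    return coerceToLong(selector.getObject());
  }
}
```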
Description:
#16768 introduces new compaction APIs on the Overlord, `/compact/status` and `/compact/progress`.
But the corresponding `OverlordClient` methods do not return an object compatible with the actual
endpoints defined in `OverlordCompactionResource`.
This patch ensures that the objects are compatible.
Changes:
- Add `CompactionStatusResponse` and `CompactionProgressResponse`
- Use these as the return type in `OverlordClient` methods and as the response entity in `OverlordCompactionResource`
- Add `SupervisorCleanupModule` bound on the Coordinator to perform cleanup of supervisors.
Without this module, Coordinator cannot deserialize compaction supervisors.