Reduction of nullable DATE and TIMESTAMP expressions did not perform
a necessary null check, so would in some cases reduce to
1970-01-01 00:00:00 (epoch) rather than NULL.
changes:
* Added `CursorBuildSpec` which captures all of the 'interesting' stuff that goes into producing a cursor as a replacement for the method arguments of `CursorFactory.canVectorize`, `CursorFactory.makeCursor`, and `CursorFactory.makeVectorCursor`
* added new interface `CursorHolder` and new interface `CursorHolderFactory` as a replacement for `CursorFactory`, with method `makeCursorHolder`, which takes a `CursorBuildSpec` as an argument and replaces `CursorFactory.canVectorize`, `CursorFactory.makeCursor`, and `CursorFactory.makeVectorCursor`
* `CursorFactory.makeCursors` previously returned a `Sequence<Cursor>` corresponding to the query granularity buckets, with a separate `Cursor` per bucket. `CursorHolder.asCursor` instead returns a single `Cursor` (equivalent to 'ALL' granularity), and a new `CursorGranularizer` has been added for query engines to iterate over the cursor and divide into granularity buckets. This makes the non-vectorized engine behave the same way as the vectorized query engine (with its `VectorCursorGranularizer`), and simplifies a lot of stuff that has to read segments particularly if it does not care about bucketing the results into granularities.
* Deprecated `CursorFactory`, `CursorFactory.canVectorize`, `CursorFactory.makeCursors`, and `CursorFactory.makeVectorCursor`
* updated all `StorageAdapter` implementations to implement `makeCursorHolder`, transitioned direct `CursorFactory` implementations to instead implement `CursorMakerFactory`. `StorageAdapter` being a `CursorMakerFactory` is intended to be a transitional thing, ideally will not be released in favor of moving `CursorMakerFactory` to be fetched directly from `Segment`, however this PR was already large enough so this will be done in a follow-up.
* updated all query engines to use `makeCursorHolder`, granularity based engines to use `CursorGranularizer`.
Upgrade/Downgrade between any version till or before Druid 30 where the newer version runs a worker task, while the older version runs a controller task can fail. The patch removes that verification check till its safe to add it back.
Handle the following Delta complex types:
a. StructType as JSON
b. ArrayType as Java list
c. MapType as Java map
Generate and add a new Delta table complex-types-table that contains the above complex types for testing.
Update the tests to include a parameterized test with complex-types-table, with the expectations defined in ComplexTypesDeltaTable.java.
* Fix build
* Run coldSchemaExec thread periodically
* Bugfix: Run cold schema refresh periodically
* Rename metrics for deep storage only segment schema process
* Fix typo in waitUntilSegmentsLoad.
* Add a note on configuring druid.segmentCache.locations for broadcast rules.
* Update docs/operations/rule-configuration.md
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
---------
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* Use fuzzy matchers for compaction bytes asserts.
This still enables us to test that the bytes are zero and nonzero
when they're supposed to be, without having to ge them exactly
right. The need to get bytes exactly right makes it difficult to
ensure ITs pass when making changes to default segment metadata.
* Additional fuzziness.
Bug description:
Peons to fail to start up when `WorkerTaskCountStatsMonitor` is used on MiddleManagers.
This is because MiddleManagers pass on their properties to peons and peons are unable to
find `IndexerTaskCountStatsProvider` as that is bound only for indexer nodes.
Fix:
Check if node is an indexer before trying to get instance of `IndexerTaskCountStatsProvider`.
Fixes#13936
In cases where a supervisor is idle and the overlord is restarted for some reason, the supervisor would
start spinning tasks again. In clusters where there are many low throughput streams, this would spike
the task count unnecessarily.
This commit compares the latest stream offset with the ones in metadata during the startup of supervisor
and sets it to idle state if they match.
* SQL syntax error should target USER persona
* * revert change to queryHandler and related tests, based on review comments
* * add test
* Properly handle Druid schema blending with catalog definition and segment metadata
* * add javadocs
Adds a configuration clientConnectTimeout to our http client config which controls the connection timeout for our http client requests.
It was observed that on busy K8S clusters, the default connect timeout of 500ms is sometimes not enough time to complete syn/acks for a request and in these cases, the requests timeout with the error:
exceptionType=java.net.SocketTimeoutException, exceptionMessage=Connect Timeout
This behavior was mostly observed on the router while forwarding queries to the broker.
Having a slightly higher connect timeout helped resolve these issues.
Added a new revised IT group BackwardCompatibilityMain. The idea is to catch potential backward compatibility issues that may arise during rolling upgrade.
This test group runs a docker-compose cluster with Overlord & Coordinator service on the previous druid version.
Following env vars are required in the GHA file .github/workflows/unit-and-integration-tests-unified.yml to run this test
DRUID_PREVIOUS_VERSION -> Previous druid version to test backward incompatibility.
DRUID_PREVIOUS_VERSION_DOWNLOAD_URL -> URL to fetch the tar.
* SQL syntax error should target USER persona
* * revert change to queryHandler and related tests, based on review comments
* * add test
* Docs for Kinesis input format
* * remove reference to kafka
* * fix spellcheck error
* Apply suggestions from code review
Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
---------
Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
Two performance enhancements:
1) Direct merging of input frames to output channels, without any
temporary files, if all input frames fit in memory.
2) When doing multi-level merging (now called "external mode"),
improve parallelism by boosting up the number of mergers in the
penultimate level.
To support direct merging, FrameChannelMerger is enhanced such that the
output partition min/max values are used to filter input frames. This
is necessary because all direct mergers read all input frames, but only
rows corresponding to a single output partition.
Some general refactors across Druid.
Switch to DruidExceptions
Add javadocs
Fix a bug in IntArrayColumns
Add a class for LongArrayColumns
Remove wireTransferable since it would never be called
Refactor DictionaryWriter to return the index written as a return value from write.
Refactors the SemanticCreator annotation.
Moves the interface to the semantic package.
Create a SemanticUtils to hold logic for storing semantic maps.
Add FrameMaker interface.
This PR adds checks for verification of DataSourceCompactionConfig and CompactionTask with msq engine to ensure:
each aggregator in metricsSpec is idempotent
metricsSpec is non-null when rollup is set to true
Unit tests and existing compaction ITs have been updated accordingly.
Background:
ZK-based segment loading has been completely disabled in #15705 .
ZK `servedSegmentsPath` has been deprecated since Druid 0.7.1, #1182 .
This legacy path has been replaced by the `liveSegmentsPath` and is not used in the code anymore.
Changes:
- Never create ZK loadQueuePath as it is never used.
- Never create ZK servedSegmentsPath as it is never used.
- Do not create ZK liveSegmentsPath if announcement on ZK is disabled
- Fix up tests
* SQL: Add ProjectableFilterableTable to SegmentsTable.
This allows us to skip serialization of expensive fields such as
shard_spec, dimensions, metrics, and last_compaction_state, if those
fields are not actually being queried.
* Restructure logic to avoid unnecessary toString() as well.
This PR fixes query correctness issues for MSQ window functions when using more than 1 worker (that is, maxNumTasks > 2).
Currently, we were keeping the shuffle spec of the previous stage when we didn't have any partition columns for window stage. This PR changes it to override the shuffle spec of the previous stage to MixShuffleSpec (if we have a window function with empty over clause) so that the window stage gets a single partition to work on.
A test has been added for a query which returned incorrect results prior to this change when using more than 1 workers.
When a window is defined as WINDOW W AS <DEF> and using a syntax of (PARTITION BY col1 ORDER BY col2 ROWS x PRECEDING), we would need to default the other bound to CURRENT ROW
We already have implemented this earlier, but when defined as WINDOW W AS <DEF>, Calcite takes a different route to validate the window.