1125 Commits

Author SHA1 Message Date
Zoltan Haindrich
9afdfb2dcf updates 2024-11-14 10:03:28 +00:00
Zoltan Haindrich
9448ed3825 enable trim for unnest 2024-11-14 09:53:17 +00:00
Zoltan Haindrich
fd564fda72 cleanup/enhance rule order to simplify 2024-11-14 09:07:44 +00:00
Zoltan Haindrich
f4d7ec2695 cleanup/enhance rule order to simplify 2024-11-14 07:51:22 +00:00
Zoltan Haindrich
fbfed8a07f update 2024-11-13 18:09:50 +00:00
Zoltan Haindrich
7d9693bd91 aa 2024-11-13 18:00:52 +00:00
Zoltan Haindrich
96f572e679 updates/etc 2024-11-13 17:03:06 +00:00
Zoltan Haindrich
0d09c18cab undo trials; re-introduce expr if needed (will be trimmed in most cases) 2024-11-13 16:48:31 +00:00
Zoltan Haindrich
8dfd57f922 update test 2024-11-13 16:30:08 +00:00
Zoltan Haindrich
1379c9aa2c Merge branch 'rename-d1-dbl1' into unnest-relfieldtrimmer-unnestfieldtype 2024-11-13 16:24:35 +00:00
Zoltan Haindrich
247ee17f17 fixups after merge 2024-11-13 15:35:00 +00:00
Zoltan Haindrich
a1b9f522fe Merge remote-tracking branch 'apache/master' into rename-d1-dbl1 2024-11-13 15:30:58 +00:00
Zoltan Haindrich
fc31c8a84a update iq files 2024-11-13 14:51:56 +00:00
Zoltan Haindrich
ced72b3ab7 fix nondefault 2024-11-13 14:40:12 +00:00
Zoltan Haindrich
08748436a5 fix some more 2024-11-13 14:20:41 +00:00
Zoltan Haindrich
62664e36e7 fix windowtests 2024-11-13 14:16:48 +00:00
Zoltan Haindrich
88e57f972a fix some more 2024-11-13 14:12:25 +00:00
Zoltan Haindrich
bd72b95777 fix some 2024-11-13 14:01:45 +00:00
Zoltan Haindrich
0d07cf137e dbl1/etc 2024-11-13 13:55:38 +00:00
Zoltan Haindrich
7777e48b77 dbl1/etc 2024-11-13 13:11:10 +00:00
Zoltan Haindrich
d4b4c94a3c Reapply "rename: d1/dbl1"
This reverts commit 0d03cc558781cfd4093c0ff7c87f03c7cc65e027.
2024-11-12 15:04:41 +00:00
Zoltan Haindrich
0d03cc5587 Revert "rename: d1/dbl1"
This reverts commit c49fc2796b5062d5886bc1ddd36ac3e524507c21.
2024-11-12 15:04:14 +00:00
Zoltan Haindrich
c49fc2796b rename: d1/dbl1 2024-11-12 15:04:11 +00:00
Zoltan Haindrich
04ab28f4cd one way 2024-11-12 14:28:25 +00:00
Zoltan Haindrich
e8f06bf90d make helper method 2024-11-12 14:19:29 +00:00
Zoltan Haindrich
322ec81fb0 make some assert function and fail with that 2024-11-12 14:12:30 +00:00
Zoltan Haindrich
d38215fd5a ok, but trim alone should have affected native query - right? 2024-11-12 11:46:30 +00:00
Zoltan Haindrich
f296102f05
ScanQuery should not ignore columnTypes in equals/hashCode (#17463)
* ScanQuery: equals/hashCode/toString
* DruidQuery: changes of Align ScanQuery column order with its desired signature #17457
* ScanQueryTest: add EqualsVerifier test
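
To illustrate the kind of bug the commit title describes, here is a minimal sketch in which a type list takes part in `equals`, `hashCode`, and `toString`. `ScanSpec` and its fields are hypothetical stand-ins, not Druid's actual `ScanQuery` class.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for a query object; NOT Druid's ScanQuery.
final class ScanSpec {
    private final List<String> columns;
    private final List<String> columnTypes;

    ScanSpec(List<String> columns, List<String> columnTypes) {
        this.columns = List.copyOf(columns);
        this.columnTypes = List.copyOf(columnTypes);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof ScanSpec)) {
            return false;
        }
        ScanSpec that = (ScanSpec) o;
        // Leaving columnTypes out here would make two specs with
        // different types compare as equal.
        return columns.equals(that.columns)
            && columnTypes.equals(that.columnTypes);
    }

    @Override
    public int hashCode() {
        // Must cover the same fields as equals().
        return Objects.hash(columns, columnTypes);
    }

    @Override
    public String toString() {
        return "ScanSpec{columns=" + columns + ", columnTypes=" + columnTypes + "}";
    }
}
```

With this shape, two specs that share column names but differ in types no longer compare as equal, which matters wherever such objects are compared in tests or used as map keys.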
2024-11-12 14:26:59 +05:30
Zoltan Haindrich
b45b3ba495 run trimmer 2024-11-11 13:56:06 +00:00
Zoltan Haindrich
e7061ad04a more wood 2024-11-11 13:10:52 +00:00
Zoltan Haindrich
4e0a37a59b more wood 2024-11-11 12:39:30 +00:00
Zoltan Haindrich
2ceefdc079 try-unnestfiled-instead-rowtype
(cherry picked from commit c4eb2cf1b82c7a332d57701e430ac582a986593f)
2024-11-11 11:15:40 +00:00
Zoltan Haindrich
314d3f7a17 some updates 2024-11-11 10:28:11 +00:00
Zoltan Haindrich
00837c56ed cleanup 2024-11-09 16:54:13 +00:00
Zoltan Haindrich
54b057e81d x 2024-11-08 17:53:59 +00:00
Zoltan Haindrich
7021b3f42c working? 2024-11-08 17:33:31 +00:00
Zoltan Haindrich
2eac8318f8
Support Union in Decoupled planning (#17354)
* introduces `UnionQuery`
* some changes to enable a `UnionQuery` to have multiple input datasources
* `UnionQuery` execution is driven by the `QueryLogic`, which could later help reduce some complexity in `ClientQuerySegmentWalker`
* running the subqueries of `UnionQuery` requires access to the `conglomerate` from the `Runner`; some refactors were done to enable that
* renamed `UnionQueryRunner` to `UnionDataSourceQueryRunner`
* `QueryRunnerFactoryConglomerate` has taken the place of `QueryToolChestWarehouse`, which shaves off some unnecessary things here and there
* small cleanup/refactors
2024-11-05 16:58:57 +01:00
Tom
e4cdbca23c
make planner errors be user persona (#17437)
Change the persona for errors within the planner from Admin to User. The ADMIN persona is meant to be "a persona who is interacting with admin APIs and understands Druid query concepts". This isn't an admin API, it's a query API. Low quality error messages being returned to the correct audience is better than hiding all error messages.

Some of the errors that can be returned are solvable by the user, while others require a Druid expert. But the errors do not leak information that should only be seen by more expert/privileged personas.

The original ADMIN persona showed some reticence to tag low-quality error messages with a USER persona, but it really does seem user-directed to me, so USER makes sense.
2024-11-04 10:48:35 -08:00
Akshat Jain
21e7e5cddd
Add benchmark suite for MSQ window functions (#17377)
* Add benchmark suite for MSQ window functions

* Fix inspection checks

* Address review comment: Rename method
2024-10-30 11:32:28 +05:30
Gian Merlino
446a8f466f
Update errorprone, mockito, jacoco, checkerframework. (#17414)
* Update errorprone, mockito, jacoco, checkerframework.

This patch updates various build and test dependencies, to see if they
cause unit tests on JDK 21 to behave more reliably.

* Update licenses, tests.

* Remove assertEquals.

* Repair two tests.

* Update some more tests.
2024-10-28 11:34:03 -07:00
Abhishek Radhakrishnan
43b325b6aa
Add missing @Nullable annotations to SqlQuery (#17398) 2024-10-22 20:34:46 -07:00
Abhishek Radhakrishnan
187e21afae
Add BrokerClient implementation (#17382)
This patch is extracted from PR 17353.

Changes:

- Added BrokerClient and BrokerClientImpl to the sql package that leverages the ServiceClient functionality; similar to OverlordClient and CoordinatorClient implementations in the server module.
- For now, only two broker API stubs are added: submitSqlTask() and fetchExplainPlan().
- Added a new POJO class ExplainPlan that encapsulates explain plan info.
- Deprecated org.apache.druid.discovery.BrokerClient in favor of the new BrokerClient in this patch.
- Clean up ExplainAttributesTest a bit and added serde verification.
2024-10-21 11:05:53 -07:00
Abhishek Radhakrishnan
9a16d4e219
Move SqlTaskStatus and SqlTaskStatusTest from msq module to sql module. (#17380)
- This is a non-functional change that moves SqlTaskStatus and its unit test SqlTaskStatusTest from the msq module to the sql module to help class reuse in other places.
- This refactor is extracted from this PR to facilitate easier review.
- Fix a minor spacing issue in the TaskStartTimeoutFault error message.
2024-10-18 14:39:01 -07:00
Shivam Garg
6898a5a359
Removed Microsecond from Extract function (#17247) 2024-10-11 05:32:26 +02:00
Clint Wylie
ab0d6eb620
Fix string array grouping comparator (#17183) 2024-10-08 09:47:28 +05:30
Clint Wylie
04fe56835d
add druid.expressions.allowVectorizeFallback and default to false (#17248)
changes:

- adds ExpressionProcessing.allowVectorizeFallback() and ExpressionProcessingConfig.allowVectorizeFallback(), defaulting to false until the few remaining bugs can be fixed (mostly complex types and some odd interactions with mixed types)
- adds cannotVectorizeUnlessFallback functions to make it easy to toggle the default of this config, and easy to know what to delete when we remove it in the future
2024-10-05 12:42:42 +05:30
Zoltan Haindrich
65277b17a9
Decoupled planning: add support for unnest (#17177)
* adds support for `UNNEST` expressions
* introduces `LogicalUnnestRule` to transform a `Correlate` doing UNNEST into a `LogicalUnnest`
* `UnnestInputCleanupRule` could move the final unnested expr into the `LogicalUnnest` itself (usually it's an `mv_to_array` expression)
* enhanced source unwrapping to utilize `FilteredDataSource` if it looks right
2024-10-02 08:54:56 +02:00
Gian Merlino
878adff9aa
MSQ profile for Brokers and Historicals. (#17140)
This patch adds a profile of MSQ named "Dart" that runs on Brokers and
Historicals, and which is compatible with the standard SQL query API.
For more high-level description, and notes on future work, refer to #17139.

This patch contains the following changes, grouped into packages.

Controller (org.apache.druid.msq.dart.controller):

The controller runs on Brokers. The main classes are:

- DartSqlResource, which serves /druid/v2/sql/dart/.
- DartSqlEngine and DartQueryMaker, the entry points from SQL that actually
  run the MSQ controller code.
- DartControllerContext, which configures the MSQ controller.
- DartMessageRelays, which sets up relays (see "message relays" below) to read
  messages from workers' DartControllerClients.
- DartTableInputSpecSlicer, which assigns work based on a TimelineServerView.

Worker (org.apache.druid.msq.dart.worker)

The worker runs on Historicals. The main classes are:

- DartWorkerResource, which supplies the regular MSQ WorkerResource, plus
  Dart-specific APIs.
- DartWorkerRunner, which runs MSQ worker code.
- DartWorkerContext, which configures the MSQ worker.
- DartProcessingBuffersProvider, which provides processing buffers from
  sliced-up merge buffers.
- DartDataSegmentProvider, which provides segments from the Historical's
  local cache.

Message relays (org.apache.druid.messages):

To avoid the need for Historicals to contact Brokers during a query, which
would create opportunities for queries to get stuck, all connections are
opened from Broker to Historical. This is made possible by a message relay
system, where the relay server (worker) has an outbox of messages.

The relay client (controller) connects to the outbox and retrieves messages.
Code for this system lives in the "server" package to keep it separate from
the MSQ extension and make it easier to maintain. The worker-to-controller
ControllerClient is implemented using message relays.
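
The outbox idea described above can be sketched roughly as follows. This is a toy illustration only: `Outbox`, `post`, and `fetch`, as well as the offset-based acknowledgement scheme, are assumptions made for the example and are not the classes or protocol in org.apache.druid.messages.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical outbox held by the relay server (worker side).
// The relay client (controller side) always initiates the call.
final class Outbox<T> {
    private final List<T> messages = new ArrayList<>();
    private long firstOffset = 0; // offset of messages.get(0)

    // Called by the worker when it has something to tell the controller.
    synchronized void post(T message) {
        messages.add(message);
    }

    // Called by the controller. Asking for fromOffset acknowledges all
    // earlier messages, so they can be dropped from the outbox.
    synchronized List<T> fetch(long fromOffset) {
        long endOffset = firstOffset + messages.size();
        if (fromOffset < firstOffset || fromOffset > endOffset) {
            throw new IllegalArgumentException("Unknown offset: " + fromOffset);
        }
        int acknowledged = (int) (fromOffset - firstOffset);
        messages.subList(0, acknowledged).clear();
        firstOffset = fromOffset;
        return new ArrayList<>(messages);
    }
}
```

A client in this sketch would track the next offset it expects, call `fetch(next)`, and advance `next` by the size of the returned batch, so every connection is opened from the controller side.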

Other changes:

- Controller: Added the method "hasWorker". Used by the ControllerMessageListener
  to notify the appropriate controllers when a worker fails.
- WorkerResource: No longer tries to respond more than once in the
  "httpGetChannelData" API. This comes up when a response due to a resolved
  future is ready at about the same time as a timeout occurs.
- MSQTaskQueryMaker: Refactor to separate out some useful functions for reuse
  in DartQueryMaker.
- SqlEngine: Add "queryContext" to "resultTypeForSelect" and "resultTypeForInsert".
  This allows the DartSqlEngine to modify result format based on whether a "fullReport"
  context parameter is set.
- LimitedOutputStream: New utility class. Used when in "fullReport" mode.
- TimelineServerView: Add getDruidServerMetadata as a performance optimization.
- CliHistorical: Add SegmentWrangler, so it can query inline data, lookups, etc.
- ServiceLocation: Add "fromUri" method, relocating some code from ServiceClientImpl.
- FixedServiceLocator: New locator for a fixed set of service locations. Useful for
  URI locations.
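
For readers unfamiliar with the pattern behind a utility such as LimitedOutputStream, here is a minimal sketch of an output stream that caps the number of bytes written. The commit message does not describe the class's behavior, so everything here (the name `CappedOutputStream`, the constructor, and failing with an `IOException`) is an assumption made for illustration, not Druid's implementation.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical byte-capped stream; NOT Druid's LimitedOutputStream.
final class CappedOutputStream extends FilterOutputStream {
    private final long limit;
    private long written;

    CappedOutputStream(OutputStream out, long limit) {
        super(out);
        this.limit = limit;
    }

    @Override
    public void write(int b) throws IOException {
        reserve(1);
        out.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        reserve(len);
        out.write(b, off, len);
    }

    // Fails before any byte of an over-limit write reaches the sink.
    private void reserve(int n) throws IOException {
        if (written + n > limit) {
            throw new IOException("Output limit of " + limit + " bytes exceeded");
        }
        written += n;
    }
}
```

Checking the limit before delegating means an over-limit write is rejected whole rather than partially flushed to the underlying stream.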
2024-10-01 14:38:55 -07:00
Sree Charan Manamala
661614129e
Window Functions : Context Parameter to Enable Transfer of RACs over wire (#17150) 2024-09-28 08:04:22 +02:00
Gian Merlino
dc223f22db
SQL: Use regular filters for time filtering in subqueries. (#17173)
* SQL: Use regular filters for time filtering in subqueries.

Using the "intervals" feature on subqueries, or any non-table, should be
avoided because it isn't a meaningful optimization in those cases, and
it's simpler for runtime implementations if they can assume all filters
are located in the regular filter object.

Two changes:

1) Fix the logic in DruidQuery.canUseIntervalFiltering. It was intended
   to return false for QueryDataSource, but actually returned true.

2) Add a validation to ScanQueryFrameProcessor to ensure that when running
   on an input channel (which would include any subquery), the query has
   "intervals" set to ONLY_ETERNITY.

Prior to this patch, the new test case in testTimeFilterOnSubquery would
throw a "Can only handle a single interval" error in the native engine,
and "QueryNotSupported" in the MSQ engine.

* Mark new case as having extra columns in decoupled mode.

* Adjust test.
2024-09-27 10:32:30 +05:30