Commit Graph

1044 Commits

Author SHA1 Message Date
Zoltan Haindrich 1a0ab2c3b1 Merge remote-tracking branch 'apache/master' into quidem-record 2024-06-19 12:59:26 +00:00
Zoltan Haindrich dffa331018 updates/etc 2024-06-18 16:33:22 +00:00
Zoltan Haindrich d14b7374ba add 2024-06-18 15:59:29 +00:00
Zoltan Haindrich e725df7110 fix loader 2024-06-18 15:43:35 +00:00
Zoltan Haindrich f5720ce97d u 2024-06-17 16:05:29 +00:00
Zoltan Haindrich 7eccf5b518 add some validation/etc 2024-06-17 15:42:48 +00:00
Zoltan Haindrich e06e54631e fix style; emitter 2024-06-17 14:26:48 +00:00
Zoltan Haindrich 47696a6108 updates 2024-06-17 13:05:11 +00:00
Laksh Singla da1e293a57
Deserialize dimensions in group by queries to their respective types when reading from their serialized format (#16511)
* init

* tests, pair groupable

* framework change

* tests

* update benchmarks

* comments

* add javadoc for the jsonMapper

* remove extra deserialization

* add special serde for map based result rows

* revert unnecessary change

---------

Co-authored-by: asdf2014 <asdf2014@apache.org>
2024-06-14 16:27:47 +08:00
Zoltan Haindrich ac19b148c2
Upgrade calcite to 1.37.0 (#16504)
* contains Make a full copy of the parser and apply our modifications to it #16503
* some minor api changes pair/entry
* some unnecessary aggregation was removed from a set of queries in `CalciteSubqueryTest`
* `AliasedOperatorConversion` was detecting `CHAR_LENGTH` as not a function ; I've removed the check
  * the field it was using doesn't look maintained that much
  * the `kind` is passed for the created `SqlFunction` so I don't think this check is actually needed
* some decoupled test cases become broken - will be fixed later
* some aggregate related changes: due to the fact that SUM() and COUNT() of no inputs are different
* upgrade avatica to 1.25.0
* `CalciteQueryTest#testExactCountDistinctWithFilter` is now executable

Close apache/druid#16503
2024-06-13 08:47:50 +02:00
Zoltan Haindrich f8645de341
Remove incorrect utf8 conversion of ResultCache keys (#16569) 2024-06-12 13:12:05 -07:00
Clint Wylie fee509df2e
fix NestedDataColumnIndexerV4 to not report cardinality (#16507)
* fix NestedDataColumnIndexerV4 to not report cardinality
changes:
* fix issue similar to #16489 but for NestedDataColumnIndexerV4, which can report STRING type if it only processes a single type of values. this should be less common than the auto indexer problem
* fix some issues with sql benchmarks
2024-06-11 20:58:12 -07:00
zachjsh 3f5f5921e0
Fix sql syntax error user (#16583)
This fixes an issue where in some cases, a SQL syntax error encountered when parsing / planning a query results in an error returned to the user with persona a `admin` when it should instead be `user`.
2024-06-11 18:08:35 -04:00
Zoltan Haindrich 7a65938fd6 use druidhook instead Hook 2024-06-11 15:55:55 +00:00
Clint Wylie 3fb6ba22e8
fix expression column capabilities to not report dictionary encoded unless input is string (#16577) 2024-06-08 13:05:19 -07:00
Gian Merlino 277006446d
Fallback vectorization for FunctionExpr and BaseMacroFunctionExpr. (#16366)
* Fallback vectorization for FunctionExpr and BaseMacroFunctionExpr.

This patch adds FallbackVectorProcessor, a processor that adapts non-vectorizable
operations into vectorizable ones. It is used in FunctionExpr and BaseMacroFunctionExpr.

In addition:

- Identifiers are updated to offer getObjectVector for ARRAY and COMPLEX in addition
  to STRING. ExprEvalObjectVector is updated to offer ARRAY and COMPLEX as well.

- In SQL tests, cannotVectorize now fails tests if an exception is not thrown. This makes
  it easier to identify tests that can now vectorize.

- Fix a null-matcher bug in StringObjectVectorValueMatcher.

* Fix tests.

* Fixes.

* Fix tests.

* Fix test.

* Fix test.
2024-06-05 20:03:02 -07:00
Gian Merlino b837ce565b
Simplify serialized form of JsonInputFormat. (#15691)
* Simplify serialized form of JsonInputFormat.

Use JsonInclude for keepNullColumns, assumeNewlineDelimited, and
useJsonNodeReader. Because the default value of keepNullColumns is
variable, we store the original configured value rather than the
derived value, and include if the original value is nonnull.

* Fix test.
2024-06-05 20:01:14 -07:00
Gian Merlino 1040a29bc5
Fix capabilities reported by UnnestStorageAdapter. (#16551)
UnnestStorageAdapter and its cursors did not return capabilities correctly
for the output column. This patch fixes two problems:

1) UnnestStorageAdapter returned the capabilities of the unnest virtual
   column prior to unnesting. It should return the post-unnest capabilities.

2) UnnestColumnValueSelectorCursor passed through isDictionaryEncoded from
   the unnest virtual column. This is incorrect, because the dimension selector
   created by this class never has a dictionary. This is the cause of #16543.
2024-06-05 15:19:42 -07:00
Akshat Jain 6d7d2ffa63
Add interface method for returning canonical lookup name (#16557)
* Add interface method for returning canonical lookup name

* Address review comment

* Add test in LookupReferencesManagerTest for coverage check

* Add test in LookupSerdeModuleTest for coverage check
2024-06-05 14:33:18 -07:00
Abhishek Radhakrishnan b9ba286423
Fix task bootstrapping & simplify segment load/drop flows (#16475)
* Fix task bootstrap locations.

* Remove dependency of SegmentCacheManager from SegmentLoadDropHandler.

- The load drop handler code talks to the local cache manager via
SegmentManager.

* Clean up unused imports and stuff.

* Test fixes.

* Intellij inspections and test bind.

* Clean up dependencies some more

* Extract test load spec and factory to its own class.

* Cleanup test util

* Pull SegmentForTesting out to TestSegmentUtils.

* Fix up.

* Minor changes to infoDir

* Replace server announcer mock and verify that.

* Add tests.

* Update javadocs.

* Address review comments.

* Separate methods for download and bootstrap load

* Clean up return types and exception handling.

* No callback for loadSegment().

* Minor cleanup

* Pull out the test helpers into its own static class so it can have better state control.

* LocalCacheManager stuff

* Fix build.

* Fix build.

* Address some CI warnings.

* Minor updates to javadocs and test code.

* Address some CodeQL test warnings and checkstyle fix.

* Pass a Consumer<DataSegment> instead of boolean & rename variables.

* Small updates

* Remove one test constructor.

* Remove the other constructor that wasn't initializing fully and update usages.

* Cleanup withInfoDir() builder and unnecessary test hooks.

* Remove mocks and elaborate on comments.

* Commentary

* Fix a few Intellij inspection warnings.

* Suppress corePoolSize intellij-inspect warning.

The intellij-inspect tool doesn't seem to correctly inspect
lambda usages. See ScheduledExecutors.

* Update docs and add more tests.

* Use hamcrest for asserting order on expectation.

* Shutdown bootstrap exec.

* Fix checkstyle
2024-06-04 10:44:46 -07:00
Sree Charan Manamala 6bbf9613f8
Throw soft exception in case of empty signature while building Scan Query (#16502) 2024-05-29 09:41:54 +02:00
Sree Charan Manamala 27cfe12f4a
Enable reordering of window operators (#16482)
This commit aims to enable the re-ordering of window operators in order to optimise
the sort and partition operators.
Example : 
```
SELECT m1, m2,
SUM(m1) OVER(PARTITION BY m2) as sum1,
SUM(m2) OVER() as sum2
from numFoo
GROUP BY m1,m2
```

In order to compute this query, we can order the operators as to first compute the operators
corresponding to sum2 and then place the operators corresponding to sum1 which would
help us in reducing one sort operator if we order our operators by sum1 and then sum2.
2024-05-29 12:17:12 +05:30
Zoltan Haindrich 88da8d9659 rename 2024-05-28 16:21:04 +00:00
Zoltan Haindrich ff7abeb0f1 small cleanup 2024-05-28 15:54:29 +00:00
Zoltan Haindrich 07e05a664d cleasnup 2024-05-28 15:30:06 +00:00
Zoltan Haindrich ff31c14dba clkeaup 2024-05-28 15:27:18 +00:00
Zoltan Haindrich a713b663b0 undo 2024-05-28 15:23:04 +00:00
Zoltan Haindrich a1b7f981fb cleanup 2024-05-28 15:09:42 +00:00
Zoltan Haindrich b20ee99371 clean 2024-05-28 15:07:09 +00:00
Zoltan Haindrich 9ae80f05de Merge remote-tracking branch 'kgyrtkirk/quidem-runner-extension-submit' into quidem-record 2024-05-27 10:52:01 +00:00
Zoltan Haindrich ec2ecde235 updates 2024-05-27 10:49:38 +00:00
Clint Wylie 4e1de50e30
fix issue with auto column grouping (#16489)
* fix issue with auto column grouping
changes:
* fixes bug where AutoTypeColumnIndexer reports incorrect cardinality, allowing it to incorrectly use array grouper algorithm for realtime queries producing incorrect results for strings
* fixes bug where auto LONG and DOUBLE type columns incorrectly report not having null values, resulting in incorrect null handling when grouping

* fix test
2024-05-27 11:18:17 +05:30
zachjsh b0cc1ee84b
Add ability to turn off Druid Catalog specific validation done on catalog defined tables in Druid (#16465)
* * add property to enable / disable catalog validation and add tests

* * add integration tests for catalog validation disabled

* * add integration tests

* * remove debugging logs

* * fix forbidden api call
2024-05-23 13:19:51 -04:00
Zoltan Haindrich 12f79acc7e
Enable quidem shadowing for decoupled testcases (#16431)
* Altered `QueryTestBuilder` to be able to switch to a backing quidem test
* added a small crc to ensure that the shadow testcase does not deviate from the original one
* Packaged all decoupled related things into a a single `DecoupledExtension` to reduce copy-paste
* `DecoupledTestConfig#quidemReason` must describe why its being used
* `DecoupledTestConfig#separateDefaultModeTest` can be used to make multiple case files based on `NullHandling` state
* fixed a cosmetic bug during decoupled join translation
* enhanced `!druidPlan` to report the final logical plan in non-decoupled mode as well
* add check to ensure that only supported params are present in a druidtest uri
* enabled shadow testcases for previously disabled testcases
2024-05-23 07:03:16 +02:00
Zoltan Haindrich 4595b0c128 u[pdate 2024-05-21 14:18:55 +00:00
Zoltan Haindrich ba09e7d1de add 2024-05-21 14:00:02 +00:00
Zoltan Haindrich 08b73d1969 Revert "some stuff"
This reverts commit 52598d3bca.
2024-05-21 12:02:46 +00:00
Zoltan Haindrich 52598d3bca some stuff 2024-05-21 12:02:44 +00:00
Gian Merlino 599586bcfc
Add SQL DIV function. (#16464)
* Add SQL DIV function.

This function has been documented for some time, but lacked a binding,
so it wasn't usable.

* Add a case with two expression inputs.
2024-05-17 11:11:32 -07:00
Zoltan Haindrich e7e119b559 reduce copypaste 2024-05-16 13:33:27 +00:00
Zoltan Haindrich f4c73e1499 make old defaults to overides 2024-05-16 11:31:28 +00:00
Zoltan Haindrich 59be71c068 add other 2024-05-16 11:23:26 +00:00
Zoltan Haindrich 607ef174c5 indent 2024-05-16 11:18:54 +00:00
Zoltan Haindrich 3658ab24c3 remove f 2024-05-16 11:18:24 +00:00
Zoltan Haindrich bec1f38a0e move sqlmodule down 2024-05-16 11:17:05 +00:00
Zoltan Haindrich 688611eab3 undo 2024-05-16 11:11:55 +00:00
Zoltan Haindrich 93892b6524 undo some 2024-05-16 11:11:03 +00:00
Zoltan Haindrich 118eb61939 there - with 1 boot 2024-05-16 10:31:38 +00:00
Zoltan Haindrich 28ea884e19 almost ready? 2024-05-16 10:01:22 +00:00
Zoltan Haindrich 8ee41f58d0 it does work 2024-05-15 15:14:43 +00:00