druid

Commit Graph

Author	SHA1	Message	Date
Zoltan Haindrich	2e31cb2901	DrillWindowQueryTest: use proper way to decide if the query is ordered (#15118 )	2023-10-23 10:54:28 -04:00
Zoltan Haindrich	b95035f183	Fix VirtualColumn related issues in window expressions (#15119 ) For some exotic queries like: SELECT '_'||dim1, MIN(cast(0 as double)) OVER (), MIN(cast((cnt||cnt) as bigint)) OVER () FROM foo the compilation resulted in NPEs, mostly because VirtualColumns were not handled properly	2023-10-23 14:05:59 +05:30
Zoltan Haindrich	fbbb9c7730	Allow DESC ordering in window expressions (#15195 )	2023-10-20 07:55:28 -04:00
Zoltan Haindrich	9fb0dbfc9f	Fix json inputs for drill windowing tests (#15148 ) This PR: adds a flag to JsonToParquet to do the fix during conversion; updates the json files to more correct contents (some resultset mismatches were fixed by this); updates parquet to 1.13.1	2023-10-19 14:02:41 +05:30
Clint Wylie	061cfee224	add native filters for "(filter) is true" and "(filter) is false" (#15182 ) * add native filters for "(filter) is true" and "(filter) is false" changes: * add IsTrueDimFilter, IsFalseDimFilter, and abstract IsBooleanDimFilter for native json filter implementations of `(filter) IS TRUE` and `(filter) IS FALSE` * add IsBooleanFilter for actual filtering logic for these filters, which ignore includeUnknown to always use matches with false for true and !matches with true for false * fix test incorrectly adjusted to wrong answer in #15058 * add tests for default value mode	2023-10-18 13:07:35 -07:00
Zoltan Haindrich	c58b7f40ee	Rename windowing option (#15184 )	2023-10-18 10:54:20 +05:30
Laksh Singla	dc8d2192c3	Introduce natural comparator for types that don't have a StringComparator (#15145 ) Fixes a bug when executing queries with the ordering of arrays	2023-10-16 10:37:32 +05:30
Zoltan Haindrich	6d62c75866	Fix columns with null values in windowing expressions (#15131 )	2023-10-13 10:42:45 -04:00
Clint Wylie	a0fd9ec55c	fix issue with SQL boolean constants not respecting nulls when strict booleans and sql compatible null handling are enabled (#15135 )	2023-10-12 01:23:24 -07:00
Clint Wylie	d0f64608eb	sql compatible three-valued logic native filters (#15058 ) * sql compatible tri-state native logical filters when druid.expressions.useStrictBooleans=true and druid.generic.useDefaultValueForNull=false, and new druid.generic.useThreeValueLogicForNativeFilters=true * log.warn if non-default configurations are used to guide operators towards SQL compliant behavior	2023-10-12 00:06:23 -07:00
Zoltan Haindrich	ae88f2c0b6	Fix non-sqlcompat validation in CalciteWindowQueryTest (#15086 ) * fixes * check for latest rewrite place * Revert "check for latest rewrite place" This reverts commit `5cf1e2c1ca`. * some stuff (cherry picked from commit ab346d4373ea888eb8ef6115e018e7fb0d27407f) * update test output * updates to test outputs * some stuff * move validator * cleanup * fix * change test slightly * add apidoc cleanup warnings * cleanup/etc * instead of telling the story; add a fail with some reason whats the issue * lead-lag fix * add test * remove unnecessary throw * druidexception-trial * Revert "druidexception-trial" This reverts commit `8fa06644bc`. * undo changes to no_grouping; add no_grouping2 * add missing assert on resultcount * rename method; update * introduce enum/etc * make resultmatchmode accessible from TestBuilder#expectedResults * fix dump results to use log * fix * handle null correctly * disable feature type based things for MSQ * fix varianssqlaggtest * use eps in other test * fix intellij error * add final * address review * update test/string/etc * write concat in 3 lines :D	2023-10-11 12:34:31 -07:00
Vishesh Garg	c6ca990f1f	Rewrite EARLIEST/LATEST query operators to EARLIEST_BY/LATEST_BY (#15095 ) EARLIEST and LATEST operators implicitly reference the __time column for calculation of the aggregate value. Since the reference isn't explicit, Calcite sometimes fails to update the __time column name when there's column renaming --such as in the case of nested queries -- resulting in column not found errors. This change rewrites these operators to EARLIEST_BY and LATEST_BY during query processing to make the reference explicit to Calcite.	2023-10-11 19:48:36 +05:30
Laksh Singla	5f86072456	Prepare master for Druid 29 (#15121 ) Prepare master for Druid 29	2023-10-11 10:33:45 +05:30
Zoltan Haindrich	23605c1edd	Enable resultset validation of Drill tests (#15096 ) - introduces a test_X method for every testcase (995 testcases) - added a resultset parser which reads the expected resultset based on the result schema - loaded a few more datasets - added a testcase to ensure that all files have a corresponding testcase - renamed DecoupledIgnore to NegativeTest - categorized the failing 268 tests	2023-10-10 14:40:50 +05:30
Clint Wylie	1fc8fb1b20	add a bunch of tests with array typed columns to CalciteArraysQueryTest (#15101 ) * add a bunch of tests with array typed columns to CalciteArraysQueryTest * fix a bug with unnest filter pushdown when filtering on unnested array columns	2023-10-09 06:16:06 -07:00
Laksh Singla	549ef56288	UNION ALLs in MSQ (#14981 ) MSQ now supports UNION ALL with UnionDataSource	2023-10-09 18:18:15 +05:30
Zoltan Haindrich	b5a87fd89b	Support constant args in window functions (#15071 ) Instead of passing the constants around in a new parameter; InputAccessor was introduced to take care of transparently handling the constants - this new class started picking up some copy-paste debris around field accesses; and made them a little bit more readable.	2023-10-08 12:14:25 +05:30
Zoltan Haindrich	7b869fd37a	Change type of AVG aggregates to double (#15089 ) The sql standard is not very restrictive regarding this: "If AVG is specified and DT is exact numeric, then the declared type of the result is an implementation-defined exact numeric type with precision not less than the precision of DT and scale not less than the scale of DT." So using the same type is also ok (without patch); however, the avg of 0 and 1 is 0 right now because of the retention of the integer type. Postgres, MySql, Oracle and Drill seem to increase precision; mssql returns 0. http://sqlfiddle.com/#!9/6f7248/1 I think we should also increase precision as it's already calculated more precisely	2023-10-07 18:01:09 +05:30
Soumyava	57ab8e13dc	Updating plans when using joins with unnest on the left (#15075 ) * Updating plans when using joins with unnest on the left * Correcting segment map function for hashJoin * The changes done here are not reflected into MSQ yet so these tests might not run in MSQ * native tests * Self joins with unnest data source * Making this pass * Addressing comments by adding explanation and new test	2023-10-06 19:23:12 -07:00
Soumyava	1a06ef5a24	Fixing old function used (#15099 )	2023-10-05 17:25:00 -07:00
Pranav	06c5527c85	Allow aliasing of Macros and add new alias for complex decode 64 (#15034 ) * Add AliasExprMacro to allow aliasing of native expression macros * Add decode_base64_complex alias for complex_decode_base64	2023-10-05 16:24:36 -07:00
Zoltan Haindrich	36d7b3cc65	Add CalciteSysQueryTest to enable some testing of bindable plans. (#15070 )	2023-10-05 11:37:49 -07:00
Clint Wylie	b4bc9b6950	fix issue with auto columns with mix of scalar values and empty arrays (#15083 )	2023-10-05 10:15:45 +05:30
Laksh Singla	b8d03d36b0	Free up the resources when materializing the results as Frames (#15032 ) Refactor the code to clean up the result sequences when materializing the results as Frames	2023-10-05 10:14:27 +05:30
Laksh Singla	30cf76db99	Field writers for numerical arrays (#14900 ) Row-based frames, and by extension, MSQ now supports numeric array types. This means that all queries consuming or producing arrays would also work with MSQ. Numeric arrays can also be ingested via MSQ. Post this patch, queries like, SELECT [1, 2] would work with MSQ since they consume a numeric array, instead of failing with an unsupported column type exception.	2023-10-04 23:16:47 +05:30
Zoltan Haindrich	90e4b25620	Fix lead/lag to be usable without offset (#15057 )	2023-10-04 17:38:46 +05:30
Zoltan Haindrich	3342e03ea8	Windowing processing may have run into Exceptions when the whole table was processed (#15064 ) Earlier when the query was processing the whole table; the planning may have ended with an NPE; as it was not possible to create a scanquery from it.	2023-10-04 11:27:11 +05:30
Xavier Léauté	adef2069b1	Make unit tests pass with Java 21 (#15014 ) This change updates dependencies as needed and fixes tests to remove code incompatible with Java 21. As a result all unit tests now pass with Java 21. * update maven-shade-plugin to 3.5.0 and follow-up to #15042 * explain why we need to override configuration when specifying outputFile * remove configuration from dependency management in favor of explicit overrides in each module. * update mockito to 5.5.0 for Java 21 support when running with Java 11+ * continue using latest mockito 4.x (4.11.0) when running with Java 8 * remove need to mock private fields * exclude incorrectly declared mockito dependency from pac4j-oidc * remove mocking of ByteBuffer, since sealed classes can no longer be mocked in Java 21 * add JVM options workaround for system-rules junit plugin not supporting Java 18+ * exclude older versions of byte-buddy from assertj-core * fix for Java 19 changes in floating point string representation * fix missing InitializedNullHandlingTest * update easymock to 5.2.0 for Java 21 compatibility * update animal-sniffer-plugin to 1.23 * update nl.jqno.equalsverifier to 3.15.1 * update exec-maven-plugin to 3.1.0	2023-10-03 22:41:21 -07:00
Soumyava	cb050282a0	Intervals are updated properly for Unnest queries (#15020 ) Fixes a bug where the unnest queries were not updated with the correct intervals.	2023-10-04 02:52:10 +05:30
Zoltan Haindrich	f3d1c8b70e	Enable back testcases in CalciteWindowQueryTest (#15045 ) Most of the testcases were disabled in CalciteWindowQueryTest during the Calcite-1.35 upgrade; there were some changes arising from the fact that the removal of DRUID_SUM had some unexpected side effects: SqlStdOperatorTable.SUM became the SUM operator; because of that, SqlToRelConverter started rewriting windowed SUMs into SUM0s. My opinion is that w.r.t. Druid this rewrite provides no real advantage, as SUM0 is serviced by SUM here. I believe that's not 100% correct in cases when it aggregates just nulls, but that doesn't matter in this case. I propose to introduce back a local DRUID_SUM thing as an unchanged SUM; later, when CALCITE-6020 is fixed, we can drop that.	2023-10-03 10:18:44 +05:30
Soumyava	261f54dc04	coalesce on unnest row mismatch fix (#15019 ) * coalesce on unnest row mismatch fix * new example with coalesce over unnest with nested array columns * New example with change in order which triggers the nvl * new test plan update for useDefault=true	2023-10-02 17:26:50 -07:00
Pranav	f1edd671fb	Exposing optional replaceMissingValueWith in lookup function and macros (#14956 ) * Exposing optional replaceMissingValueWith in lookup function and macros * args range validation * Updating docs * Addressing comments * Update docs/querying/sql-scalar.md Co-authored-by: Clint Wylie <cjwylie@gmail.com> * Update docs/querying/sql-functions.md Co-authored-by: Clint Wylie <cjwylie@gmail.com> * Addressing comments --------- Co-authored-by: Clint Wylie <cjwylie@gmail.com>	2023-10-02 17:09:23 -07:00
Zoltan Haindrich	2785e062d7	Correct quotation in drill query files (#15044 )	2023-10-02 08:17:15 -07:00
Pranav	07c28f17ca	Fix missing format strings in calls to DruidException.build (#15056 ) * Fix the NPE bug in nonStrictFormat * using non null format string * using Assert.assertThrows	2023-09-29 17:00:36 -07:00
Zoltan Haindrich	db71e28808	Enable SortProjectTransposeRule (#15002 ) Contains "Enable already passing tests in DecoupledPlanningCalciteQueryTest" (#14996); enables a transpose rule to support a query plan in which the plan was in the shape: Sort Project Aggregate	2023-09-29 10:49:03 +05:30
Zoltan Haindrich	022950a0c5	MV_FILTER_ONLY may run into Exceptions in case duplicate values were processed (#15012 )	2023-09-27 19:19:42 +05:30
Gian Merlino	3dabfead05	Fix getResultType for HLL, quantiles aggregators. (#15043 ) The aggregators had incorrect types for getResultType when shouldFinalize is false. They had the finalized type, but they should have had the intermediate type. Also includes a refactor of how ExprMacroTable is handled in tests, to make it easier to add tests for this to the MSQ module. The bug was originally noticed because the incorrect result types caused MSQ queries with DS_HLL to behave erratically.	2023-09-27 08:51:14 +05:30
Soumyava	75af741a96	Revert "SQL: Plan non-equijoin conditions as cross join followed by filter. (#14978 )" (#15029 ) This reverts commit `4f498e6469`.	2023-09-25 11:35:44 -07:00
Gian Merlino	0850e615b2	Remove istrue, isfalse vectorized impls. (#14991 ) These were added in #14977, but the implementations are incorrect, because they return null when the input arg is null. They should return false when the input is null. Remove them for now, rather than fixing them, since they're so new that they might as well never have existed.	2023-09-25 11:34:24 +05:30
Soumyava	c184b5250f	Unnest now works on MSQ (#14886 ) This entails: Removing the enableUnnest flag and additional machinery; Updating the datasource plan and frame processors to support unnest; Adding support in MSQ for UnnestDataSource and FilteredDataSource; CalciteArrayTest now has a MSQ test component; Additional tests for Unnest on MSQ	2023-09-25 09:19:21 +05:30
Zoltan Haindrich	e76962f453	Use annotation to mark DecoupleIgnore (#15005 )	2023-09-21 12:36:52 +05:30
Laksh Singla	ebb794632a	Allow users with STATE permissions to read and write the state APIs for querying with deep storage (#14944 ) Currently, only the user who has submitted the async query has permission to interact with the status APIs for that async query. However, often we want an administrator to interact with these resources as well. Druid handles these with the STATE resource traditionally, and if the requesting user has necessary permissions on it as well, alternatively, they should be allowed to interact with the status APIs, irrespective of whether they are the submitter of the query.	2023-09-21 06:55:07 +05:30
Pranav	883c2692d2	Adding new function decode_base64_utf8 and expr macro (#14943 ) * Adding new function decode_base64_utf8 and expr macro * using BaseScalarUnivariateMacroFunctionExpr * Print stack trace in case of debug in ChainedExecutionQueryRunner * fix static check	2023-09-20 17:06:34 -07:00
Gian Merlino	823f620ede	Add IS [NOT] DISTINCT FROM to SQL and join matchers. (#14976 ) * Add IS [NOT] DISTINCT FROM to SQL and join matchers. Changes: 1) Add "isdistinctfrom" and "notdistinctfrom" native expressions. 2) Add "IS [NOT] DISTINCT FROM" to SQL. It uses the new native expressions when generating expressions, and is treated the same as equals and not-equals when generating native filters on literals. 3) Update join matchers to have an "includeNull" parameter that determines whether we are operating in "equals" mode or "is not distinct from" mode. * Main changes: - Add ARRAY handling to "notdistinctfrom" and "isdistinctfrom". - Include null in pushed-down filters when using "notdistinctfrom" in a join. Other changes: - Adjust join filter analyzer to more explicitly use InDimFilter's ValuesSets, relying less on remembering to get it right to avoid copies. * Remove unused "wrap" method. * Fixes. * Remove methods we do not need. * Fix bug with INPUT_REF.	2023-09-20 10:44:32 -07:00
Zoltan Haindrich	e8773f4d0f	Enable already passing tests in DecoupledPlanningCalciteQueryTest (#14996 )	2023-09-20 15:42:52 +05:30
Gian Merlino	4f498e6469	SQL: Plan non-equijoin conditions as cross join followed by filter. (#14978 ) * SQL: Plan non-equijoin conditions as cross join followed by filter. Druid has previously refused to execute joins with non-equality-based conditions. This was well-intentioned: the idea was to push people to write their queries in a different, hopefully more performant way. But as we're moving towards fuller SQL support, it makes more sense to allow these conditions to go through with the best plan we can come up with: a cross join followed by a filter. In some cases this will allow the query to run, and people will be happy with that. In other cases, it will run into resource limits during execution. But we should at least give the query a chance. This patch also updates the documentation to explain how people can tell whether their queries are being planned this way. * cartesian is a word. * Adjust tests. * Update docs/querying/datasource.md Co-authored-by: Benedict Jin <asdf2014@apache.org> --------- Co-authored-by: Benedict Jin <asdf2014@apache.org>	2023-09-19 10:23:42 -07:00
Soumyava	279b3818f0	Make Unnest work with nullif operator (#14993 ) This is due to the recursive filter creation in unnest storage adapter not performing correctly in case of empty children. This PR addresses the issue.	2023-09-15 09:54:14 +05:30
Gian Merlino	3ae5e97801	Add IS [NOT] TRUE, IS [NOT] FALSE native functions. (#14977 ) They are not quite the same as "x == true", "x != true", etc. These functions never return null, even when "x" itself is null.	2023-09-14 09:19:09 -07:00
Soumyava	7bbefd5741	Updating version in from.ftl (#14982 )	2023-09-14 05:11:36 +00:00
Soumyava	bf99d2c7b2	Fix for schema mismatch to go down using the non vectorize path till we update the vectorized aggs properly (#14924 ) * Fix for schema mismatch to go down using the non vectorize path till we update the vectorized aggs properly * Fixing a failed test * Updating numericNilAgg * Moving to use default values in case of nil agg * Adding the same for first agg * Fixing a test * fixing vectorized string agg for last/first with cast if numeric * Updating tests to remove mockito and cover the case of string first/last on non string columns * Updating a test to vectorize * Addressing review comments: Name change to NilVectorAggregator and using static variables now * fixing intellij inspections	2023-09-13 13:15:14 -07:00

847 Commits