druid

Commit Graph

Author	SHA1	Message	Date
Clint Wylie	fda8d2b7f3	fix debugging and running with intellij runConfiguration (#15115 )	2023-10-09 17:03:06 -07:00
Laksh Singla	95bf331c08	Rename the default setting of 'maxSubqueryBytes' from 'unlimited' to 'disabled' (#15108 ) The default setting of 'maxSubqueryBytes' is renamed from 'unlimited' to 'disabled'.	2023-10-10 02:03:29 +05:30
Abhishek Agarwal	90a1458ac9	Parse passwords containing colon correctly (#15109 )	2023-10-09 20:45:10 +05:30
Laksh Singla	b0edbc3d91	MSQ writes out string arrays instead of MVDs by default (#15093 ) MSQ uses the string dimension schema for ARRAY<STRING> typed columns, which creates MVDs instead of string arrays as required. Therefore someone trying to ingest columns of type ARRAY<STRING> from an external data source or another data source would get STRING columns in the newly generated segments. This patch changes the following: - Use auto dimension schema to ingest the ARRAY<STRING> columns, which will create columns with the desired type. - Add an undocumented flag ingestStringArraysAsMVDs to preserve the legacy behavior. Legacy behaviour is turned on by default. - Create MSQArraysInsertTest and refactor some of the tests in MSQInsertTest.	2023-10-09 20:31:07 +05:30
Laksh Singla	36edbce036	Fix compilation failure in master (#15111 ) Merging since it's a dev blocker.	2023-10-09 20:05:48 +05:30
Clint Wylie	1fc8fb1b20	add a bunch of tests with array typed columns to CalciteArraysQueryTest (#15101 ) * add a bunch of tests with array typed columns to CalciteArraysQueryTest * fix a bug with unnest filter pushdown when filtering on unnested array columns	2023-10-09 06:16:06 -07:00
Laksh Singla	549ef56288	UNION ALLs in MSQ (#14981 ) MSQ now supports UNION ALL with UnionDataSource	2023-10-09 18:18:15 +05:30
AmatyaAvadhanula	40a6dc4631	Optimize used segment fetching in Kill tasks (#15107 ) * Optimize used segment fetching in Kill tasks	2023-10-09 17:54:13 +05:30
Adarsh Sanjeev	7a35ce886d	Add ability for MSQ tasks to query realtime tasks (#15024 ) This PR aims to add the capabilities to: 1. Fetch the realtime segment metadata from the coordinator server view, 2. Adds the ability for workers to query indexers, similar to how brokers do the same for native queries.	2023-10-09 15:14:03 +05:30
kaisun2000	e2cc1c4ad1	Add metric -- count of queries waiting for merge buffers (#15025 ) Add 'mergeBuffer/pendingRequests' metric that exposes the count of waiting queries (threads) blocking in the merge buffers pools.	2023-10-09 12:56:23 +05:30
Gian Merlino	c483cb863d	Fix IndexerWorkerClient#fetchChannelData when response has data and error. (#15084 ) * Fix IndexerWorkerClient#fetchChannelData when response has data and error. When a channel data response from a worker includes some data and then some I/O error, then when the call is retried, we will re-read the set of data that was read by the previous connection and add it to the local channel again. This causes the local channel to become corrupted. The patch fixes this case by skipping data that has already been read.	2023-10-09 11:12:28 +05:30
Pranav	c7d0615af3	Fix the build for #15013.: Lookup jitter upstream build fix (#15103 ) Fix the build for #15013.	2023-10-09 09:35:39 +05:30
Zoltan Haindrich	b5a87fd89b	Support constant args in window functions (#15071 ) Instead of passing the constants around in a new parameter, InputAccessor was introduced to take care of transparently handling the constants. This new class also started picking up some copy-paste debris around field accesses, and made them a little bit more readable.	2023-10-08 12:14:25 +05:30
Zoltan Haindrich	7b869fd37a	Change type of AVG aggregates to double (#15089 ) The SQL standard is not very restrictive regarding this: "If AVG is specified and DT is exact numeric, then the declared type of the result is an implementation-defined exact numeric type with precision not less than the precision of DT and scale not less than the scale of DT." So using the same type is also OK (without the patch); however, the avg of 0 and 1 is 0 right now because of the retention of the integer type. Postgres, MySQL, Oracle, and Drill seem to increase precision; MSSQL returns 0: http://sqlfiddle.com/#!9/6f7248/1 I think we should also increase precision, as it's already calculated more precisely.	2023-10-07 18:01:09 +05:30
Soumyava	57ab8e13dc	Updating plans when using joins with unnest on the left (#15075 ) * Updating plans when using joins with unnest on the left * Correcting segment map function for hashJoin * The changes done here are not reflected into MSQ yet so these tests might not run in MSQ * native tests * Self joins with unnest data source * Making this pass * Addressing comments by adding explanation and new test	2023-10-06 19:23:12 -07:00
Xavier Léauté	f9439970c9	run build and unit tests using Java 21 (#15088 ) * run build and unit test using Java 21 * run static checks with Java 21 * use setup-java for unit tests, since Java 21 is not built-in * skip maven cache from setup-java * add comments to explain cache behavior	2023-10-06 12:45:07 -07:00
Soumyava	1a06ef5a24	Fixing old function used (#15099 )	2023-10-05 17:25:00 -07:00
Pranav	06c5527c85	Allow aliasing of Macros and add new alias for complex decode 64 (#15034 ) * Add AliasExprMacro to allow aliasing of native expression macros * Add decode_base64_complex alias for complex_decode_base64	2023-10-05 16:24:36 -07:00
Zoltan Haindrich	36d7b3cc65	Add CalciteSysQueryTest to enable some testing of bindable plans. (#15070 )	2023-10-05 11:37:49 -07:00
317brian	2164dafb99	docs: update unnest to use crossjoin instead of comma (#15074 )	2023-10-05 09:01:08 -07:00
Adarsh Sanjeev	7e987e3d69	Add query context parameter for segment load wait (#15076 ) Add segmentLoadWait as a query context parameter. If this is true, the controller queries the broker and waits till the segments created (if any) have been loaded by the load rules. The controller also provides this information in the live reports and task reports. If this is false, the controller exits immediately after finishing the query.	2023-10-05 18:26:34 +05:30
Laksh Singla	2c286d6f42	Fix monomorphic processing code running on JDK8 since it references a non-existing method (#15092 ) Code relying on monomorphic processing on JDK8 doesn't work correctly, since it tries to reference getArrayLength using method handles, which might have been accidentally removed here since it seems unused. This PR adds the method back as is.	2023-10-05 11:05:38 +05:30
Clint Wylie	b4bc9b6950	fix issue with auto columns with mix of scalar values and empty arrays (#15083 )	2023-10-05 10:15:45 +05:30
Laksh Singla	b8d03d36b0	Free up the resources when materializing the results as Frames (#15032 ) Refactor the code to clean up the result sequences when materializing the results as Frames	2023-10-05 10:14:27 +05:30
Clint Wylie	3afe09a19d	urlencode nested serializer temp file names so they dont explode stuff (#15068 ) Fixes a bug caused by #14919, which was just using the column name as part of a temp file name, which.. isn't very cool, my bad. Switched to use StringUtils.urlEncode so that ugly chars don't explode stuff. The modified test fails without the changes in this PR.	2023-10-05 10:13:45 +05:30
Laksh Singla	30cf76db99	Field writers for numerical arrays (#14900 ) Row-based frames, and by extension, MSQ now supports numeric array types. This means that all queries consuming or producing arrays would also work with MSQ. Numeric arrays can also be ingested via MSQ. Post this patch, queries like, SELECT [1, 2] would work with MSQ since they consume a numeric array, instead of failing with an unsupported column type exception.	2023-10-04 23:16:47 +05:30
317brian	88476e0e83	docs: add note about transparent_reconnection for Avatica (#15066 ) * add note about transparent_reconnection * Update docs/api-reference/sql-jdbc.md	2023-10-04 09:52:48 -07:00
Zoltan Haindrich	90e4b25620	Fix lead/lag to be usable without offset (#15057 )	2023-10-04 17:38:46 +05:30
Tejaswini Bandlamudi	c888ac5d61	fix path of druid service IT logs (#15082 )	2023-10-04 15:38:38 +05:30
Gian Merlino	a9021e4cd7	Fix NPE with lenient aggregators merging in segmentMetadata. (#15078 ) When merging analyses, lenient merging sets unmergeable aggregators to null. Merging such a null aggregator record into a nonnull record would potentially lead to NPE in getMergingFactory. The new code only calls getMergingFactory if both the old and new aggregators are nonnull; else, if either is null, then the merged aggregator is also set to null.	2023-10-04 02:41:41 -07:00
Clint Wylie	632811b285	fix json compat layer to not rewrite v4 into v5 after segment merging (#14997 )	2023-10-04 00:18:18 -07:00
Tejaswini Bandlamudi	28870c702a	Resolve reported CVEs (#15081 )	2023-10-04 11:59:01 +05:30
Gian Merlino	2ed4fd1ae3	Compute broadcast-join segmentMapFn only once per worker. (#15007 ) This patch introduces "processor managers" to processor factories, as a replacement for the sequence of processors. Processor managers can use the results of earlier processors to influence the creation of later processors, which provides us with the building block we need to ensure that broadcast join data is only read once. In particular, when broadcast join is happening, the BaseFrameProcessorFactory now uses a ChainedProcessorManager to first run BroadcastJoinSegmentMapFnProcessor (in a single thread), and then run all of the regular processors (possibly multithreaded).	2023-10-04 11:47:00 +05:30
Vishesh Garg	7e8f3e69ef	Avoid intermediate offsets in bucketStart calculation logic to handle DST transition (#15038 ) When moving timestamps by an offset using org.joda.time.chrono.ISOChronology library, if the new timestamp falls in Daylight Savings Time (DST) transition period, the library rounds it off to the nearest valid time. This can lead to incorrect final timestamp when calculated using intermediate offsets landing in DST transition, for e.g. +21D arrived at using +14D and +7D offset, where +14D lands in DST transition period. Since bucketStart values are calculated using this library, this behaviour can lead to incorrect bucketStart times.	2023-10-04 11:32:29 +05:30
Zoltan Haindrich	3342e03ea8	Windowing processing may have run into Exceptions when the whole table was processed (#15064 ) Earlier when the query was processing the whole table; the planning may have ended with a NPE; as it was not possible to create a scanquery from it.	2023-10-04 11:27:11 +05:30
Xavier Léauté	adef2069b1	Make unit tests pass with Java 21 (#15014 ) This change updates dependencies as needed and fixes tests to remove code incompatible with Java 21 As a result all unit tests now pass with Java 21. * update maven-shade-plugin to 3.5.0 and follow-up to #15042 * explain why we need to override configuration when specifying outputFile * remove configuration from dependency management in favor of explicit overrides in each module. * update to mockito to 5.5.0 for Java 21 support when running with Java 11+ * continue using latest mockito 4.x (4.11.0) when running with Java 8 * remove need to mock private fields * exclude incorrectly declared mockito dependency from pac4j-oidc * remove mocking of ByteBuffer, since sealed classes can no longer be mocked in Java 21 * add JVM options workaround for system-rules junit plugin not supporting Java 18+ * exclude older versions of byte-buddy from assertj-core * fix for Java 19 changes in floating point string representation * fix missing InitializedNullHandlingTest * update easymock to 5.2.0 for Java 21 compatibility * update animal-sniffer-plugin to 1.23 * update nl.jqno.equalsverifier to 3.15.1 * update exec-maven-plugin to 3.1.0	2023-10-03 22:41:21 -07:00
Soumyava	cb050282a0	Intervals are updated properly for Unnest queries (#15020 ) Fixes a bug where the unnest queries were not updated with the correct intervals.	2023-10-04 02:52:10 +05:30
George Shiqi Wu	64754b6799	Allow users to pass task payload via deep storage instead of environment variable (#14887 ) This change is meant to fix an issue where passing too large a task payload to the mm-less task runner causes the peon to fail to start up, because the payload is passed (compressed) as an environment variable (TASK_JSON). On Linux systems the limit for an environment variable is commonly 128KB; on Windows systems it is less than this. Setting an env variable longer than this results in a bunch of "Argument list too long" errors.	2023-10-03 14:08:59 +05:30
Zoltan Haindrich	f3d1c8b70e	Enable back testcases in CalciteWindowQueryTest (#15045 ) Most of the testcases were disabled in CalciteWindowQueryTest during the Calcite-1.35 upgrade; there were some changes arising from the fact that the removal of DRUID_SUM had some unexpected side effects: SqlStdOperatorTable.SUM became the SUM operator, and because of that SqlToRelConverter started rewriting windowed SUMs into SUM0s. My opinion is that, w.r.t. Druid, this rewrite provides no real advantage, as SUM0 is serviced by SUM here. I believe that's not 100% correct in cases when it aggregates just nulls, but that doesn't matter in this case. I propose to introduce back a local DRUID_SUM as an unchanged SUM; later, when CALCITE-6020 is fixed, we can drop it.	2023-10-03 10:18:44 +05:30
Soumyava	261f54dc04	coalesce on unnest row mismatch fix (#15019 ) * coalesce on unnest row mismatch fix * new example with coalesce over unnest with nested array columns * New example with change in order which triggers the nvl * new test plan update for useDefault=true	2023-10-02 17:26:50 -07:00
Pranav	f1edd671fb	Exposing optional replaceMissingValueWith in lookup function and macros (#14956 ) * Exposing optional replaceMissingValueWith in lookup function and macros * args range validation * Updating docs * Addressing comments * Update docs/querying/sql-scalar.md Co-authored-by: Clint Wylie <cjwylie@gmail.com> * Update docs/querying/sql-functions.md Co-authored-by: Clint Wylie <cjwylie@gmail.com> * Addressing comments --------- Co-authored-by: Clint Wylie <cjwylie@gmail.com>	2023-10-02 17:09:23 -07:00
Parth Agrawal	d038237ece	memcached cache: switch to AWS elasticache-java-cluster-client and add TLS support (#14827 ) This PR updates the library used for the Memcached client to the AWS ElastiCache client: https://github.com/awslabs/aws-elasticache-cluster-client-memcached-for-java This enables us to use the option of encrypting data in transit (see "Amazon ElastiCache for Memcached now supports encryption of data in transit"). For clusters running the Memcached engine, ElastiCache supports Auto Discovery: the ability for client programs to automatically identify all of the nodes in a cache cluster, and to initiate and maintain connections to all of these nodes (see "Benefits of Auto Discovery - Amazon ElastiCache"). AWS has forked spymemcached 2.12.1, and has since added all the patches included in 2.12.2 and 2.12.3 as part of the 1.2.0 release, so this can now be considered an equivalent drop-in replacement (see "GitHub - awslabs/aws-elasticache-cluster-client-memcached-for-java: Amazon ElastiCache Cluster Client for Java - enhanced library to connect to ElastiCache clusters"). Client Javadoc: https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/elasticache/AmazonElastiCacheClient.html#AmazonElastiCacheClient-- How to enable TLS with ElastiCache: on the client side, see the GitHub repository linked above; on the server side, see https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/in-transit-encryption-mc.html#in-transit-encryption-enable-existing-mc	2023-10-02 12:51:05 -07:00
Zoltan Haindrich	2785e062d7	Correct quotation in drill query files (#15044 )	2023-10-02 08:17:15 -07:00
Pranav	07c28f17ca	Fix missing format strings in calls to DruidException.build (#15056 ) * Fix the NPE bug in nonStrictFormat * using non null format string * using Assert.assertThrows	2023-09-29 17:00:36 -07:00
YongGang	86087cee0a	Fix Peon not fail gracefully (#14880 ) * fix Peon not fail gracefully * move methods to Task interface * fix checkstyle * extract to interface * check runThread nullability * fix merge conflict * minor refine * minor refine * fix unit test * increase latch waiting time	2023-09-29 12:39:59 -07:00
Karan Kumar	2f1bcd6717	Adding "segment/scan/active" metric for processing thread pool. (#15060 )	2023-09-29 12:34:28 -07:00
Rishabh Singh	ebb9724c26	Pass jvm option to write heap dump on out of memory (#15053 )	2023-09-29 17:54:53 +05:30
Yuanli Han	9a4433bbad	Fix invalid segment path when using hdfs as the intermediate deepstore (#14984 ) This PR fixes the invalid segment path when enabling druid_processing_intermediaryData_storage_type: "deepstore" and using hdfs as the deep store.	2023-09-29 12:53:46 +05:30
Zoltan Haindrich	db71e28808	Enable SortProjectTransposeRule (#15002 ) contains Enable already passing tests in DecoupledPlanningCalciteQueryTest #14996 enables a transpose rule to support a query plan in which the plan was in the shape: Sort Project Aggregate	2023-09-29 10:49:03 +05:30
Zoltan Haindrich	5f3b310115	Build reliability fixes (#15048 ) * disable parallel builds; enable batch mode to get rid of transfer progress * restore .m2 from setup-java if not found * some change to sql * add ws * fix quote * fix quote * undo querytest change * nullhandling in mvtest * init more * skip commitid plugin * add-back 1.0C to build ; remove redundant skip-s from copy-resources; add comment	2023-09-28 12:27:52 -07:00

13425 Commits

All Branches