druid

mirror of https://github.com/apache/druid.git synced 2025-02-28 06:19:13 +00:00

Author	SHA1	Message	Date
Katya Macedo	92e660dd21	Add Druid 30.0.0 upgrade notes (#16522 )	2024-05-31 13:23:22 -07:00
Atul Mohan	b53d75758f	IcebergInputSource : Add option to toggle case sensitivity while reading columns from iceberg catalog (#16496 ) * Toggle case sensitivity while reading columns from iceberg * Fix tests * Drop case check and set unconditionally	2024-05-31 10:18:52 -07:00
George Shiqi Wu	0936798122	Add limit to task payload size (#16512 ) * Add limit to task payload size * Change to a warning * Remove test * Fix unit tests * Optionally throw alert * PR comments * Update indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskQueue.java Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com> * PR comments * Reject large payloads * Update docs/configuration/index.md Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com> * Update indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskQueue.java Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com> --------- Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>	2024-05-31 09:17:36 -07:00
Kashif Faraz	b5b900b6a0	Do minor cleanup of AutoCompactionSnapshot.Builder (#16523 ) Changes: - Use `final` modifier for immutable - Use builder methods for chaining - Shorter lambda syntax	2024-05-31 16:06:53 +05:30
Jill Osborne	3c72ec8413	docs: Migration guide for subquery limit (#16519 ) Adds a migration guide for Druid 30 to help users understand the new byte-based subquery limit property maxSubqueryBytes	2024-05-31 09:26:07 +05:30
Charles Smith	92e565e3b8	Adds a migration guide overview page to the release-info section (#16506 ) Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com> Co-authored-by: Katya Macedo <katya.macedo@imply.io>	2024-05-30 09:50:30 -07:00
Adithya Chakilam	a9044ac235	Add cgroup cpu/mem/disk usage metrics (#16472 ) * Add cgroup cpu/mem usage metrics * checks * comments * docs fix * add disk metrics * fapi check * checkstyle * issues * spelling * change asserts * checks * use proc builder instead of runtime * specify charset * spotbug	2024-05-29 12:44:37 -07:00
Abhishek Radhakrishnan	75937c98e8	Upgrade delta kernel from 3.1.0 to 3.2.0 (#16513 ) Upstream release: https://github.com/delta-io/delta/releases/tag/v3.2.0 - Upgrade kernel dependency to 3.2.0 - Notable breaking changes introduced in upstream that affects the Druid extension: - Rename TableClient -> Engine - Rename DefaultTableClient -> DefaultEngine - Exceptions moved to a separate package - Table.getPath() doesn't throw TableNotFoundException. Instead the exception is thrown when getting snapshot info from the Table object	2024-05-29 10:46:30 -07:00
George Shiqi Wu	b3b62ac431	Update azure input source docs (#16508 ) Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>	2024-05-29 10:00:46 -07:00
Sree Charan Manamala	6bbf9613f8	Throw soft exception in case of empty signature while building Scan Query (#16502 )	2024-05-29 09:41:54 +02:00
Sree Charan Manamala	27cfe12f4a	Enable reordering of window operators (#16482 ) This commit aims to enable the re-ordering of window operators in order to optimise the sort and partition operators. Example : ``` SELECT m1, m2, SUM(m1) OVER(PARTITION BY m2) as sum1, SUM(m2) OVER() as sum2 from numFoo GROUP BY m1,m2 ``` In order to compute this query, we can order the operators as to first compute the operators corresponding to sum2 and then place the operators corresponding to sum1 which would help us in reducing one sort operator if we order our operators by sum1 and then sum2.	2024-05-29 12:17:12 +05:30
George Shiqi Wu	f7013e012c	Add new test for handoff API (#16492 ) * Add new test for handoff API * Add new method * fix test * Update test	2024-05-28 12:57:51 -07:00
Adarsh Sanjeev	21f725f33e	Add octet streaming of sketches in MSQ (#16269 ) There are a few issues with using Jackson serialization in sending datasketches between controller and worker in MSQ. This caused a blowup because multiple copies of the sketch were held. This PR aims to resolve this by switching to deserializing the sketch payload without Jackson. The PR adds a new query parameter used during communication between controller and worker while fetching sketches, "sketchEncoding". If the value of this parameter is OCTET, the sketch is returned as a binary encoding, done by ClusterByStatisticsSnapshotSerde. If the value is not the above, the sketch is encoded by Jackson as before.	2024-05-28 18:12:38 +05:30
Kashif Faraz	9d77ef04f4	Cleanup usages of stopwatch (#16478 ) Changes: - Remove synchronized methods from `Stopwatch` - Access stopwatch methods in `ChangeRequestHttpSyncer` inside a lock	2024-05-27 23:08:46 +05:30
Clint Wylie	4e1de50e30	fix issue with auto column grouping (#16489 ) * fix issue with auto column grouping changes: * fixes bug where AutoTypeColumnIndexer reports incorrect cardinality, allowing it to incorrectly use array grouper algorithm for realtime queries producing incorrect results for strings * fixes bug where auto LONG and DOUBLE type columns incorrectly report not having null values, resulting in incorrect null handling when grouping * fix test	2024-05-27 11:18:17 +05:30
Sensor	6bc29534a7	[Web Console] Datasource page support search datasource by keyword (#16371 ) * Frontend segment_timeline support filter by datasource * add dependency * fix eslint issues * resolve code style issue, update snapshot * fix comment * update licence * update package-lock.json * update snapshot * Update segment-timeline.tsx * Update segment-timeline.tsx	2024-05-24 11:54:26 -07:00
zachjsh	b0cc1ee84b	Add ability to turn off Druid Catalog specific validation done on catalog defined tables in Druid (#16465 ) * * add property to enable / disable catalog validation and add tests * * add integration tests for catalog validation disabled * * add integration tests * * remove debugging logs * * fix forbidden api call	2024-05-23 13:19:51 -04:00
Pranav	204a25d3e6	Moving object contains to Bound for string/object matchers (#16241 )	2024-05-23 16:56:04 +02:00
Zoltan Haindrich	12f79acc7e	Enable quidem shadowing for decoupled testcases (#16431 ) * Altered `QueryTestBuilder` to be able to switch to a backing quidem test * added a small crc to ensure that the shadow testcase does not deviate from the original one * Packaged all decoupled related things into a single `DecoupledExtension` to reduce copy-paste * `DecoupledTestConfig#quidemReason` must describe why it's being used * `DecoupledTestConfig#separateDefaultModeTest` can be used to make multiple case files based on `NullHandling` state * fixed a cosmetic bug during decoupled join translation * enhanced `!druidPlan` to report the final logical plan in non-decoupled mode as well * add check to ensure that only supported params are present in a druidtest uri * enabled shadow testcases for previously disabled testcases	2024-05-23 07:03:16 +02:00
Vadim Ogievetsky	10ea88e5bf	Web console: more robust durable storage setting detection (#16493 ) * more robust durable storage setting * add test	2024-05-22 15:47:20 -07:00
Gian Merlino	eb410f712d	Use typecasting comparator for numeric "any" aggregations. (#16494 ) This brings them in line with the behavior of other numeric aggregations. It is important because otherwise ClassCastExceptions can arise if comparing different numeric types that may arise from deserialization.	2024-05-22 12:38:51 -07:00
Zoltan Haindrich	44ea4e1c51	Fix cds-coordinator-metadata-query-disabled (#16488 ) fixes the issue with the newly enabled `cds-coordinator-metadata-query-disabled` [split](https://github.com/apache/druid/pull/16468) * configures to use `prepopulated-data` environment things to configure `S3` for access * this is needed because these tests use a [dataset which is loaded from s3](https://github.com/apache/druid/blob/master/integration-tests/docker/test-data/cds-coordinator-metadata-query-disabled-sample-data.sql) * also undoes the previous [fix](https://github.com/apache/druid/pull/16469) of setting the aws region explicitly as this is a more complete solution - and configuring `prepopulated-data` also sets the region; so that's not needed anymore	2024-05-22 20:42:11 +02:00
Vadim Ogievetsky	0ab3b34117	Web console: enable copy data as inline SQL (via VALUES) (#16458 ) * copy as values * address NULL issue * add description * extend test * fix json * more types * fix braces with nulls * fix test * update functions to scan	2024-05-22 08:33:07 -07:00
dependabot[bot]	80db8cd93b	Bump org.openrewrite.maven:rewrite-maven-plugin from 5.27.0 to 5.31.0 (#16477 )	2024-05-21 09:47:05 +02:00
Zoltan Haindrich	c948201507	Fix cds-task-schema-publish-disabled (#16469 ) set AWS_REGION=us-west-2 to avoid retries	2024-05-21 12:18:30 +05:30
Rishabh Singh	28473e7c4d	Use correct IT group name for the group `cds-coordinator-metadata-query-disabled` in GHA (#16468 ) * Fix build * Use the correct IT test group name in gha * update	2024-05-21 11:30:23 +05:30
Kashif Faraz	15d27f340d	Fix fetch of task location in SpecificTaskServiceLocator (#16462 ) * Fix fetch of task location in SpecificTaskServiceLocator * Resolve future if exception occurs while invoking API * Remove unused import	2024-05-20 12:35:04 +05:30
Vadim Ogievetsky	a124c6cbbd	fix typo in extension name (#16466 )	2024-05-20 09:47:22 +08:00
Gian Merlino	599586bcfc	Add SQL DIV function. (#16464 ) * Add SQL DIV function. This function has been documented for some time, but lacked a binding, so it wasn't usable. * Add a case with two expression inputs.	2024-05-17 11:11:32 -07:00
zachjsh	dd5dc500ce	Catalog integration tests (#16424 ) * * add new catalog IT with failure to ensure that it is run in CI * * actually add failing test referred to and fix checkstyle * * add some tests * * fix checkstyle * * add test descriptions * * add more tests	2024-05-17 11:49:09 -04:00
George Shiqi Wu	ed9881df88	Cleanup logic from handoff API (#16457 ) * Cleanup logic from handoff API * Fix test * Fix checkstyle * Update docs	2024-05-16 08:42:44 -07:00
Vadim Ogievetsky	435b58f101	Web console: fix Druid doctor check to accept Java 17 (#16250 ) * fix Druid doctor check * fix doc link * Update web-console/src/dialogs/doctor-dialog/doctor-checks.tsx Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com> --------- Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>	2024-05-15 20:37:15 -07:00
Gian Merlino	0fb09445a5	Fix ExpressionPredicateIndexSupplier numeric replace-with-default behavior. (#16448 ) * Fix ExpressionPredicateIndexSupplier numeric replace-with-default behavior. In replace-with-default mode, null numeric values from the index should be interpreted as zeroes by expressions. This makes the index supplier more consistent with the behavior of the selectors created by the expression virtual column. * Fix test case.	2024-05-15 15:11:47 +05:30
Vadim Ogievetsky	c419ae5f73	use objectGlob (#16452 ) Catching up to a change introduced in #13027	2024-05-15 15:11:11 +05:30
Akshat Jain	ddfd62d9a9	Disable loading lookups by default in CompactionTask (#16420 ) This PR updates CompactionTask to not load any lookups by default, unless transformSpec is present. If transformSpec is present, we will make the decision based on context values, loading all lookups by default. This is done to ensure backward compatibility since transformSpec can reference lookups. If transformSpec is not present and no context value is passed, we do not load any lookups. This behavior can be overridden by supplying lookupLoadingMode and lookupsToLoad in the task context.	2024-05-15 11:39:23 +05:30
kaisun2000	91cd07d892	Add logging to reveal reason to persist the hydrants (#16409 )	2024-05-15 08:39:29 +05:30
Codegass	621525a5cb	Refactor: Clean up `DecimalParquetInputTest` using Assume (#16436 )	2024-05-14 21:13:07 +05:30
Gian Merlino	72432c2e78	Speed up SQL IN using SCALAR_IN_ARRAY. (#16388 ) * Speed up SQL IN using SCALAR_IN_ARRAY. Main changes: 1) DruidSqlValidator now includes a rewrite of IN to SCALAR_IN_ARRAY, when the size of the IN is above inFunctionThreshold. The default value of inFunctionThreshold is 100. Users can restore the prior behavior by setting it to Integer.MAX_VALUE. 2) SearchOperatorConversion now generates SCALAR_IN_ARRAY when converting to a regular expression, when the size of the SEARCH is above inFunctionExprThreshold. The default value of inFunctionExprThreshold is 2. Users can restore the prior behavior by setting it to Integer.MAX_VALUE. 3) ReverseLookupRule generates SCALAR_IN_ARRAY if the set of reverse-looked-up values is greater than inFunctionThreshold. * Revert test. * Additional coverage. * Update docs/querying/sql-query-context.md Co-authored-by: Benedict Jin <asdf2014@apache.org> * New test. --------- Co-authored-by: Benedict Jin <asdf2014@apache.org>	2024-05-14 08:09:27 -07:00
George Shiqi Wu	c1bf4fed90	API for stopping streaming tasks early (#16310 ) * Try stopping task early * Fix checkstyle * Add unit test * Add a couple more tests * PR changes * Use notice * fix checkstyle * PR changes * Update indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisor.java Co-authored-by: Suneet Saldanha <suneet@apache.org> * Update indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisor.java Co-authored-by: Suneet Saldanha <suneet@apache.org> * Change payload * Remove quotes --------- Co-authored-by: Suneet Saldanha <suneet@apache.org>	2024-05-14 06:39:50 -07:00
Gian Merlino	cdf78ecccd	Fix IndexSpec in SqlBenchmark to use stringEncodingStrategy (#16336 )	2024-05-14 14:59:15 +05:30
Adarsh Sanjeev	18a4722d11	Resolve a bug where datasketches would not downsample sketches sufficiently (#16119 ) * Fix sketch memory issue * Rename function * Add unit test * Revert downsampling change	2024-05-14 10:23:57 +05:30
Sree Charan Manamala	b8dd7478d0	Custom Calcite Rule to remove redundant references (#16402 ) Custom calcite rule mimicking AggregateProjectMergeRule to extend support to expressions. The current calcite rule returns null in such cases. In addition, this removes the redundant references.	2024-05-14 06:38:05 +02:00
Vadim Ogievetsky	760e449875	Web console: Fix order-by-delta in explore view table (#16417 ) * change to using measure name * Implement order by delta * less paring, stricter types * safeDivide0 * fix no query * new DTQ allows parsing JSON_VALUE(...RETURNING...)	2024-05-13 19:03:46 -07:00
Akshat Jain	d1100a6f63	Add retries for building S3 client (#16438 ) * Add retries for building S3 client * Use S3Utils instead of RetryUtils * Add test	2024-05-13 16:32:06 -07:00
Laksh Singla	4bfc186153	Support sorting on complex columns in MSQ (#16322 ) MSQ sorts the columns in a highly specialized manner by byte comparisons. As such the values are serialized differently. This works well for the primitive types and primitive arrays, however complex types cannot be serialized specially. This PR adds the support for sorting the complex columns by deserializing the value from the field and comparing it via the type strategy. This is a lot slower than the byte comparisons, however, it's the only way to support sorting on complex columns that can have arbitrary serialization not optimized for MSQ. The primitives and the arrays are still compared via the byte comparison, therefore this doesn't affect the performance of the queries supported before the patch. If there's a sorting key with mixed complex and primitive/primitive array types, for example: longCol1 ASC, longCol2 ASC, complexCol1 DESC, complexCol2 DESC, stringCol1 DESC, longCol3 DESC, longCol4 ASC, the comparison will happen like: longCol1, longCol2 (ASC) - Compared together via byte-comparison, since both are byte comparable and need to be sorted in ascending order complexCol1 (DESC) - Compared via deserialization, cannot be clubbed with any other field complexCol2 (DESC) - Compared via deserialization, cannot be clubbed with any other field, even though the prior field was a complex column with the same order stringCol1, longCol3 (DESC) - Compared together via byte-comparison, since both are byte comparable and need to be sorted in descending order longCol4 (ASC) - Compared via byte-comparison, couldn't be coalesced with the previous fields as the direction was different This way, we only deserialize the field wherever required	2024-05-13 15:07:05 +05:30
Akshat Jain	bacdb4c48d	Update integration tests related documentation for better clarity (#16313 )	2024-05-13 11:27:21 +05:30
Sensor	1601a0f8f8	add ignore path (#16429 )	2024-05-11 17:54:52 +08:00
aho135	9459722ebf	Use canonical hostname instead of ip by default (#16386 ) Co-authored-by: Andrew Ho <a.ho@salesforce.com>	2024-05-11 17:53:22 +08:00
Alberic Liu	811dcd1726	update protobuf.md (#16434 )	2024-05-11 17:52:54 +08:00
Benedict Jin	cb7c2c1e37	Downgrade the version of Apache Curator from 5.5.0 to 5.3.0 to avoid a bug in the new version (#16425 )	2024-05-10 15:08:33 +05:30
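
The message for commit 4bfc186153 (#16322) describes how a mixed MSQ sort key is split into comparison runs: adjacent byte-comparable columns with the same direction are compared together, while each complex column is compared on its own via deserialization. The sketch below only illustrates that grouping rule in Python. `KeyColumn` and `group_sort_key` are hypothetical names, not Druid classes, and the sketch makes no claim about how Druid implements the comparison.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class KeyColumn:
    """One column of a sort key (hypothetical model, not a Druid type)."""
    name: str
    descending: bool
    byte_comparable: bool  # False for complex types


# A run is (column names, direction, comparison mode).
Run = Tuple[List[str], str, str]


def group_sort_key(columns: List[KeyColumn]) -> List[Run]:
    """Group a sort key into runs following the rule in the commit message."""
    runs: List[Run] = []
    for col in columns:
        direction = "DESC" if col.descending else "ASC"
        mode = "bytes" if col.byte_comparable else "deserialize"
        # Only byte-comparable columns may join the previous run, and only
        # if that run is also byte-comparable with the same direction.
        if (
            runs
            and mode == "bytes"
            and runs[-1][2] == "bytes"
            and runs[-1][1] == direction
        ):
            runs[-1][0].append(col.name)
        else:
            runs.append(([col.name], direction, mode))
    return runs


# The sort key used as the example in the commit message.
EXAMPLE_KEY = [
    KeyColumn("longCol1", descending=False, byte_comparable=True),
    KeyColumn("longCol2", descending=False, byte_comparable=True),
    KeyColumn("complexCol1", descending=True, byte_comparable=False),
    KeyColumn("complexCol2", descending=True, byte_comparable=False),
    KeyColumn("stringCol1", descending=True, byte_comparable=True),
    KeyColumn("longCol3", descending=True, byte_comparable=True),
    KeyColumn("longCol4", descending=False, byte_comparable=True),
]
```

Calling `group_sort_key(EXAMPLE_KEY)` yields five runs, matching the breakdown in the commit message: the two ascending long columns together, each complex column alone, the two descending byte-comparable columns together, and `longCol4` alone because its direction differs from the previous run.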


14056 Commits