druid

Commit Graph

Author	SHA1	Message	Date
Benedict Jin	0f38a98368	Update the link of Helm Chart to avoid 404 error (#15905 )	2024-02-21 16:57:41 +08:00
Suneet Saldanha	cbc53d53b4	Update k8sTaskRunner log message (#15871 )	2024-02-21 14:34:00 +08:00
Gian Merlino	e20004c7df	Remove helm chart. (#15904 ) The helm chart was originally moved here in #11163 from https://github.com/helm/charts/tree/master/incubator/druid after the helm/charts repository was deprecated. However, it has been excluded from releases since then, due to uncertainty around whether we need IP clearance. We have not had volunteers willing to sort this out, so this patch removes the code. It can be re-added if a volunteer is available to sort out the IP clearance process. See thread at: https://lists.apache.org/thread/ygyzt23m06vc775nq5dsm349rf0j47dg	2024-02-21 14:21:37 +08:00
Laksh Singla	a1b2c7326e	Numeric array support for columnar frames (#15917 ) Columnar frames used in subquery materialization and window functions now support numeric arrays.	2024-02-21 11:32:33 +05:30
George Shiqi Wu	2c0d1128f8	Fix pod template reading logic (#15915 ) * Fix pod template reading * PR changes * Fix unit tests	2024-02-20 11:13:51 -05:00
Adarsh Sanjeev	9eaaeb5c16	Add security ITs to the revised integration tests (#15885 ) * Add IT for security * Add admin client * Clean up code * Clean up code * Address review comments	2024-02-20 11:32:08 +05:30
Gian Merlino	9c41827dba	Globally disable AUTO_CLOSE_JSON_CONTENT. (#15880 ) * Globally disable AUTO_CLOSE_JSON_CONTENT. This JsonGenerator feature is on by default. It causes problems with code like this: try (JsonGenerator jg = ...) { jg.writeStartArray(); for (x : xs) { jg.writeObject(x); } jg.writeEndArray(); } If a jg.writeObject call fails due to some problem with the data it's reading, the JsonGenerator will write the end array marker automatically when closed as part of the try-with-resources. If the generator is writing to a stream where the reader does not have some other mechanism to realize that an exception was thrown, this leads the reader to believe that the array is complete when it actually isn't. Prior to this patch, we disabled AUTO_CLOSE_JSON_CONTENT for JSON-wrapped SQL result formats in #11685, which fixed an issue where such results could be erroneously interpreted as complete. This patch fixes a similar issue with task reports, and all similar issues that may exist elsewhere, by disabling the feature globally. * Update test.	2024-02-16 08:52:48 -08:00
Clint Wylie	fe2ba8cc28	fix return type inference of parse_long, which can also be null if string is not parseable into a long (#15909 ) * fix return type inference of parse_long, which can also be null if string is not parseable into a long * fix msq test	2024-02-15 08:45:34 -08:00
Vadim Ogievetsky	66f54f2066	allow compaction config slots to drop to 0 (#15877 )	2024-02-15 15:27:15 +08:00
Parth Agrawal	495e66f2e7	CVE Fix: Update json-path version (#15772 ) Apache Druid brings the dependency json-path which is affected by CVE-2023-51074. Its latest version 2.9.0 fixes the above CVE. Append function has been added to json-path and so the unit test to check for the append function not present has been updated. --------- Co-authored-by: Xavier Léauté <xvrl@apache.org>	2024-02-14 20:58:27 -08:00
Tom	f224035c7e	Fix Flakiness in KafkaEmitterTest (#15907 ) * thrust of the fix to allow for the json values to be out of order The existing problem is that toMap doesn't turn some values into json primitive values; for example, segmentMetadata just has DateTime objects for its time in the EventMap, but Alert event converts those into strings when calling toMap. This creates an issue because when we check the emitted events, the mapper deserializing the string value for dateTime leaves it as a string in the EventMap. So the question is whether we alter the events' toMap() to return string/map versions of objects, or make the expected events do a round trip of eventMap -> string -> eventMap to turn everything into json primitives * fix issue by making toMap events convert Objects into strings, or maps * fix linting errors * use method of using mapper to round trip expected data to make it have same type as those of the events emitted * remove unnecessary comment	2024-02-15 10:01:55 +05:30
317brian	c98d54f3c4	docs: delete unused file that causes confusion (#15910 )	2024-02-14 16:42:02 -08:00
YongGang	19ed5c863f	Enhance rolling Supervisor restarts at taskDuration (#15859 )	2024-02-14 15:44:34 -08:00
Abhishek Radhakrishnan	c324e37751	Add javadocs to `KafkaEmitterTest` & fix flaky test (#15898 ) * Address review comment: add test javadocs * Fix flaky assertion failure. Use ConcurrentHashMap instead of HashMap because the producer callback can trigger concurrently and override the map initialization. * fixup intellij inspection	2024-02-14 11:52:06 -08:00
Sam Rash	be0ee2ee33	update version check for profiling to >= 17 (#15686 )	2024-02-14 21:44:20 +05:30
Peter Marshall	cae9cbd7d7	Update tasks.md (#15887 ) Remove erroneous white space causing render issues on this page.	2024-02-13 05:20:09 -08:00
Clint Wylie	dad8398a4d	start process of deprecating non-sql compatible legacy configurations (#15713 ) Starts the process of officially deprecating non-SQL-compatible modes by updating docs to aggressively call out that Druid's non-SQL-compliant modes are deprecated and will go away someday. There are no code or behavior changes in this PR.	2024-02-13 15:31:45 +05:30
Tom	c225c19f81	fix copy paste issue in earlier PR (#15890 )	2024-02-12 19:49:19 -05:00
Gian Merlino	0f6a895372	Rework ExprMacro base classes to simplify implementations. (#15622 ) * Rework ExprMacro base classes to simplify implementations. This patch removes BaseScalarUnivariateMacroFunctionExpr, adds BaseMacroFunctionExpr at the top of the hierarchy (a suitable base class for ExprMacros that take either arrays or scalars), and adds an implementation for "visit" to BaseMacroFunctionExpr. The effect on implementations is generally cleaner code: - Exprs no longer need to implement "visit". - Exprs no longer need to implement "stringify", even if they don't use all of their args at runtime, because BaseMacroFunctionExpr has access to even unused args. - Exprs that accept arrays can extend BaseMacroFunctionExpr and inherit a bunch of useful methods. The only one they need to implement themselves that scalar exprs don't is "supplyAnalyzeInputs". * Make StringDecodeBase64UTFExpression a static class. * Remove unused import. * Formatting, annotation changes.	2024-02-12 15:50:45 -08:00
Katya Macedo	0f29ece6a9	[Docs] Refactor streaming ingestion section (#15591 ) Merging the work so far. @ektravel , @vogievetsky if there are additional improvements, let's track them & make another pr. * Refactor streaming ingestion docs * Update property definition * Update after review * Update known issues * Move kinesis and kafka topics to ingestion, add redirects * Saving changes * Saving * Add input format text * Update after review * Minor text edit * Update example syntax * Revert back to colon * Fix merge conflicts * Fix broken links * Fix spelling error	2024-02-12 13:52:42 -08:00
Charles Smith	2a42b11660	remove legacy Jupyter tutorial files (#15834 ) * remove legacy files * redirection for the jupyter tutorial page * remove tutorial from sidebar * remove redirection	2024-02-12 13:45:47 -08:00
Abhishek Radhakrishnan	51fd79ee58	Clean up kafka emitter tests, add more validations and code coverage. (#15878 ) * Clean up kafka emitter tests a bit and add more validations. The test wasn't validating what events were sent, but simply the dropped counters, which aren't that useful. Additionally, this module has fewer tests, so folks often run into code coverage issue in this extension. Hopefully this change helps with that too. * Change things to feed-based rather than topic-based. * Another test for shared topic * Switch to DruidException, add test dependencies and sad path config tests. * missing test dependency * minor renames. * Add more tests - to test unknown events and drop when queue is full	2024-02-12 16:22:19 -05:00
Gian Merlino	7fea34abdd	LOOKUP docs: clarify behavior of replaceMissingValueWith. (#15879 ) Clarify behavior when expr is null.	2024-02-11 13:11:00 -08:00
zachjsh	f9ee2c353b	Extend the PARTITION BY clause to accept string literals for the time partitioning (#15836 ) This PR contains a portion of the changes from the inactive draft PR for integrating the catalog with the Calcite planner https://github.com/apache/druid/pull/13686 from @paul-rogers, extending the PARTITION BY clause to accept string literals for the time partitioning	2024-02-09 11:45:38 -05:00
Vishesh Garg	6e9eee4c5f	Add failure check (#15873 )	2024-02-09 08:27:10 -08:00
Lasse Mammen	4255711b3e	fix: handle BOOKMARK events in kubernetes pod discovery (#15819 )	2024-02-09 18:50:04 +05:30
Tom	11a8624ef1	allow for kafka-emitter to have extra dimensions be set for each event it emits (#15845 ) * allow for kafka-emitter to have extra dimensions be set for each event it emits * fix checkstyle issue in kafkaemitterconfig * make changes to fix docs, and cleanup copy paste error in #toString() * undo formatting to markdown table * add more branches so test passes * fix checkstyle issue	2024-02-08 22:55:24 -08:00
George Shiqi Wu	d703b2c709	Add azure kill test (#15833 ) * Add kill test * Extra line * Don't need toString * Add back test * Remove newline * move kill verification into main test	2024-02-08 16:15:30 -05:00
Sree Charan Manamala	57e12df352	Sql Single Value Aggregator for scalar queries (#15700 ) Executing single-value correlated queries throws an exception today because the single_value function is not available in Druid. The added classes give Druid the capability to plan and run such queries.	2024-02-08 19:20:30 +05:30
Soumyava	f3996b96ff	Fixes for safe_divide with vectorize and datatypes (#15839 ) * Fix for safe_divide with vectorize * More fixes * Update to use expr.eval(null) for both cases when denominator is 0	2024-02-08 14:40:42 +05:30
Abhishek Radhakrishnan	1a5b57df84	Update `groupId` for delta-lake and iceberg extensions (#15843 ) * Update the group id to org.apache.druid.extensions.contrib for contrib exts. * Note iceberg and delta lake extensions in extensions.md * properties and shell backticks * Update groupId in distribution/pom.xml * remove delta-lake from dist. * Add note on downloading extension.	2024-02-07 23:54:06 -08:00
Vadim Ogievetsky	26815d425b	Web console: add system fields UI (#15858 ) This PR adds console support for configuring system fields in the batch data loader.	2024-02-08 11:08:55 +05:30
Gian Merlino	21a97f4c61	Fix HllSketchHolderObjectStrategy#isSafeToConvertToNullSketch. (#15860 ) * Fix HllSketchHolderObjectStrategy#isSafeToConvertToNullSketch. The prior code from #15162 was reading only the low-order byte of an int representing the size of a coupon set. As a result, it would erroneously believe that a coupon set with a multiple of 256 elements was empty.	2024-02-08 08:14:28 +05:30
Adarsh Sanjeev	514b3b4d01	Add export capabilities to MSQ with SQL syntax (#15689 ) * Add test * Parser changes to support export statements * Fix builds * Address comments * Add frame processor * Address review comments * Fix builds * Update syntax * Webconsole workaround * Refactor * Refactor * Change export file path * Update docs * Remove webconsole changes * Fix spelling mistake * Parser changes, add tests * Parser changes, resolve build warnings * Fix failing test * Fix failing test * Fix IT tests * Add tests * Cleanup * Fix unparse * Fix forbidden API * Update docs * Update docs * Address review comments * Address review comments * Fix tests * Address review comments * Fix insert unparse * Add external write resource action * Fix tests * Add resource check to overlord resource * Fix tests * Add IT * Update syntax * Update tests * Update permission * Address review comments * Address review comments * Address review comments * Add tests * Add check for runtime parameter for bucket and path * Add check for runtime parameter for bucket and path * Add tests * Update docs * Fix NPE * Update docs, remove deadcode * Fix formatting	2024-02-07 22:08:50 +05:30
Vadim Ogievetsky	f2b242b6e6	update console to core Druid changes (#15854 )	2024-02-07 19:44:25 +05:30
Clint Wylie	23d4fade90	use NullFilter for SQL rewrite of MV_CONTAINS and MV_OVERLAP for null array elements (#15855 ) Fixes an oversight after #14542 that happens in the SQL planner rewrite of MV_CONTAINS and MV_OVERLAP when faced with array elements that are NULL, where we were incorrectly using EqualityFilter instead of NullFilter for null elements (EqualityFilter does not accept null elements).	2024-02-07 19:40:41 +05:30
Bartosz Mikulski	45c26e8682	Fix Inspection Check in DirectDruidClientTest (#15857 )	2024-02-07 02:56:26 -08:00
Zoltan Haindrich	fdc7cec271	Support Window operators in decoupled planning (#15815 )	2024-02-07 04:09:48 -05:00
Bartosz Mikulski	43a1c96cd1	Fix query-cancellation-executor memory leak (#15754 ) This PR fixes #15069 by resolving a memory leak caused by a thread leak in query-cancellation-executor.	2024-02-07 10:54:38 +05:30
Pramod Immaneni	59bca0951a	Parallelize storage of incremental segments (#13982 ) During ingestion, incremental segments are created in memory for the different time chunks and persisted to disk when certain thresholds are reached (max number of rows, max memory, incremental persist period etc). When there are many dimensions and metrics (1000+), it was observed that creating and serializing the incremental segment file format and persisting the file took a while and blocked ingestion of new data. This affected real-time ingestion. This serialization and persistence can be parallelized across the different time chunks. This update aims to do that. The patch adds a simple configuration parameter to the ingestion tuning configuration to specify the number of persistence threads. The default value is 1 if it is not specified, which keeps the behavior the same as it is today.	2024-02-07 10:43:05 +05:30
Sam Wheating	4c58856f10	Fix incorrect ordering of args in log statement (#15846 )	2024-02-06 16:12:04 -08:00
Abhishek Radhakrishnan	1affa35b29	Bump up Delta Lake Kernel to 3.1.0 (#15842 ) This patch bumps Delta Lake Kernel dependency from 3.0.0 to 3.1.0, which released last week - please see https://github.com/delta-io/delta/releases/tag/v3.1.0 for release notes. There were a few "breaking" API changes in 3.1.0, you can find the rationale for some of those changes here. Next-up in this extension: add and expose filter predicates.	2024-02-06 21:25:17 +05:30
317brian	2dc71c7874	docs: fix rendering (#15835 )	2024-02-06 07:18:43 -08:00
Gian Merlino	54b30646f3	Add sqlReverseLookupThreshold for ReverseLookupRule. (#15832 ) If lots of keys map to the same value, reversing a LOOKUP call can slow things down unacceptably. To protect against this, this patch introduces a parameter sqlReverseLookupThreshold representing the maximum size of an IN filter that will be created as part of lookup reversal. If inSubQueryThreshold is set to a smaller value than sqlReverseLookupThreshold, then inSubQueryThreshold will be used instead. This allows users to use that single parameter to control IN sizes if they wish.	2024-02-06 16:32:05 +05:30
Soumyava	b86f31f2c0	Addressing shapeshifting issues with window functions (#15807 ) Addressing shapeshifting issues with window functions	2024-02-06 11:12:20 +05:30
Zoltan Haindrich	392d585ff8	Identify not range filters without negating subexpressions (#15766 ) * Identify not range filters without negating subexpressions Earlier, betweenish (range/bounds) filters were identified through a process of negating the subexpressions, which may not have performed that well (it could have dominated the runtime in some cases). This patch makes that unnecessary, as it is able to create the negated expression directly. * add test;fix for multiple intervals	2024-02-05 19:12:58 -08:00
George Shiqi Wu	edb1ac1b71	Update azure console tile (#15820 ) * Save web console changes * Working new input type * fix tests	2024-02-05 13:11:39 -08:00
Clint Wylie	358892e5b0	add nested array index support, fix some bugs (#15752 ) This PR wires up ValueIndexes and ArrayElementIndexes for nested arrays, ValueIndexes for nested long and double columns, and fixes a handful of bugs I found after adding nested columns to the filter test gauntlet.	2024-02-05 15:12:09 +05:30
Laksh Singla	ee78a0367d	Fix serialization bug in PassthroughAggregatorFactory (#15830 ) PassthroughAggregatorFactory overrides a deprecated method in AggregatorFactory, on which it relies for serializing one of its fields, complexTypeName. This method was accidentally removed, leading to a bug in the factory where the type name is not serialized properly and null is placed in the type name. This PR revives that method with a different name and adds tests for the same.	2024-02-05 15:11:10 +05:30
Rishabh Singh	de959e513d	Add QueryLifecycle#authorize for grpc-query-extension (#15816 ) Proposal #13469 Original PR #14024 A new method is being added in QueryLifecycle class to authorise a query based on authentication result. This method is required since we authenticate the query by intercepting it in the grpc extension and pass down the authentication result.	2024-02-02 21:49:57 +05:30
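
Note on 21a97f4c61 (#15860): the commit message describes an emptiness check that read only the low-order byte of an int holding a coupon-set size, so any size that is a multiple of 256 looked like zero. The sketch below reproduces that class of bug in isolation. It is not Druid's or DataSketches' code: the class name, the method names, and the little-endian layout with the count at offset 0 are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; not part of Druid or DataSketches.
public class LowOrderByteDemo {

    // Writes `count` as a 4-byte little-endian int at offset 0 (assumed layout).
    static ByteBuffer encodeCount(int count) {
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(0, count);
        return buf;
    }

    // Buggy check: reads a single byte, which in a little-endian layout is
    // the low-order byte of the int. Any multiple of 256 appears to be zero.
    static boolean looksEmptyBuggy(ByteBuffer buf, int offset) {
        return buf.get(offset) == 0;
    }

    // Corrected check: reads all four bytes of the int.
    static boolean looksEmptyFixed(ByteBuffer buf, int offset) {
        return buf.getInt(offset) == 0;
    }

    public static void main(String[] args) {
        for (int count : new int[] {0, 1, 255, 256, 512}) {
            ByteBuffer buf = encodeCount(count);
            System.out.println(
                "count=" + count
                    + " buggyEmpty=" + looksEmptyBuggy(buf, 0)
                    + " fixedEmpty=" + looksEmptyFixed(buf, 0));
        }
    }
}
```

In this sketch the two checks agree for counts 0, 1, and 255, and disagree for 256 and 512, where only the full-int read reports a non-empty set.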

13849 Commits, All Branches