druid

Commit Graph

Author	SHA1	Message	Date
TessaIO	93c123a482	docs: fix cached lookup module documentation (#17527) * docs: fix loading lookup documentation Signed-off-by: TessaIO <ahmedgrati1999@gmail.com> * docs: fix indentation and punctuation Signed-off-by: TessaIO <ahmedgrati1999@gmail.com> --------- Signed-off-by: TessaIO <ahmedgrati1999@gmail.com>	2024-12-06 00:09:37 -08:00
Kashif Faraz	3de46746ca	Fix NPE in segment allocation when reduceMetadataIO is true (#17537)	2024-12-05 12:58:47 +05:30
Karan Kumar	0eb8d733d4	Adding leader and not being leader logging on the overlord. (#17519)	2024-12-03 22:36:53 +05:30
Clint Wylie	9ef46fc92d	suppress kafka cve for ranger extension (#17531)	2024-12-02 21:25:39 -08:00
Zoltan Haindrich	c1ef38b052	Minor fixes and enhancements in UnionQuery handling (#17483) * plan consistently with either UnionDataSource or UnionQuery for decoupled mode * expose errors * move decoupled related setting from PlannerConfig to QueryContexts	2024-11-28 10:05:12 +01:00
Vadim Ogievetsky	ddbb985369	Web console: refactor and improve the segment timeline (try 2) (#17521) * refactor and improve the segment timeline * us consistent state * type cleanup * add shpitz * better bubble * Update web-console/src/components/segment-timeline/segment-bar-chart-render.tsx Co-authored-by: Clint Wylie <cjwylie@gmail.com> --------- Co-authored-by: Clint Wylie <cjwylie@gmail.com>	2024-11-27 19:30:40 -08:00
Charles Smith	0325f62af2	[Docs] Remove ambiguous advice regarding TopN correctness (#17522)	2024-11-27 11:41:28 -08:00
Vadim Ogievetsky	f3e1f1e586	Revert "Web console: refactor and improve the segment timeline (#17508)" (#17520) This reverts commit `09432c099b`.	2024-11-27 09:38:48 -08:00
Vadim Ogievetsky	09432c099b	Web console: refactor and improve the segment timeline (#17508) * refactor and improve the segment timeline * us consistent state * type cleanup * add shpitz * better bubble	2024-11-27 09:37:01 -08:00
Vishesh Garg	1b9a6dde9f	Fix compilation error for MSQCompactionRunnerTest (#17516)	2024-11-27 12:46:30 +01:00
Gian Merlino	80d6763e39	ServerSelector: Synchronize getAllServers(). (#17499) This method was missing some required synchronization. This patch also adds GuardedBy annotations to historicalServers and realtimeServers, which would have caught it.	2024-11-27 13:31:00 +05:30
Vishesh Garg	5333c53d71	Support non time order in MSQ compaction (#17318) This patch supports sorting segments by non-time columns (added in #16849) to MSQ compaction. Specifically, if `forceSegmentSortByTime` is set in the data schema, either via the user-supplied compaction config or in the inferred schema, the following steps are taken: - Skip adding `__time` explicitly as the first column to the dimension schema since it already comes as part of the schema - Ensure column mappings propagate `__time` in the order specified by the schema - Set `forceSegmentSortByTime` in the MSQ context.	2024-11-27 13:26:10 +05:30
Clint Wylie	2831d79871	update kafka dependency version to 3.9.0 (#17513) * update kafka dependency version to 3.9.0 * update licenses.yaml	2024-11-27 12:14:05 +05:30
Akshat Jain	dd46c7722d	Remove pre-java-11 profile (#17511) We have removed support for Java 8 in #17466. This PR removes an unused profile pre-java-11 which activated for JDK < 11.	2024-11-26 08:43:20 +01:00
Kashif Faraz	207ad16f07	Reduce metadata IO during segment allocation (#17496) Changes --------- - Add Overlord runtime property `druid.indexer.tasklock.batchAllocationReduceMetadataIO` - Setting this flag to true (default value) allows the Overlord to fetch only necessary segment payloads during segment allocation - Setting this flag to false restores original segment allocation behaviour	2024-11-26 11:40:09 +05:30
Clint Wylie	ede9e4077a	add support for aggregate only projections (#17484)	2024-11-25 09:22:46 -08:00
Zoltan Haindrich	20aea29a51	Rename d1/d2 columns in tests (#17471)	2024-11-22 14:58:56 +01:00
Rishabh Singh	74422b58f5	Emit disk spill and merge buffer utilisation metrics for GroupBy queries (#17360) This change is to emit following metrics as part of GroupByStatsMonitor monitor, mergeBuffer/used -> Number of merge buffers used. mergeBuffer/acquisitionTimeNs -> Total time required to acquire merge buffer. mergeBuffer/acquisition -> Number of queries that acquired a batch of merge buffers. groupBy/spilledQueries -> Number of queries that spilled onto the disk. groupBy/spilledBytes-> Spilled bytes on the disk. groupBy/mergeDictionarySize -> Size of the merging dictionary.	2024-11-22 14:22:03 +05:30
Adarsh Sanjeev	df649c0bbd	Refactors (#17498) Follow-up PR to #17493 to address pending unaddressed comments.	2024-11-22 09:22:38 +05:30
Katya Macedo	bd93d0046d	Docs: update text and example (#17480) * Docs: update text and example * Update after review * Update the spelling file * Update text for clarity * Update after review	2024-11-21 08:40:41 -08:00
Vivek Dhiman	bb44f85bb6	Updated error response to hide error stack in case of JsonMappingException (#16821) Added flag druid.server.http.showDetailedJsonMappingError similar druid.server.http.showDetailedJettyError to configure error message detail.	2024-11-21 19:11:48 +05:30
Adarsh Sanjeev	2726c6f388	Minor refactors to processing Some refactors across druid to clean up the code and add utility functions where required.	2024-11-21 15:37:55 +05:30
Akshat Jain	17215cd677	Remove support for Java 8 (#17466) All JDK 8 based CI checks have been removed. Images used in Dockerfile(s) have been updated to Java 17 based images. Documentation has been updated accordingly.	2024-11-21 15:33:08 +05:30
Adithya Chakilam	c1d6328249	StreamingTaskRunner: Close the rejection period updater executor service (#17490)	2024-11-19 12:49:20 -08:00
zachjsh	8853c7e5c6	Add `ingest/notices/queueSize` and `ingest/pause/time` to statsd emitter (#17487) * SQL syntax error should target USER persona * * revert change to queryHandler and related tests, based on review comments * * add test * * add `ingest/notices/queueSize` and `ingest/pause/time` to statsd emitter * * add taskStatus dimension to `service/heartbeat` metric * Revert "* add taskStatus dimension to `service/heartbeat` metric" This reverts commit `cfb02a2813`.	2024-11-18 20:58:00 -05:00
Adithya Chakilam	6f436301be	supervisor: make rejection periods work with stopTasksCount (#17442) * kafka-indexing: Report consumer io time * commit * backward * tests * remove unwanted changes * comments * comments * coverage * change name * fixes * fixes * comments	2024-11-18 13:12:24 -08:00
Clint Wylie	24a1fafaa7	projection segment merge fixes (#17460) changes: * fix issue when merging projections from multiple-incremental persists which was hoping that some 'dim conversion' buffers were not closed, but they already were (by the merging iterator). fix involves selectively persisting these conversion buffers to temp files in the segment write out directory and mapping them and tying them to the segment level closer so that they are available after the lifetime of the parent merger * modify auto column serializers to use segment write out directory for temp files instead of java.io.tmpdir * fix queryable index projection to not put the time-like column as a dimension, instead only adding it as __time * use smoosh for temp files so can safely write any Serializer to a temp smoosh	2024-11-15 16:46:04 -08:00
Rishabh Singh	7f335ff486	Resolve CVEs: Upgrade jetty version and suppress azure cve (#17385)	2024-11-15 10:55:02 +05:30
Katya Macedo	75d9ece665	Docs: update descriptions and default values (#17473)	2024-11-13 16:29:27 -08:00
zachjsh	b0c73d7c2a	Add 'ingest/notices/time' metric to statsd emitter (#17468) * SQL syntax error should target USER persona * * revert change to queryHandler and related tests, based on review comments * * add test * Add 'ingest/notices/time' metric to statsd emitter This metric gives the milliseconds taken to process a notice by the supervisor.	2024-11-13 12:17:01 -05:00
Akshat Jain	390c2d68c8	Remove `intellij-inspections` check from CI (#17469)	2024-11-13 18:58:17 +05:30
Kiran Gadhave	1dbd005df6	updated docs with behavior for empty collections in pod template selector config (#17464)	2024-11-12 13:21:27 -08:00
zachjsh	1f3b1f85f9	Add documentation for Druids catalog extension (#17459) * SQL syntax error should target USER persona * * revert change to queryHandler and related tests, based on review comments * * add test * Add documentation for druid-catalog extension * * fix error * * fix error * Apply suggestions from code review Co-authored-by: Andreas Maechler <amaechler@gmail.com> * * fix spelling error * * fix spelling --------- Co-authored-by: Andreas Maechler <amaechler@gmail.com>	2024-11-12 14:50:55 -05:00
Zoltan Haindrich	f296102f05	ScanQuery should not ignore columnTypes in equals/hashCode (#17463) * ScanQuery: equals/hashCode/toString * DruidQuery: changes of Align ScanQuery column order with its desired signature #17457 * ScanQueryTest: add equalsverifer test	2024-11-12 14:26:59 +05:30
Akshat Jain	c571e6905d	Refactor WindowOperatorQueryKit to use WindowStage class for representing different window stages (#17158)	2024-11-12 14:18:16 +05:30
Virushade	8278a1f7df	Fix Javadocs in ColumnCapablities.java (#17462)	2024-11-12 11:30:33 +05:30
Akshat Jain	3f56b57c7e	MSQ WF: Pass a flag from broker to determine operator chain transformation (#17443)	2024-11-12 09:28:28 +05:30
Shekhar Prasad Rajak	ae049a4bab	AWS Glue Catalog for Iceberg ingest extension (#17392) * iceberg glue catalog dependencies added * GlueIcebergCatalog added in druid module * default version of iceberg glue catalog implementation - basics * basic tests added * removed dependecy iceberg-aws-bundle * glue catalog support - docs update for iceberg * Update IcebergDruidModule.java * Update IcebergDruidModule.java * updates in dependencies and warehousePath must be under catalogProp * removed some dependencies - which not required * only glue sdk added * update license * avro exclusion removed * doc update * doc update * set the type to glue * minor change * minor change * fixing codestyle * checkstyle fixes * checkstyle fixes * checkstyle fixes * dependency check fixes * update pom for ignore warning for glue catalog * compile scope needed - iceberg-aws and awssdk * updates pom with comment * minor change * mvn dependency check in iceberg extension * revert pom.xml changes * aws sdk sts and s3 for gluecatalog initialize * dependency check - ignore aws sdk s3 and sts --------- Co-authored-by: SHEKHAR PRASAD RAJAK <shekhar_rajak@apple.com>	2024-11-10 18:43:55 -08:00
jtuglu-netflix	f906d0d446	Fix query failed metric double count bug (#17454)	2024-11-08 23:15:03 -08:00
Vivek Dhiman	0dcc2bc469	Fixed NPE in `array_overlap` and `array_contains`. (#17465)	2024-11-08 20:39:14 -08:00
George Shiqi Wu	5764183d4e	k8s-based-ingestion: Wait for task lifecycles to enter RUNNING state before returning from KubernetesTaskRunner.start (#17446) * Add a wait on start() for task lifecycle to go into running * handle exceptions * Fix logging messages * Don't pass in the settable future as a arg * add some unit tests	2024-11-08 11:13:35 -05:00
Gian Merlino	d8162163c8	Run JDK 21 workflows with 21.0.4. (#17458) * Run JDK 21 workflows with 21.0.4. To work around #17429, run our JDK 21 workflows with version 21.0.4. It does not appear to have this problem. * Undo changes in standard-its.yml * Add comments. --------- Co-authored-by: Zoltan Haindrich <kirk@rxd.hu>	2024-11-07 10:53:52 -08:00
Nandini Anagondi	32394e55f9	Upgrading org.codehaus to com.fasterxml (#17371)	2024-11-07 10:55:47 +01:00
Akshat Jain	73cbce9109	WindowOperatorQueryFrameProcessorFactory: Pass QueryContext instead of WindowOperatorQuery to WindowOperatorQueryFrameProcessor (#17405) * WindowOperatorQueryKit: Pass QueryContext instead of WindowOperatorQuery to subsequent layers * Add serializer for QueryContext class * Revert changes of WindowOperatorQueryFrameProcessorFactory json param * Fix checkstyle * Address review comment: Remove older method in favor of calling new method inline	2024-11-07 11:29:49 +05:30
Gian Merlino	9c25226e06	QueryableIndexSegment: Re-use time boundary inspector. (#17397) This patch re-uses timeBoundaryInspector for each cursor holder, which enables caching of minDataTimestamp and maxDataTimestamp. Fixes a performance regression introduced in #16533, where these fields stopped being cached across cursors. Prior to that patch, they were cached in the QueryableIndexStorageAdapter.	2024-11-06 09:27:59 -08:00
Abhishek Radhakrishnan	d8e4be654f	ManageLifecycle DropwizardEmitter instantiation. (#17451)	2024-11-05 18:57:22 -08:00
George Shiqi Wu	8850023811	Fix error where communication failures to k8s can lead to stuck tasks (#17431) * Fix save logs error * Update extensions-contrib/kubernetes-overlord-extensions/src/main/java/org/apache/druid/k8s/overlord/common/KubernetesPeonClient.java Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com> * make things final * fix merge conflicts --------- Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>	2024-11-05 09:58:30 -08:00
Zoltan Haindrich	2eac8318f8	Support Union in Decoupled planning (#17354) * introduces `UnionQuery` * some changes to enable a `UnionQuery` to have multiple input datasources * `UnionQuery` execution is driven by the `QueryLogic` - which could later enable to reduce some complexity in `ClientQuerySegmentWalker` * to run the subqueries of `UnionQuery` there was a need to access the `conglomerate` from the `Runner`; to enable that some refactors were done * renamed `UnionQueryRunner` to `UnionDataSourceQueryRunner` * `QueryRunnerFactoryConglomerate` have taken the place of `QueryToolChestWarehouse` which shaves of some unnecessary things here and there * small cleanup/refactors	2024-11-05 16:58:57 +01:00
Virushade	ba76264244	Update build documentation (#17444) Add build instructions for developers Follow up from issue #17375, add instructions solely for distribution profile. Note that this build command is mostly used by me, everyone is welcome to add further optimizations for a faster distribution build. Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com> * Update docs/development/build.md Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com> * Update docs/development/build.md Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com> --------- Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>	2024-11-04 18:31:46 -08:00
Tom	e4cdbca23c	make planner errors be user persona (#17437) Change the persona for errors within the planner from Admin to User. The ADMIN persona is meant to be "a persona who is interacting with admin APIs and understands Druid query concepts". This isn't an admin API, it's a query API. Low quality error messages being returned to the correct audience is better than hiding all error messages. The errors that can be returned back can be user solvable, and other times requires a druid expert. But the errors do not leak information that should only be seen by more expert/privileged personas. The original ADMIN persona showed some reticence to tag low-quality error messages with a USER persona. but it really does seem user-directed to me so USER to me would make sense.	2024-11-04 10:48:35 -08:00

14655 Commits

All Branches