druid

mirror of https://github.com/apache/druid.git synced 2025-02-06 01:58:20 +00:00

Author	SHA1	Message	Date
Zoltan Haindrich	1063948749	this doesnt work	2024-08-05 14:15:02 +00:00
Zoltan Haindrich	c40474285c	updates	2024-08-05 13:49:08 +00:00
Zoltan Haindrich	f4af51ef7f	extend/cleanup/etc	2024-08-05 13:41:53 +00:00
Zoltan Haindrich	bc70443c7f	update few more	2024-08-05 13:20:13 +00:00
Zoltan Haindrich	841ab462dd	Merge branch 'quidem-record' into quidem-msq	2024-08-05 13:00:59 +00:00
Zoltan Haindrich	fda0d63e44	Merge remote-tracking branch 'apache/master' into quidem-record	2024-08-05 13:00:50 +00:00
Zoltan Haindrich	929e68c11a	undo unrelated	2024-08-05 12:59:50 +00:00
Zoltan Haindrich	436ba18815	x	2024-08-05 12:59:19 +00:00
Zoltan Haindrich	26e3c44f4b	Quidem record (#16624) * enables launching a fake broker based on test resources (druidtest uri) * can record queries into new test files during usage * instead of repurposing Calcite's Hook, migrates to DruidHook, to which we can add further keys * added a quidem-ut module, which could be the place for tests that interact with modules/etc	2024-08-05 14:58:32 +02:00
Akshat Jain	08f9ec1cae	Memoize the redundant calls to overlord in sql statements endpoint (#16839)	2024-08-05 16:52:56 +05:30
Rushikesh Bankar	c8323d1a7c	Add indexer task success and failure metrics (#16829 ) This PR adds indexer-level task metrics- "indexer/task/failed/count" "indexer/task/success/count" the current "worker/task/completed/count" metric shows all the tasks completed irrespective of success or failure status so these metrics would help us get more visibility into the status of the completed tasks	2024-08-05 16:21:27 +05:30
Zoltan Haindrich	70e46eadb9	update	2024-08-05 09:07:46 +00:00
Zoltan Haindrich	090f937d58	Merge branch 'quidem-record' into quidem-msq	2024-08-05 09:03:53 +00:00
Zoltan Haindrich	bb23ace518	builtintypes instead nesteddata	2024-08-05 08:59:48 +00:00
Laksh Singla	c84e689eb8	Don't use ComplexMetricExtractor to fetch the class of the object in field readers (#16825 ) This patch fixes queries like `SELECT COUNT(DISTINCT json_col) FROM foo`	2024-08-05 14:13:56 +05:30
Laksh Singla	0411c4e67e	Add metrics for number of rows/bytes materialized while running subqueries (#16835 ) subquery/rows and subquery/bytes metrics have been added, which indicate the size of the results materialized on the heap.	2024-08-05 14:13:20 +05:30
Zoltan Haindrich	e6add9ea84	Merge remote-tracking branch 'apache/master' into quidem-record	2024-08-05 07:04:02 +00:00
Sree Charan Manamala	c7eacd079e	fallback SQL IN filter to expression filter when VirtualColumnRegistry is null (#16836)	2024-08-05 11:27:51 +05:30
Abhishek Radhakrishnan	31b43753fb	Add `druid.indexing.formats.stringMultiValueHandlingMode` system config (#16822) This patch introduces an optional cluster configuration, druid.indexing.formats.stringMultiValueHandlingMode, allowing operators to override the default mode SORTED_ARRAY for string dimensions. The possible values for the config are SORTED_SET, SORTED_ARRAY, or ARRAY (SORTED_ARRAY is the default). Case-insensitive values are allowed. While this cluster property allows users to manage the multi-value handling mode for string dimension types, it's recommended to migrate to using real array types instead of MVDs. This fixes a long-standing issue: compaction now honors the configured cluster-wide property instead of always rewriting it as the default SORTED_ARRAY, even if the data was originally ingested with ARRAY or SORTED_SET.	2024-08-03 10:23:44 -07:00
Kashif Faraz	9dc2569f22	Track and emit segment loading rate for HttpLoadQueuePeon on Coordinator (#16691 ) Design: The loading rate is computed as a moving average of at least the last 10 GiB of successful segment loads. To account for multiple loading threads on a server, we use the concept of a batch to track load times. A batch is a set of segments added by the coordinator to the load queue of a server in one go. Computation: batchDurationMillis = t(load queue becomes empty) - t(first load request in batch is sent to server) batchBytes = total bytes successfully loaded in batch avg loading rate in batch (kbps) = (8 * batchBytes) / batchDurationMillis overall avg loading rate (kbps) = (8 * sumOverWindow(batchBytes)) / sumOverWindow(batchDurationMillis) Changes: - Add `LoadingRateTracker` which computes a moving average load rate based on the last few GBs of successful segment loads. - Emit metric `segment/loading/rateKbps` from the Coordinator. In the future, we may also consider emitting this metric from the historicals themselves. - Add `expectedLoadTimeMillis` to response of API `/druid/coordinator/v1/loadQueue?simple`	2024-08-03 13:14:21 +05:30
Abhishek Radhakrishnan	fe6772a101	Rename test builder `MSQTester.setExpectedSegment` (#16837 ) * Rename setExpectedSegment to setExpectedSegments in MSQTestBase. * Add expected segments for max num segments test cases.	2024-08-02 10:01:55 -07:00
zachjsh	9b731e8f0a	Kinesis Input Format for timestamp, and payload parsing (#16813) * SQL syntax error should target USER persona * * revert change to queryHandler and related tests, based on review comments * * add test * Introduce KinesisRecordEntity to support Kinesis headers in InputFormats * * add kinesisInputFormat and Reader, and tests * * bind KinesisInputFormat class to module * * improve test coverage * * remove references to kafka * * resolve review comments * * remove comment * * fix grammar of comment * * fix comment again * * fix comment again * * more review comments * * add partitionKey * * add check for same timestamp and partitionKey column name * * fix intellij inspection	2024-08-02 08:48:44 -04:00
Akshat Jain	63ba5a4113	Fix issues with fetching task reports in SQL statements endpoint for middlemanager (#16832)	2024-08-01 23:37:15 -04:00
Vadim Ogievetsky	8c170f7d0e	Web console: use stages, counters, and warnings from the new detailed status API (#16809) * stages and counters can be seen on the status response * warnings are exposed also * mark as msq when attached * update snapshots * download CSV/TSV null as empty cell	2024-08-01 02:30:30 -07:00
Akshat Jain	bb4d6cc001	Add task report fields in response of SQL statements endpoint (#16808 ) If the optional query parameter detail is supplied, then the response also includes the following: * A stages object that summarizes information about the different stages being used for query execution, such as stage number, phase, start time, duration, input and output information, processing methods, and partitioning. * A counters object that provides details on the rows, bytes, and files processed at various stages for each worker across different channels, along with sort progress. * A warnings object that provides details about any warnings.	2024-08-01 10:26:04 +05:30
Zoltan Haindrich	5e81a026e9	Merge branch 'quidem-record' into quidem-msq	2024-07-31 15:27:59 +00:00
Gian Merlino	01f6cfcbf5	MSQ worker: Support in-memory shuffles. (#16790) * MSQ worker: Support in-memory shuffles. This patch is a follow-up to #16168, adding worker-side support for in-memory shuffles. Changes include: 1) Worker-side code now respects the same context parameter "maxConcurrentStages" that was added to the controller in #16168. The parameter remains undocumented for now, to give us a chance to more fully develop and test this functionality. 2) WorkerImpl is broken up into WorkerImpl, RunWorkOrder, and RunWorkOrderListener to improve readability. 3) WorkerImpl has a new StageOutputHolder + StageOutputReader concept, which abstract over memory-based or file-based stage results. 4) RunWorkOrder is updated to create in-memory stage output channels when instructed to. 5) ControllerResource is updated to add /doneReadingInput/, so the controller can tell when workers that sort, but do not gather statistics, are done reading their inputs. 6) WorkerMemoryParameters is updated to consider maxConcurrentStages. Additionally, WorkerChatHandler is split into WorkerResource, so as to match ControllerChatHandler and ControllerResource. * Updates for static checks, test coverage. * Fixes. * Remove exception. * Changes from review. * Address static check. * Changes from review. * Improvements to docs and method names. * Update comments, add test. * Additional javadocs. * Fix throws. * Fix worker stopping in tests. * Fix stuck test.	2024-07-30 18:41:24 -07:00
Edgar Melendrez	3bb6d40285	[docs] batch 5 updating functions (#16812 ) * batch 5 * Update docs/querying/sql-functions.md * applying suggestions --------- Co-authored-by: Benedict Jin <asdf2014@apache.org>	2024-07-30 17:30:01 -07:00
Edgar Melendrez	85a8a1d805	[Docs]Batch04 - Bitwise numeric functions (#16805 ) * Batch04 - Bitwise numeric functions * Batch04 - Bitwise numeric functions * minor fixes * rewording bitwise_shift functions * rewording bitwise_shift functions * Update docs/querying/sql-functions.md * applying suggestions --------- Co-authored-by: Benedict Jin <asdf2014@apache.org>	2024-07-30 10:53:59 -07:00
Zoltan Haindrich	a03fb49f4f	remove exception	2024-07-30 16:34:06 +00:00
Zoltan Haindrich	5f6290eb54	use updated hook class	2024-07-30 16:11:57 +00:00
Zoltan Haindrich	de207c7295	move key	2024-07-30 16:04:11 +00:00
Zoltan Haindrich	b1ab252b31	Merge branch 'quidem-record' into quidem-msq	2024-07-30 16:03:33 +00:00
Zoltan Haindrich	ca121aa083	remove firehose	2024-07-30 14:29:02 +00:00
Zoltan Haindrich	eb2a047e4b	Merge remote-tracking branch 'apache/master' into quidem-record	2024-07-30 14:24:37 +00:00
Zoltan Haindrich	57980066b3	rename module (cherry picked from commit 8d40cca50a3e4c8098f49f5d588c7b7220b76788)	2024-07-30 14:04:07 +00:00
Zoltan Haindrich	7f99ee24d7	fix copy-pasted crap	2024-07-30 14:03:46 +00:00
Zoltan Haindrich	b345dd9d03	updates/fix style/etc	2024-07-30 13:25:40 +00:00
Zoltan Haindrich	df42245685	add apidoc/etc	2024-07-30 13:14:51 +00:00
Zoltan Haindrich	9ac26e3a89	wire-in hookdispatcher thru connection/etc	2024-07-30 12:29:36 +00:00
Zoltan Haindrich	78b75d3e8e	move more to non-static	2024-07-30 10:42:41 +00:00
Zoltan Haindrich	f6cc540368	use druidhookdispatcherr#1	2024-07-30 10:33:57 +00:00
Zoltan Haindrich	ce667eeb5e	move stuff around / prepare to unglobalize	2024-07-30 10:23:35 +00:00
Zoltan Haindrich	4157a8f105	add/.etc	2024-07-30 10:16:03 +00:00
Kashif Faraz	954aaafe0c	Refactor: Clean up compaction config classes (#16810 ) Changes: - Rename `CoordinatorCompactionConfig` to `DruidCompactionConfig` - Rename `CompactionConfigUpdateRequest` to `ClusterCompactionConfig` - Refactor methods in `DruidCompactionConfig` - Clean up `DataSourceCompactionConfigHistory` and its tests - Clean up tests and add new tests - Change API path `/druid/coordinator/v1/config/global` to `/druid/coordinator/v1/config/cluster`	2024-07-30 12:17:25 +05:30
AmatyaAvadhanula	92a40d8169	Add API to fetch conflicting task locks (#16799) * Add API to fetch conflicting active locks	2024-07-30 11:40:48 +05:30
Vishesh Garg	e9ea243d97	Enable compaction ITs on MSQ engine (#16778 ) Follow-up to #16291, this commit enables a subset of existing native compaction ITs on the MSQ engine. In the process, the following changes have been introduced in the MSQ compaction flow: - Populate `metricsSpec` in `CompactionState` from `querySpec` in `MSQControllerTask` instead of `dataSchema` - Add check for pre-rolled-up segments having `AggregatorFactory` with different input and output column names - Fix passing missing cluster-by clause in scan queries - Add annotation of `CompactionState` to tombstone segments	2024-07-30 09:34:46 +05:30
Zoltan Haindrich	c7cde31a89	HAVING clauses may not contain window functions (#16742 ) Rejects having clauses if they contain windowed expressions. Also added a check to produce a more descriptive error if an OVER expression reaches the filter translation layer. --------- Co-authored-by: Benedict Jin <asdf2014@apache.org>	2024-07-29 04:11:36 -04:00
dependabot[bot]	f5527dc3e7	Bump io.grpc:grpc-netty-shaded from 1.57.2 to 1.65.1 (#16731 ) Bumps [io.grpc:grpc-netty-shaded](https://github.com/grpc/grpc-java) from 1.57.2 to 1.65.1. - [Release notes](https://github.com/grpc/grpc-java/releases) - [Commits](https://github.com/grpc/grpc-java/compare/v1.57.2...v1.65.1) --- updated-dependencies: - dependency-name: io.grpc:grpc-netty-shaded dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: asdf2014 <asdf2014@apache.org>	2024-07-29 14:51:39 +08:00
dependabot[bot]	cbca0dc969	Bump jclouds.version from 2.5.0 to 2.6.0 (#16796 ) Bumps `jclouds.version` from 2.5.0 to 2.6.0. Updates `org.apache.jclouds:jclouds-core` from 2.5.0 to 2.6.0 Updates `org.apache.jclouds.api:openstack-swift` from 2.5.0 to 2.6.0 Updates `org.apache.jclouds.driver:jclouds-slf4j` from 2.5.0 to 2.6.0 Updates `org.apache.jclouds.api:openstack-keystone` from 2.5.0 to 2.6.0 Updates `org.apache.jclouds.api:rackspace-cloudfiles` from 2.5.0 to 2.6.0 Updates `org.apache.jclouds.provider:rackspace-cloudfiles-us` from 2.5.0 to 2.6.0 Updates `org.apache.jclouds.provider:rackspace-cloudfiles-uk` from 2.5.0 to 2.6.0 --- updated-dependencies: - dependency-name: org.apache.jclouds:jclouds-core dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.jclouds.api:openstack-swift dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.jclouds.driver:jclouds-slf4j dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.jclouds.api:openstack-keystone dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.jclouds.api:rackspace-cloudfiles dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.jclouds.provider:rackspace-cloudfiles-us dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.apache.jclouds.provider:rackspace-cloudfiles-uk dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: asdf2014 <asdf2014@apache.org>	2024-07-29 14:49:26 +08:00
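
The loading-rate computation described in commit 9dc2569f22 (#16691) can be illustrated with a minimal sketch. It is not Druid's `LoadingRateTracker`; the class name `LoadingRateSketch`, its methods, and the eviction policy are assumptions made for illustration. Only the formulas (8 * bytes / millis, summed over a window of at least the last 10 GiB) come from the commit message.

```python
from collections import deque


def batch_rate_kbps(batch_bytes, batch_duration_millis):
    """Rate of one batch: bits per millisecond equals kilobits per second."""
    return 8 * batch_bytes / batch_duration_millis


class LoadingRateSketch:
    """Hypothetical moving average over at least the last `window_bytes` loaded."""

    def __init__(self, window_bytes=10 * 1024 ** 3):  # 10 GiB, per the commit message
        self.window_bytes = window_bytes
        self.batches = deque()  # entries of (batch_bytes, batch_duration_millis)
        self.total_bytes = 0
        self.total_millis = 0

    def add_batch(self, batch_bytes, batch_duration_millis):
        self.batches.append((batch_bytes, batch_duration_millis))
        self.total_bytes += batch_bytes
        self.total_millis += batch_duration_millis
        # Assumed eviction policy: drop the oldest batch only while the
        # remaining batches still cover at least `window_bytes`.
        while (len(self.batches) > 1
               and self.total_bytes - self.batches[0][0] >= self.window_bytes):
            old_bytes, old_millis = self.batches.popleft()
            self.total_bytes -= old_bytes
            self.total_millis -= old_millis

    def rate_kbps(self):
        """(8 * sumOverWindow(batchBytes)) / sumOverWindow(batchDurationMillis)."""
        if self.total_millis <= 0:
            return None  # no completed batches yet
        return 8 * self.total_bytes / self.total_millis
```

For example, a batch of 1,000,000 bytes that takes 1,000 ms yields `batch_rate_kbps(1_000_000, 1_000) == 8000.0`, that is, 8 Mbps.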

14430 Commits