druid

mirror of https://github.com/apache/druid.git synced 2025-02-27 22:09:12 +00:00

Author	SHA1	Message	Date
Zoltan Haindrich	d75dcea4dc	undo integration-tests module changes	2024-06-10 08:44:08 +00:00
Zoltan Haindrich	fb4b32e5b7	updates	2024-05-29 16:10:38 +00:00
Zoltan Haindrich	d1b4587eae	uu	2024-05-28 16:38:47 +00:00
Zoltan Haindrich	ff7abeb0f1	small cleanup	2024-05-28 15:54:29 +00:00
Zoltan Haindrich	295c09a03c	launcher-x	2024-05-28 15:44:22 +00:00
Zoltan Haindrich	ff31c14dba	clkeaup	2024-05-28 15:27:18 +00:00
Zoltan Haindrich	6c264f2977	move stuff	2024-05-28 15:22:42 +00:00
Zoltan Haindrich	ee195719b7	inline class	2024-05-28 15:17:07 +00:00
Zoltan Haindrich	a1b7f981fb	cleanup	2024-05-28 15:09:42 +00:00
Zoltan Haindrich	b20ee99371	clean	2024-05-28 15:07:09 +00:00
Zoltan Haindrich	9ae80f05de	Merge remote-tracking branch 'kgyrtkirk/quidem-runner-extension-submit' into quidem-record	2024-05-27 10:52:01 +00:00
Zoltan Haindrich	ec2ecde235	updates	2024-05-27 10:49:38 +00:00
Zoltan Haindrich	44ea4e1c51	Fix cds-coordinator-metadata-query-disabled (#16488 ) fixes the issue with the newly enabled `cds-coordiantor-metadata-query-disabled` [split](https://github.com/apache/druid/pull/16468) * configures to use `prepopulated-data` environment things to configure `S3` for access * this is needed because these tests use a [dataset which is loaded from s3](https://github.com/apache/druid/blob/master/integration-tests/docker/test-data/cds-coordinator-metadata-query-disabled-sample-data.sql) * also undoes the previous [fix](https://github.com/apache/druid/pull/16469) of setting the aws region explicitly as this is a more complete solution - and configuring `prepopulated-data` also sets the region; so that's not needed anymore	2024-05-22 20:42:11 +02:00
Zoltan Haindrich	4595b0c128	u[pdate	2024-05-21 14:18:55 +00:00
Zoltan Haindrich	ba09e7d1de	add	2024-05-21 14:00:02 +00:00
Zoltan Haindrich	08b73d1969	Revert "some stuff" This reverts commit 52598d3bca3c63a9563dba2e5b2fa775cc2e9cbd.	2024-05-21 12:02:46 +00:00
Zoltan Haindrich	52598d3bca	some stuff	2024-05-21 12:02:44 +00:00
Zoltan Haindrich	c948201507	Fix cds-task-schema-publish-disabled (#16469 ) set AWS_REGION=us-west-2 to avoid retries	2024-05-21 12:18:30 +05:30
zachjsh	dd5dc500ce	Catalog integration tests (#16424 ) * * add new catalog IT with failure to ensure that it is run in CI * * actually add failing test referred to and fix checkstyle * * add some tests * * fix checkstyle * * add test descriptions * * add more tests	2024-05-17 11:49:09 -04:00
Zoltan Haindrich	e7e119b559	reduce copypaste	2024-05-16 13:33:27 +00:00
Zoltan Haindrich	fc9a6c7740	move/etc	2024-05-16 13:23:45 +00:00
Zoltan Haindrich	cabf2a31c3	fix	2024-05-16 13:19:30 +00:00
Zoltan Haindrich	1d2a79f5be	cleanup	2024-05-16 13:01:53 +00:00
Zoltan Haindrich	1fb9fac159	remove cl	2024-05-16 12:59:20 +00:00
Zoltan Haindrich	76ffbfb7cf	cl	2024-05-16 12:50:38 +00:00
Zoltan Haindrich	e2986ae612	cleanup	2024-05-16 12:49:10 +00:00
Zoltan Haindrich	bec1f38a0e	move sqlmodule down	2024-05-16 11:17:05 +00:00
Zoltan Haindrich	93892b6524	undo some	2024-05-16 11:11:03 +00:00
Zoltan Haindrich	b63a80e5b7	passes basic test	2024-05-16 11:01:39 +00:00
Zoltan Haindrich	118eb61939	there - with 1 boot	2024-05-16 10:31:38 +00:00
Zoltan Haindrich	28ea884e19	almost ready?	2024-05-16 10:01:22 +00:00
Zoltan Haindrich	27735f2621	move disco	2024-05-16 09:50:10 +00:00
Zoltan Haindrich	cab3d945be	up	2024-05-16 09:48:18 +00:00
Zoltan Haindrich	c9638b7836	update	2024-05-16 09:44:16 +00:00
Zoltan Haindrich	5f552a2997	c	2024-05-16 09:30:41 +00:00
Zoltan Haindrich	074161dfde	add some service crap	2024-05-16 05:53:42 +00:00
Zoltan Haindrich	55b2051f9d	workinhg stuff	2024-05-15 16:23:11 +00:00
Zoltan Haindrich	8ee41f58d0	it does work	2024-05-15 15:14:43 +00:00
Zoltan Haindrich	d4b052a579	stuff	2024-05-15 11:57:13 +00:00
Zoltan Haindrich	73011267af	triaks	2024-05-15 10:34:48 +00:00
Zoltan Haindrich	43fd8af63c	Revert "add" This reverts commit 3fbb3cb853456bebccfbf8fc16ba7f30a810c26c.	2024-05-14 09:39:04 +00:00
Zoltan Haindrich	3fbb3cb853	add	2024-05-14 09:39:02 +00:00
Akshat Jain	bacdb4c48d	Update integration tests related documentation for better clarity (#16313 )	2024-05-13 11:27:21 +05:30
Alberic Liu	92fb0ff718	upgrade mysql:mysql-connector-java to 8.2.0 (#16024 ) * upgrade mysql:mysql-connector-java to 8.2.0 * fix the check errors * remove unused comment	2024-05-06 21:58:37 +08:00
Rishabh Singh	c61c3785a0	Followup changes to 15817 (Segment schema publishing and polling) (#16368 ) * Fix build * Nit changes in KillUnreferencedSegmentSchema * Replace reference to the abbreviation SMQ with Metadata Query, rename inTransit maps in schema cache * nitpicks * Remove reference to smq abbreviation from integration-tests * Remove reference to smq abbreviation from integration-tests * minor change * Update index.md * Add delimiter while computing schema fingerprint hash	2024-05-03 19:13:52 +05:30
Kashif Faraz	e5b40b0b8c	Miscellaneous cleanup of load queue references (#16367 ) Changes: - Rename `DataSegmentChangeRequestAndStatus` to `DataSegmentChangeResponse` - Rename `SegmentLoadDropHandler.Status` to `SegmentChangeStatus` - Remove method `CoordinatorRunStats.getSnapshotAndReset()` as it was used only in load queue peon implementations. Using an atomic reference is much simpler. - Remove `ServerTestHelper.MAPPER`. Use existing `TestHelper.makeJsonMapper()` instead.	2024-05-02 15:59:50 +05:30
Gian Merlino	5d1950d451	MSQ controller: Support in-memory shuffles; towards JVM reuse. (#16168 ) * MSQ controller: Support in-memory shuffles; towards JVM reuse. This patch contains two controller changes that make progress towards a lower-latency MSQ. First, support for in-memory shuffles. The main feature of in-memory shuffles, as far as the controller is concerned, is that they are not fully buffered. That means that whenever a producer stage uses in-memory output, its consumer must run concurrently. The controller determines which stages run concurrently, and when they start and stop. "Leapfrogging" allows any chain of sort-based stages to use in-memory shuffles even if we can only run two stages at once. For example, in a linear chain of stages 0 -> 1 -> 2 where all do sort-based shuffles, we can use in-memory shuffling for each one while only running two at once. (When stage 1 is done reading input and about to start writing its output, we can stop 0 and start 2.) 1) New OutputChannelMode enum attached to WorkOrders that tells workers whether stage output should be in memory (MEMORY), or use local or durable storage. 2) New logic in the ControllerQueryKernel to determine which stages can use in-memory shuffling (ControllerUtils#computeStageGroups) and to launch them at the appropriate time (ControllerQueryKernel#createNewKernels). 3) New "doneReadingInput" method on Controller (passed down to the stage kernels) which allows stages to transition to POST_READING even if they are not gathering statistics. This is important because it enables "leapfrogging" for HASH_LOCAL_SORT shuffles, and for GLOBAL_SORT shuffles with 1 partition. 4) Moved result-reading from ControllerContext#writeReports to new QueryListener interface, which ControllerImpl feeds results to row-by-row while the query is still running. Important so we can read query results from the final stage using an in-memory channel. 5) New class ControllerQueryKernelConfig holds configs that control kernel behavior (such as whether to pipeline, maximum number of concurrent stages, etc). Generated by the ControllerContext. Second, a refactor towards running workers in persistent JVMs that are able to cache data across queries. This is helpful because I believe we'll want to reuse JVMs and cached data for latency reasons. 1) Move creation of WorkerManager and TableInputSpecSlicer to the ControllerContext, rather than ControllerImpl. This allows managing workers and work assignment differently when JVMs are reusable. 2) Lift the Controller Jersey resource out from ControllerChatHandler to a reusable resource. 3) Move memory introspection to a MemoryIntrospector interface, and introduce ControllerMemoryParameters that uses it. This makes it easier to run MSQ in process types other than Indexer and Peon. Both of these areas will have follow-ups that make similar changes on the worker side. * Address static checks. * Address static checks. * Fixes. * Report writer tests. * Adjustments. * Fix reports. * Review updates. * Adjust name. * Small changes.	2024-04-30 21:30:27 -07:00
Adarsh Sanjeev	9a2d7c28bc	Prepare master branch for 31.0.0 release (#16333 )	2024-04-26 09:22:43 +05:30
Rishabh Singh	e30790e013	Introduce Segment Schema Publishing and Polling for Efficient Datasource Schema Building (#15817 ) Issue: #14989 The initial step in optimizing segment metadata was to centralize the construction of datasource schema in the Coordinator (#14985). Thereafter, we addressed the problem of publishing schema for realtime segments (#15475). Subsequently, our goal is to eliminate the requirement for regularly executing queries to obtain segment schema information. This is the final change which involves publishing segment schema for finalized segments from task and periodically polling them in the Coordinator.	2024-04-24 22:22:53 +05:30
Laksh Singla	b9bbde5c0a	Fix deadlock that can occur while merging group by results (#15420 ) This PR prevents such a deadlock from happening by acquiring the merge buffers in a single place and passing it down to the runner that might need it.	2024-04-22 14:10:44 +05:30

648 Commits