druid

Commit Graph

Author	SHA1	Message	Date
Abhishek Radhakrishnan	c7f1872bd1	Fixup KillUnusedSegmentsTest (#16094 ) Changes: - Use an actual SqlSegmentsMetadataManager instead of TestSqlSegmentsMetadataManager - Simplify TestSegmentsMetadataManager - Add a test for large interval segments.	2024-03-11 13:37:48 +05:30
Kashif Faraz	5f203725dd	Clean up SqlSegmentsMetadataManager and corresponding tests (#16044 ) Changes: Improve `SqlSegmentsMetadataManager` - Break the loop in `populateUsedStatusLastUpdated` before going to sleep if there are no more segments to update - Add comments and clean up logs Refactor `SqlSegmentsMetadataManagerTest` - Merge `SqlSegmentsMetadataManagerEmptyTest` into this test - Add method `testPollEmpty` - Shave a few seconds off of the tests by reducing poll duration - Simplify creation of test segments - Some renames here and there - Remove unused methods - Move `TestDerbyConnector.allowLastUsedFlagToBeNull` to this class Other minor changes - Add javadoc to `NoneShardSpec` - Use lambda in `SqlSegmentMetadataPublisher`	2024-03-08 07:34:51 +05:30
AmatyaAvadhanula	5871b81a78	Fix race in BaseNodeRoleWatcher tests (#16064 ) * Fix race in BaseNodeRoleWatcher tests * Make non static	2024-03-07 13:41:16 -08:00
Laksh Singla	5f588fa45c	Fix bug while materializing scan's result to frames (#15987 ) While converting Sequence<ScanResultValue> to Sequence<Frames>, when maxSubqueryBytes is enabled, we batch the results to prevent creating a single frame per ScanResultValue. Batching requires peeking into the actual value, and checking if the row signature of the scan result’s value matches that of the previous value. Since we can do this indefinitely (in the worst case all of them have the same signature), we keep fetching them and accumulating them in a list (on the heap). We don’t really know how much to batch before we actually write the value as frames. The PR modifies the batching logic to not accumulate the results in an intermediary list	2024-03-07 17:11:44 +05:30
Parth Agrawal	bf39c71d2a	Update protocol for MemcachedCache (#16035 )	2024-03-06 22:28:11 -08:00
AmatyaAvadhanula	c2841425f4	Handle uninitialized cache in Node role watchers (#15726 ) BaseNodeRoleWatcher counts down cacheInitialized after a timeout, but also sets a flag indicating that the initialization timed out, and calls nodeViewInitializationTimedOut (a new method on listeners) instead of nodeViewInitialized. Listeners can then do what is most appropriate with this information.	2024-03-06 16:00:24 +05:30
Gian Merlino	930655ff18	Move retries into DataSegmentPusher implementations. (#15938 ) * Move retries into DataSegmentPusher implementations. The individual implementations know better when they should and should not retry. They can also generate better error messages. The inspiration for this patch was a situation where EntityTooLarge was generated by the S3DataSegmentPusher, and retried uselessly by the retry harness in PartialSegmentMergeTask. * Fix missing var. * Adjust imports. * Tests, comments, style. * Remove unused import.	2024-03-04 10:36:21 -08:00
Sree Charan Manamala	820febf38c	Improved Connection Count server select strategy (#15975 ) Updated the Direct Druid Client so as to make Connection Count Server Selector Strategy work more efficiently. If creating connection to a node is slow, then that slowness wouldn't be accounted for if we count the open connections after sending the request. So we increment the counter and then send the request.	2024-03-04 15:02:32 +05:30
George Shiqi Wu	ef48aceff8	Fix segment/unavailable/count (#16020 )	2024-03-01 15:38:27 -05:00
Sensor	e0bce0ef90	Add pre-check for heavy debug logs (#15706 ) Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com> Co-authored-by: Benedict Jin <asdf2014@apache.org>	2024-02-29 12:58:14 +05:30
Adarsh Sanjeev	d2c2036ea2	Optimize MSQ realtime queries (#15399 ) Currently, while reading results from realtime tasks, requests are sent on a segment level. This is slightly wasteful, as when contacting a data server, it is possible to transfer results for all segments which it is hosting, instead of only one segment at a time. One change this PR makes is to group the segments on the basis of servers. This reduces the number of queries made to data servers. Since we don't have access to the number of rows for realtime segments, the grouping is done with a fixed estimated number of rows for each realtime segment.	2024-02-28 11:32:14 +05:30
Abhishek Radhakrishnan	beccc401e1	Segments created in the same batch have the same `created_date` entry & rename metric (#15977 ) * All segments stored in the same batch have the same created_date entry. In the absence of a group_id column, this metadata would allow us to easily reason about and troubleshoot ingestion-related issues. * Rename metric name and code references to eligibleUnusedSegments. Address review comment from https://github.com/apache/druid/pull/15941#discussion_r1503631992	2024-02-27 17:28:43 +05:30
Abhishek Radhakrishnan	38ecf980d0	Refactor and add tests and metric to KillUnusedSegments duty (auto-kill) (#15941 ) * Kill duty and test improvements. Initial commit with: - Bug fixes - auto-kill can throw NPE when there are no datasources present and defaults mismatch. - Add new stat for candidate segment intervals killed. - Move a couple of debug logs to info logs for improved visibility (should only log once per kill period). - Remove redundant checks for code readability. - Updated tests from using mocks (also the mocks weren't using last updated timestamp) and add more test coverage for different config parameters. - Add a couple of unit tests that are ignored for the eternity case to prove that the kill duty doesn't clean up segments with ALL grain or that end in DateTimes.MAX. - Migrate Druid exception from user to operator persona. * Address review comments. * Remove unused methods. * fix up format specifier and validate bad config tests. * Consolidate the helpers a bit more and add another test. * Update test names. Add javadoc placeholders for slightly involved tests. * Add docs for metric kill/candidateUnusedSegments/count. Also, rename to disambiguate. * Comments. * Apply logging suggestions from code review Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com> * Review comments - Clarify docs on eligibility. - Add test for multiple segments in the same interval. Clarify comment. - Remove log line from test. - Remove lastUpdatedDate = now.plus(10) from test. * minor cleanup. * Clarify javadocs for getUnusedSegmentIntervals(). --------- Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>	2024-02-27 12:14:41 +05:30
Laksh Singla	17e4f3ac60	Refactor GroupBy and TopN code to relax the constraint of dimensions being comparable (#15559 ) The code in the groupBy engine and the topN engine assume that the dimensions are comparable and can call dimA.compareTo(dimB) to sort the dimensions and group them together. This works well for the primitive dimensions, because they are Comparable, however falls apart when the dimensions can be arrays (or in future scenarios complex columns). In cases when the dimensions are not comparable, Druid resorts to having a wrapper type ComparableStringArray and ComparableList, which is a Comparable, based on the list comparator.	2024-02-27 11:39:29 +05:30
AmatyaAvadhanula	e2b7289dea	Try to fetch the task status for an active task from memory (#15724 ) * Reduce metadata calls to fetch the status for an active task	2024-02-26 13:53:05 +05:30
Bartosz Mikulski	45c26e8682	Fix Inspection Check in DirectDruidClientTest (#15857 )	2024-02-07 02:56:26 -08:00
Bartosz Mikulski	43a1c96cd1	Fix query-cancellation-executor memory leak (#15754 ) This PR fixes #15069 by resolving a memory leak caused by a thread leak in query-cancellation-executor.	2024-02-07 10:54:38 +05:30
Pramod Immaneni	59bca0951a	Parallelize storage of incremental segments (#13982 ) During ingestion, incremental segments are created in memory for the different time chunks and persisted to disk when certain thresholds are reached (max number of rows, max memory, incremental persist period etc). In the case where there are a lot of dimensions and metrics (1000+) it was observed that the creation/serialization of the incremental segment file format for persistence and persisting the file took a while and it was blocking ingestion of new data. This affected the real-time ingestion. This serialization and persistence can be parallelized across the different time chunks. This update aims to do that. The patch adds a simple configuration parameter to the ingestion tuning configuration to specify the number of persistence threads. The default value is 1 if it is not specified, which makes it the same as it is today.	2024-02-07 10:43:05 +05:30
Rishabh Singh	de959e513d	Add QueryLifecycle#authorize for grpc-query-extension (#15816 ) Proposal #13469 Original PR #14024 A new method is being added in QueryLifecycle class to authorise a query based on authentication result. This method is required since we authenticate the query by intercepting it in the grpc extension and pass down the authentication result.	2024-02-02 21:49:57 +05:30
Zoltan Haindrich	f701197224	Enable ArrayListRowsAndColumns to StorageAdapter conversion (#15735 )	2024-01-31 02:36:58 -05:00
AmatyaAvadhanula	d9e8448c50	Close open segments when a newer segment with higher version is allocated (#15727 )	2024-01-31 09:11:00 +05:30
zachjsh	ae6afc0751	Extend unused segment metadata api response to include created date and last used updated time (#15738 ) ### Description The unusedSegment api response was extended to include the original DataSegment object with the creation time and last used update time added to it. A new object `DataSegmentPlus` was created for this purpose, and the metadata queries used were updated as needed. example response: ``` [ { "dataSegment": { "dataSource": "inline_data", "interval": "2023-01-02T00:00:00.000Z/2023-01-03T00:00:00.000Z", "version": "2024-01-25T16:06:42.835Z", "loadSpec": { "type": "local", "path": "/Users/zachsherman/projects/opensrc-druid/distribution/target/apache-druid-30.0.0-SNAPSHOT/var/druid/segments/inline_data/2023-01-02T00:00:00.000Z_2023-01-03T00:00:00.000Z/2024-01-25T16:06:42.835Z/0/index/" }, "dimensions": "str_dim,double_measure1,double_measure2", "metrics": "", "shardSpec": { "type": "numbered", "partitionNum": 0, "partitions": 1 }, "binaryVersion": 9, "size": 1402, "identifier": "inline_data_2023-01-02T00:00:00.000Z_2023-01-03T00:00:00.000Z_2024-01-25T16:06:42.835Z" }, "createdDate": "2024-01-25T16:06:44.196Z", "usedStatusLastUpdatedDate": "2024-01-25T16:07:34.909Z" } ] ```	2024-01-26 15:47:40 -05:00
Gian Merlino	01e9d963bd	Merge hydrant runners flatly for realtime queries. (#15757 ) * Merge hydrant runners flatly for realtime queries. Prior to this patch, we have two layers of mergeRunners for realtime queries: one for each Sink (a logical segment) and one across all Sinks. This is done to keep metrics and results grouped by Sink, given that each FireHydrant within a Sink has its own separate storage adapter. However, it costs extra memory usage due to the extra layer of materialization. This is especially pronounced for groupBy queries, which only use their merge buffers at the top layer of merging. The lower layer of merging materializes ResultRows directly into the heap, which can cause heap exhaustion if there are enough ResultRows. This patch changes to a single layer of merging when bySegment: false, just like Historicals. To accommodate that, segment metrics like query/segment/time are now per-FireHydrant instead of per-Sink. Two layers of merging are retained when bySegment: true. This isn't common, because it's typically only used when segment level caching is enabled on the Broker, which is off by default. * Use SinkQueryRunners. * Remove unused method.	2024-01-25 19:07:57 +08:00
Abhishek Agarwal	0ab2781a7f	Disable eager initialization for non-query connection requests (#15751 )	2024-01-25 14:38:50 +05:30
Abhishek Radhakrishnan	06b228ff7c	Return a 503 status code instead of 400 during transient errors (#15756 ) * Fix up HTTP status error code * Keep the unsupported proxy as 400 * Tests and rename	2024-01-24 18:12:24 -08:00
Karan Kumar	c4990f56d6	Prepare main branch for next 30.0.0 release. (#15707 )	2024-01-23 15:55:54 +05:30
Zoltan Haindrich	d6a12c4389	Add ability to enable ResultCache in tests (#15465 )	2024-01-22 09:02:59 -05:00
Abhishek Radhakrishnan	38c1def95a	Kill tasks honor the buffer period of unused segments (#15710 ) * Kill tasks should honor the buffer period of unused segments. - The coordinator duty KillUnusedSegments determines an umbrella interval for each datasource to determine the kill interval. There can be multiple unused segments in an umbrella interval with different used_status_last_updated timestamps. For example, consider an unused segment that is 30 days old and one that is 1 hour old. Currently the kill task after the 30-day mark would kill both the unused segments and not retain the 1-hour old one. - However, when a kill task is instantiated with this umbrella interval, it’d kill all the unused segments regardless of the last updated timestamp. We need kill tasks and RetrieveUnusedSegmentsAction to honor the bufferPeriod to avoid killing unused segments in the kill interval prematurely. * Clarify default behavior in docs. * test comments * fix canDutyRun() * small updates. * checkstyle * forbidden api fix * doc fix, unused import, codeql scan error, and cleanup logs. * Address review comments * Rename maxUsedFlagLastUpdatedTime to maxUsedStatusLastUpdatedTime This is consistent with the column name `used_status_last_updated`. * Apply suggestions from code review Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com> * Make period Duration type * Remove older variants of runKillTask() in OverlordClient interface * Test can now run without waiting for canDutyRun(). * Remove previous variants of retrieveUnusedSegments from internal metadata storage coordinator interface. Removes the following interface methods in favor of a new method added: - retrieveUnusedSegmentsForInterval(String, Interval) - retrieveUnusedSegmentsForInterval(String, Interval, Integer) * Chain stream operations * cleanup * Pass in the lastUpdatedTime to markUnused test function and remove sleep. --------- Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>	2024-01-18 22:23:50 -08:00
AmatyaAvadhanula	a26defd64b	Clean up stale entries from upgradeSegments table (#15637 ) * Clean up stale entries from upgradeSegments table	2024-01-17 20:49:52 +05:30
AmatyaAvadhanula	67720b60ae	Skip compaction for intervals without data (#15676 ) * Skip compaction for intervals with a single tombstone and no data	2024-01-16 12:31:36 +05:30
Sam Rash	072b16c6df	Fix SQL Interval.of() error message (#15454 ) Better error message for poorly constructed intervals	2024-01-15 22:34:35 -06:00
Kashif Faraz	18d2a8957f	Refactor: Cleanup test impls of ServiceEmitter (#15683 )	2024-01-15 17:37:00 +05:30
Gian Merlino	cccf13ea82	Reverse, pull up lookups in the SQL planner. (#15626 ) * Reverse, pull up lookups in the SQL planner. Adds two new rules: 1) ReverseLookupRule, which eliminates calls to LOOKUP by doing reverse lookups. 2) AggregatePullUpLookupRule, which pulls up calls to LOOKUP above GROUP BY, when the lookup is injective. Adds configs `sqlReverseLookup` and `sqlPullUpLookup` to control whether these rules fire. Both are enabled by default. To minimize the chance of performance problems due to many keys mapping to the same value, ReverseLookupRule refrains from reversing a lookup if there are more keys than `inSubQueryThreshold`. The rationale for using this setting is that reversal works by generating an IN, and the `inSubQueryThreshold` describes the largest IN the user wants the planner to create. * Add additional line. * Style. * Remove commented-out lines. * Fix tests. * Add test. * Fix doc link. * Fix docs. * Add one more test. * Fix tests. * Logic, test updates. * - Make FilterDecomposeConcatRule more flexible. - Make CalciteRulesManager apply reduction rules til fixpoint. * Additional tests, simplify code.	2024-01-12 00:06:31 -08:00
Kashif Faraz	f445ba4d6b	Audit API DELETE datasource (markAllSegmentsAsUnused) (#15653 ) Changes: - Add audit for `DELETE /druid/coordinator/v1/datasources/{datasourceName}` - Minor refactor	2024-01-11 09:43:32 +05:30
PANKAJ KUMAR	047c7340ab	Adding retries to update the metadata store instead of failure (#15141 ) Currently, if 2 tasks consuming from the same partitions try to publish the segment and update the metadata, the second task can fail because the end offset stored in the metadata store doesn't match the start offset of the second task. We can fix this by retrying instead of failing. AFAIK apart from the above issue, the metadata mismatch can happen in 2 scenarios: - when we update the input topic name for the data source - when we run 2 replicas of ingestion tasks (1 replica will publish and 1 will fail as the first replica has already updated the metadata). Implemented the comparable function to compare the last committed end offset and new Sequence start offset, and return a specific error msg for this. Add retry logic on indexers to retry for this specific error msg. Updated the existing test case.	2024-01-10 12:30:54 +05:30
Rishabh Singh	71f5307277	Eliminate Periodic Realtime Segment Metadata Queries: Task Now Publish Schema for Seamless Coordinator Updates (#15475 ) The initial step in optimizing segment metadata was to centralize the construction of datasource schema in the Coordinator (#14985). Subsequently, our goal is to eliminate the requirement for regularly executing queries to obtain segment schema information. This task encompasses addressing both realtime and finalized segments. This modification specifically addresses the issue with realtime segments. Tasks will now routinely communicate the schema for realtime segments during the segment announcement process. The Coordinator will identify the schema alongside the segment announcement and subsequently update the schema for realtime segments in the metadata cache.	2024-01-10 08:55:56 +05:30
Pranav	747d973752	Skip waiting for first lookup version to get initialized (#15598 )	2024-01-09 13:18:39 -08:00
Abhishek Agarwal	468b99e608	Enable query request queuing by default when total laning is turned on. (#15440 ) This PR enables the flag by default to queue excess query requests in the jetty queue. Still keeping the flag so that it can be turned off if necessary. But the flag will be removed in the future.	2024-01-09 07:54:26 +05:30
Jonathan Wei	5d1e66b8f9	Allow broker to use catalog for datasource schemas for SQL queries (#15469 ) * Allow broker to use catalog for datasource schemas * More PR comments * PR comments	2024-01-08 13:46:08 -06:00
AmatyaAvadhanula	63bfb3e6c9	Handle half-eternity intervals while fetching segments with created dates (#15608 ) * Handle all intervals while fetching segments with created dates	2024-01-08 12:07:11 +05:30
George Shiqi Wu	9fe67958be	Increase ServerConnector accept queue size (#15596 ) * Allow overwriting ServerConnector accept queue size * Use a single config * Fix spacing * fix spacing * fixed value * read value from environment * fix spacing * Unpack value before reading * check somaxconn on linux only	2024-01-05 12:04:15 -05:00
Gian Merlino	e40b96e026	Reverse lookup fixes and enhancements. (#15611 ) * Reverse lookup fixes and enhancements. 1) Add a "mayIncludeUnknown" parameter to DimFilter#optimize. This is important because otherwise the reverse-lookup optimization is done improperly when the "in" filter appears under a "not", and the lookup extractionFn may return null for some possible values of the filtered column. The "includeUnknown" test cases in InDimFilterTest illustrate the difference in behavior. 2) Enhance InDimFilter#optimizeLookup to handle "mayIncludeUnknown", and to be able to do a reverse lookup in a wider variety of cases. 3) Make "unapply" protected in LookupExtractor, and move callers to "unapplyAll". The main reason is that MapLookupExtractor, a common implementation, lacks a reverse mapping and therefore does a scan of the map for each call to "unapply". For performance sake these calls need to be batched. * Remove optimize call from BloomDimFilter. * Follow the law. * Fix tests. * Fix imports. * Switch function. * Fix tests. * More tests.	2024-01-03 13:28:44 -08:00
Abhishek Radhakrishnan	f0f428274a	Prometheus config property doc fixup (#15613 ) * Minor fixes * Update docs/development/extensions-contrib/prometheus.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> --------- Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2024-01-02 16:28:42 -08:00
kaisun2000	a5e9b14be0	Add delay before the peon drops the segments after publishing them (#15373 ) Currently in the realtime ingestion (Kafka/Kinesis) case, after publishing the segments, upon acknowledgement from the coordinator that the segments are already placed in some historicals, the peon would unannounce the segments (basically telling the whole cluster that the segments are not in this peon anymore) and drop the segments from cache and sink timeline in one shot. In-transit queries from brokers that still think the segments are in the peon can get a NullPointerException when the peon is unsetting the hydrants in the sinks. The fix lets the peon wait for a configurable delay period after it unannounces the segments before dropping them, removing them from cache, etc. This delayed approach is similar to how the historicals handle segments moving out.	2024-01-02 11:08:28 +05:30
Kashif Faraz	9f568858ef	Add logging implementation for AuditManager and audit more endpoints (#15480 ) Changes - Add `log` implementation for `AuditManager` along with `SQLAuditManager` - `LoggingAuditManager` simply logs the audit event. Thus, it returns empty for all `fetchAuditHistory` calls. - Add new config `druid.audit.manager.type` which can take values `log`, `sql` (default) - Add new config `druid.audit.manager.logLevel` which can take values `DEBUG`, `INFO`, `WARN`. This gets activated only if `type` is `log`. - Remove usage of `ConfigSerde` from `AuditManager` as audit is not just limited to configs - Add `AuditSerdeHelper` for a single implementation of serialization/deserialization of audit payload and other utility methods.	2023-12-19 13:14:04 +05:30
Kashif Faraz	feeb4f0fb0	Allocate pending segments at latest committed version (#15459 ) The segment allocation algorithm reuses an already allocated pending segment if the new allocation request is made for the same parameters: - datasource - sequence name - same interval - same value of skipSegmentLineageCheck (false for batch append, true for streaming append) - same previous segment id (used only when skipSegmentLineageCheck = false) The above parameters can thus uniquely identify a pending segment (enforced by the UNIQUE constraint on the sequence_name_prev_id_sha1 column in druid_pendingSegments metadata table). This reuse is done in order to - allow replica tasks (in case of streaming ingestion) to use the same set of segment IDs. - allow re-run of a failed batch task to use the same segment ID and prevent unnecessary allocations	2023-12-14 16:18:39 +05:30
AmatyaAvadhanula	91ca8e73d6	Skip compaction for datasources with partial-eternity segments (#15542 ) This PR builds on #13304 to skip compaction for datasources with segments that have their interval start or end coinciding with Eternity interval end-points. This is needed in order to prevent an issue similar to #13208 as the Coordinator tries to iterate over a large number of intervals when trying to compact an interval with infinite start or end.	2023-12-12 15:06:45 +05:30
zachjsh	ab7d9bc6ec	Add api for Retrieving unused segments (#15415 ) ### Description This PR adds an API for retrieving unused segments for a particular datasource. The API supports pagination by the addition of `limit` and `lastSegmentId` parameters. The resulting unused segments are returned with optional `sortOrder`, `ASC` or `DESC` with respect to the matching segments `id`, `start time`, and `end time`, or not returned in any guaranteed order if `sortOrder` is not specified. `GET /druid/coordinator/v1/datasources/{dataSourceName}/unusedSegments?interval={interval}&limit={limit}&lastSegmentId={lastSegmentId}&sortOrder={sortOrder}` Returns a list of unused segments for a datasource in the cluster contained within an optionally specified interval. Optional parameters for limit and lastSegmentId can be given as well, to limit results and enable paginated results. The results may be sorted in either ASC or DESC order depending on the sortOrder parameter. `dataSourceName`: The name of the datasource `interval`: the specific interval to search for unused segments for. `limit`: the maximum number of unused segments to return information about. This property helps to support pagination `lastSegmentId`: the last segment id from which to search for results. All segments returned are > this segment lexicographically if sortOrder is null or ASC, or < this segment lexicographically if sortOrder is DESC. `sortOrder`: Specifies the order with which to return the matching segments by start time, end time. A null value indicates that order does not matter. This PR has: - [x] been self-reviewed. - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.) - [x] added documentation for new or modified features or behaviors. - [ ] a release note entry in the PR description. - [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links. - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md) - [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader. - [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met. - [ ] added integration tests. - [x] been tested in a test Druid cluster.	2023-12-11 16:32:18 -05:00
Abhishek Radhakrishnan	96be82a3e6	Clean up duty for non-overlapping eternity tombstones (#15281 ) * Add initial draft of MarkDanglingTombstonesAsUnused duty. * Use overshadowed segments instead of all used segments. * Add unit test for MarkDanglingSegmentsAsUnused duty. * Add mock call * Simplify code. * Docs * shorter lines formatting * metric doc * More tests, refactor and fix up some logic. * update javadocs; other review comments. * Make numCorePartitions as 0 in the TombstoneShardSpec. * fix up test * Add tombstone core partition tests * Update docs/design/coordinator.md Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com> * review comment * Minor cleanup * Only consider tombstones with 0 core partitions * Need to register the test shard type to make jackson happy * test comments * checkstyle * fixup misc typos in comments * Update logic to use overshadowed segments * minor cleanup * Rename duty to eternity tombstone instead of dangling. Add test for full eternity tombstone. * Address review feedback. --------- Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>	2023-12-11 08:57:15 -08:00
Clint Wylie	557f3f6f57	add array column type support to EXTEND operator (#15458 )	2023-12-06 23:21:35 -08:00
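
The unused-segments endpoint quoted in commit ab7d9bc6ec (#15415) takes its optional `interval`, `limit`, `lastSegmentId` and `sortOrder` values as query parameters. The following is a minimal sketch of how a client might assemble that request URL. The helper name `unused_segments_url`, the host and port in the usage note, and the choice to percent-encode the interval are assumptions for illustration and are not taken from the commit.

```python
from urllib.parse import quote, urlencode


def unused_segments_url(base_url, datasource, interval=None, limit=None,
                        last_segment_id=None, sort_order=None):
    """Build the GET URL for the unusedSegments endpoint described in #15415.

    Hypothetical helper: only the path and parameter names come from the
    commit message; everything else is an illustrative assumption.
    """
    if sort_order is not None and sort_order not in ("ASC", "DESC"):
        raise ValueError("sortOrder must be ASC or DESC")
    params = {
        "interval": interval,
        "limit": limit,
        "lastSegmentId": last_segment_id,
        "sortOrder": sort_order,
    }
    # Omit optional parameters that were not supplied.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    path = "/druid/coordinator/v1/datasources/{}/unusedSegments".format(
        quote(datasource, safe=""))
    url = base_url.rstrip("/") + path
    return url + "?" + query if query else url
```

To page through results, a caller could request a first page with `unused_segments_url("http://localhost:8081", "inline_data", limit=100, sort_order="ASC")` and then pass the identifier of the last segment received as `last_segment_id` on the next call. The host and port here are placeholders.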

4212 Commits