Commit Graph

4311 Commits

Kashif Faraz 15d27f340d
Fix fetch of task location in SpecificTaskServiceLocator (#16462)
* Fix fetch of task location in SpecificTaskServiceLocator

* Resolve future if exception occurs while invoking API

* Remove unused import
2024-05-20 12:35:04 +05:30
George Shiqi Wu ed9881df88
Cleanup logic from handoff API (#16457)
* Cleanup logic from handoff API

* Fix test

* Fix checkstyle

* Update docs
2024-05-16 08:42:44 -07:00
Akshat Jain ddfd62d9a9
Disable loading lookups by default in CompactionTask (#16420)
This PR updates CompactionTask to not load any lookups by default, unless transformSpec is present.

If transformSpec is present, we will make the decision based on context values, loading all lookups by default. This is done to ensure backward compatibility since transformSpec can reference lookups.
If transformSpec is not present and no context value is passed, we do not load any lookups.

This behavior can be overridden by supplying lookupLoadingMode and lookupsToLoad in the task context.
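
To make the override concrete, here is a hypothetical task context in Java form; the two keys come from the description above, the mode value matches the LookupLoadingSpec modes introduced in a commit further down this log, and the lookup name is invented for illustration:

```
import java.util.List;
import java.util.Map;

// Hypothetical compaction task context: load only the lookups the
// transformSpec actually references ("country_lookup" is a made-up name).
Map<String, Object> compactionContext()
{
  return Map.of(
      "lookupLoadingMode", "ONLY_REQUIRED",
      "lookupsToLoad", List.of("country_lookup")
  );
}
```
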
2024-05-15 11:39:23 +05:30
kaisun2000 91cd07d892
Add logging to reveal reason to persist the hydrants (#16409) 2024-05-15 08:39:29 +05:30
George Shiqi Wu c1bf4fed90
API for stopping streaming tasks early (#16310)
* Try stopping task early

* Fix checkstyle

* Add unit test

* Add a couple more tests

* PR changes

* Use notice

* fix checkstyle

* PR changes

* Update indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisor.java

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Update indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisor.java

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Change payload

* Remove quotes

---------

Co-authored-by: Suneet Saldanha <suneet@apache.org>
2024-05-14 06:39:50 -07:00
Igor Berman d0f3fdab37
Allow using different lock types for kill task, remove markAsUnused parameter (#16362)
Changes:
- Remove deprecated `markAsUnused` parameter from `KillUnusedSegmentsTask`
- Allow `kill` task to use `REPLACE` lock when `useConcurrentLocks` is true
- Use `EXCLUSIVE` lock by default (see the sketch below)
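
A minimal sketch of that selection rule; the `TaskLockType` constants are named in the changes above, while the surrounding method is an assumption:

```
// Pick the lock type for a kill task: REPLACE when concurrent locks are
// enabled, EXCLUSIVE otherwise.
TaskLockType getKillLockType(boolean useConcurrentLocks)
{
  return useConcurrentLocks ? TaskLockType.REPLACE : TaskLockType.EXCLUSIVE;
}
```
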
2024-05-10 06:37:36 +05:30
Rishabh Singh a6ebb963c7
Fix NPE in SegmentSchemaCache (#16404)
Verify that schema backfill count metric is emitted for each datasource.
    Fix potential NPE in SegmentSchemaCache#markMetadataQueryResultPublished.
2024-05-09 11:13:53 +05:30
Alberic Liu 92fb0ff718
upgrade mysql:mysql-connector-java to 8.2.0 (#16024)
* upgrade mysql:mysql-connector-java to 8.2.0

* fix the check errors

* remove unused comment
2024-05-06 21:58:37 +08:00
Abhishek Radhakrishnan 2a638d77d9
Remove stale references to coordinator dynamic config killAllDataSources. (#16387)
This parameter has been removed for a while now, as of Druid 0.23.0
https://github.com/apache/druid/pull/12187.

The code was only used in tests to verify that serialization works.
Now remove all references to avoid any confusion.
2024-05-05 08:48:56 +05:30
zachjsh fb7c84fb5d
Catalog clustering keys fixes (#16351)
* * add another catalog clustering columns unit test

* * disallow clusterKeys with descending order

* * make more clear that clustering is re-written into ingest node
whether a catalog table or not

* * when partitionedBy is stored in catalog, user shouldn't need to specify
it in order to specify clustering

* * fix intellij inspection failure
2024-05-03 14:02:56 -04:00
Rishabh Singh c61c3785a0
Followup changes to 15817 (Segment schema publishing and polling) (#16368)
* Fix build

* Nit changes in KillUnreferencedSegmentSchema

* Replace reference to the abbreviation SMQ with Metadata Query, rename inTransit maps in schema cache

* nitpicks

* Remove reference to smq abbreviation from integration-tests

* Remove reference to smq abbreviation from integration-tests

* minor change

* Update index.md

* Add delimiter while computing schema fingerprint hash
2024-05-03 19:13:52 +05:30
AmatyaAvadhanula 5fae20d287
Do not allocate ids conflicting with existing segment ids (#16380)
* Do not allocate ids conflicting with existing segment ids

* Parameterized tests

* Add doc and retain test for coverage
2024-05-03 19:09:48 +05:30
Kashif Faraz e5b40b0b8c
Miscellaneous cleanup of load queue references (#16367)
Changes:
- Rename `DataSegmentChangeRequestAndStatus` to `DataSegmentChangeResponse`
- Rename `SegmentLoadDropHandler.Status` to `SegmentChangeStatus`
- Remove method `CoordinatorRunStats.getSnapshotAndReset()` as it was used only in
load queue peon implementations. Using an atomic reference is much simpler.
- Remove `ServerTestHelper.MAPPER`. Use existing `TestHelper.makeJsonMapper()` instead.
2024-05-02 15:59:50 +05:30
Gian Merlino 5d1950d451
MSQ controller: Support in-memory shuffles; towards JVM reuse. (#16168)
* MSQ controller: Support in-memory shuffles; towards JVM reuse.

This patch contains two controller changes that make progress towards a
lower-latency MSQ.

First, support for in-memory shuffles. The main feature of in-memory shuffles,
as far as the controller is concerned, is that they are not fully buffered. That
means that whenever a producer stage uses in-memory output, its consumer must run
concurrently. The controller determines which stages run concurrently, and when
they start and stop.

"Leapfrogging" allows any chain of sort-based stages to use in-memory shuffles
even if we can only run two stages at once. For example, in a linear chain of
stages 0 -> 1 -> 2 where all do sort-based shuffles, we can use in-memory shuffling
for each one while only running two at once. (When stage 1 is done reading input
and about to start writing its output, we can stop 0 and start 2.)

1) New OutputChannelMode enum attached to WorkOrders that tells workers
   whether stage output should be in memory (MEMORY), or use local or durable
   storage (see the sketch after this list).

2) New logic in the ControllerQueryKernel to determine which stages can use
   in-memory shuffling (ControllerUtils#computeStageGroups) and to launch them
   at the appropriate time (ControllerQueryKernel#createNewKernels).

3) New "doneReadingInput" method on Controller (passed down to the stage kernels)
   which allows stages to transition to POST_READING even if they are not
   gathering statistics. This is important because it enables "leapfrogging"
   for HASH_LOCAL_SORT shuffles, and for GLOBAL_SORT shuffles with 1 partition.

4) Moved result-reading from ControllerContext#writeReports to new QueryListener
   interface, which ControllerImpl feeds results to row-by-row while the query
   is still running. Important so we can read query results from the final
   stage using an in-memory channel.

5) New class ControllerQueryKernelConfig holds configs that control kernel
   behavior (such as whether to pipeline, maximum number of concurrent stages,
   etc). Generated by the ControllerContext.
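
A rough sketch of what such an enum could look like, based only on the description in (1); the storage constant names are assumptions:

```
// Attached to a WorkOrder to tell workers where stage output should go.
enum OutputChannelMode
{
  MEMORY,          // in-memory, not fully buffered; consumer runs concurrently
  LOCAL_STORAGE,   // buffered on worker-local disk (name assumed)
  DURABLE_STORAGE  // buffered in durable storage (name assumed)
}
```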

Second, a refactor towards running workers in persistent JVMs that are able to
cache data across queries. This is helpful because I believe we'll want to reuse
JVMs and cached data for latency reasons.

1) Move creation of WorkerManager and TableInputSpecSlicer to the
   ControllerContext, rather than ControllerImpl. This allows managing workers and
   work assignment differently when JVMs are reusable.

2) Lift the Controller Jersey resource out from ControllerChatHandler to a
   reusable resource.

3) Move memory introspection to a MemoryIntrospector interface, and introduce
   ControllerMemoryParameters that uses it. This makes it easier to run MSQ in
   process types other than Indexer and Peon.

Both of these areas will have follow-ups that make similar changes on the
worker side.

* Address static checks.

* Address static checks.

* Fixes.

* Report writer tests.

* Adjustments.

* Fix reports.

* Review updates.

* Adjust name.

* Small changes.
2024-04-30 21:30:27 -07:00
AmatyaAvadhanula 42e99bf912
Add new index on datasource and task_allocator_id for pending segments (#16355)
* Add pending segments index on datasource and task_allocator_id

* Use both datasource and task_allocator_id in queries
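
Hypothetically, the DDL behind such an index could look like the sketch below; the index name and exact column names are assumptions based on the commit message:

```
import java.sql.Connection;
import java.sql.Statement;

// Create a composite index covering the two columns used by the queries above.
void createPendingSegmentsIndex(Connection connection) throws Exception
{
  try (Statement stmt = connection.createStatement()) {
    stmt.execute(
        "CREATE INDEX idx_pendingsegments_datasource_allocator "
        + "ON druid_pendingSegments (dataSource, task_allocator_id)"
    );
  }
}
```
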
2024-04-30 15:48:16 +05:30
Laksh Singla e695e52d3f
Improve code flow in the First/Last vector aggregators and unify the numeric aggregators with the String implementations (#16230)
This PR fixes the first and last vector aggregators and improves their readability. The following changes are introduced:

    The folding is broken in the vectorized versions. We consider time before checking the folded object.

    If the numerical aggregator gets passed any other object type for some other reason (like String), then the aggregator considers it to be folded, even though it shouldn’t be. We should convert these objects to the desired type, and aggregate them properly.

    The aggregators must properly use generics. This would minimize the ClassCastException issues that can happen with mixed segment types. We are unifying the string first/last aggregators with numeric versions as well.

    The aggregators must aggregate null values (https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/query/aggregation/first/StringFirstLastUtils.java#L55-L56). The aggregator should only ignore pairs with time == null, and not value == null (see the sketch after these notes).

    Time nullity is ignored when trying to vectorize the data.

    String versions are initialized with DateTimes.MIN, which is equal to Long.MIN / 2. This can cause incorrect results in case the user supplies a custom time column. NOTE: This is still present because it would require a larger refactor in all of the versions.

    There is a difference in what users might expect from the results because the code flow is changed (for example, the direction of the for loops, etc.); however, this only changes the results, not the contract set by first/last aggregators, which is that if multiple values have the same timestamp, any of them can get picked.

    If the column is non-existent, users might expect a change in the timestamp from DateTime.MAX to Long.MAX, because the code incorrectly used DateTime.MAX to initialize the aggregator; however, in case of a custom timestamp column, this might not be the case. The SQL query might be prohibited from using any Long since it requires a cast to the timestamp function that can fail, but AFAICT native queries don't have such limitations.
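
A simplified sketch of the null-handling rule noted above, for a "first"-style aggregator; the field and method names are illustrative, not Druid's actual aggregator API:

```
Long firstTime;
Object firstValue;

// Aggregate a (time, value) pair: skip only when time is null.
// A null value paired with a valid time still participates and may win.
void aggregate(Long time, Object value)
{
  if (time == null) {
    return; // no timestamp to compare against
  }
  if (firstTime == null || time < firstTime) {
    firstTime = time;
    firstValue = value; // may legitimately be null
  }
}
```
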
2024-04-30 15:13:14 +05:30
Kashif Faraz aa46314971
Remove usage of skife from DruidCoordinatorConfig (#15705)
* Remove usage of skife from DruidCoordinatorConfig

* Remove old config class

* Address static checks

* Fix tests

* Remove unnecessary mocks

* Fix config typos

* Fix config condition

* Fix test, spotbug check

* Move validation to DruidCoordinatorConfig

* Move DruidCoordinatorConfig to different package

* Fix validation of killunusedconfig

* Simplify and fix KillSupervisorsCustomDuty

* Address review comments

* Fix new tests

* Add KillUnusedSchemasConfig

* Remove KillUnusedSchemasConfig

* Minor renames
2024-04-29 11:37:13 -07:00
Adithya Chakilam f8015eb02a
Add config lagAggregate to LagBasedAutoScalerConfig (#16334)
Changes:
- Add new config `lagAggregate` to `LagBasedAutoScalerConfig`
- Add field `aggregateForScaling` to `LagStats`
- Use the new field/config to determine which aggregate to use to compute lag (sketched below)
- Remove method `Supervisor.computeLagForAutoScaler()`
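
A sketch of how the new config might drive that choice; the aggregate names and the accessors on `LagStats` are assumptions:

```
// Choose which lag aggregate the autoscaler should act on.
long computeLagForScaling(String lagAggregate, LagStats lagStats)
{
  switch (lagAggregate) {
    case "MAX":
      return lagStats.getMaxLag();   // accessor names assumed
    case "TOTAL":
      return lagStats.getTotalLag();
    default:
      return lagStats.getAvgLag();
  }
}
```
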
2024-04-29 22:20:41 +05:30
Akshat Jain 9d2cae40c3
Add support for selective loading of lookups in the task layer (#16328)
Changes:
- Add `LookupLoadingSpec` to support 3 modes of lookup loading: ALL, NONE, ONLY_REQUIRED (see the sketch below)
- Add method `Task.getLookupLoadingSpec()`
- Do not load any lookups for `KillUnusedSegmentsTask`
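
A minimal sketch of the three modes; the mode names come from the change list, while the enum shape is assumed:

```
// Lookup-loading modes a task can declare via its LookupLoadingSpec.
public enum LookupLoadingMode
{
  ALL,           // load every registered lookup (the old behavior)
  NONE,          // load no lookups, e.g. for KillUnusedSegmentsTask
  ONLY_REQUIRED  // load only an explicitly listed set of lookups
}
```
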
2024-04-29 07:19:59 +05:30
Andreas Maechler 9cd1890855
Fix log count (#16341) 2024-04-26 14:04:19 -07:00
Adarsh Sanjeev 9a2d7c28bc
Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
AmatyaAvadhanula 31eee7d51e
Check for handoff of upgraded segments (#16162)
Changes:
1) Check for handoff of upgraded realtime segments.
2) Drop sink only when all associated realtime segments have been abandoned.
3) Delete pending segments upon commit to prevent unnecessary upgrades and
partition space exhaustion when a concurrent replace happens. This also prevents
potential data duplication.
4) Register pending segment upgrade only on those tasks to which the segment is associated.
2024-04-25 22:03:38 +05:30
Rishabh Singh e30790e013
Introduce Segment Schema Publishing and Polling for Efficient Datasource Schema Building (#15817)
Issue: #14989

The initial step in optimizing segment metadata was to centralize the construction of datasource schema in the Coordinator (#14985). Thereafter, we addressed the problem of publishing schema for realtime segments (#15475). Subsequently, our goal is to eliminate the requirement for regularly executing queries to obtain segment schema information.

This is the final change which involves publishing segment schema for finalized segments from task and periodically polling them in the Coordinator.
2024-04-24 22:22:53 +05:30
Laksh Singla b9bbde5c0a
Fix deadlock that can occur while merging group by results (#15420)
This PR prevents such a deadlock from happening by acquiring the merge buffers in a single place and passing it down to the runner that might need it.
2024-04-22 14:10:44 +05:30
Adithya Chakilam cff5d1e369
Add method Supervisor.computeLagForAutoScaler (#16314)
Tries to address the comments made on #16284 after it was merged.

Changes:
- Remove method `Supervisor.getLagMetric()`
- Add method `Supervisor.computeLagForAutoScaler()`
- Remove classes `LagMetric` and `LagMetricTest`
2024-04-20 07:57:50 +05:30
zachjsh 3f2dd46ede
Catalog table should not need explicit segment granularity set (#16278)
* * fix

* * fix

* * address review comments

* * fix

* * simplify tests

* * fix complex type nullability issue

* * fix and update test

* * address review comments

* * address test review comments

* * fix checkstyle

* * fix checkstyle

* * fix failing test
2024-04-17 11:46:24 -04:00
Adithya Chakilam 34237bc112
Consider max lag for kinesis while autoscaling (#16284)
* Consider max lag for kinesis while autoscaling

* add test for coverage

* test folder
2024-04-17 15:05:05 +05:30
aho135 4fa377c7fd
Improve logging for lookups (#16287) 2024-04-17 10:20:09 +05:30
AmatyaAvadhanula f3d69f30e6
Associate pending segments with the tasks that requested them (#16144)
Changes:
- Add column `task_allocator_id` to `pendingSegments` metadata table.
- Add column `upgraded_from_segment_id` to `pendingSegments` metadata table.
- Add interface `PendingSegmentAllocatingTask` and implement it by all tasks which
can allocate pending segments.
- Use `taskAllocatorId` to identify the task (and its sub-tasks or replicas) to which
a pending segment has been allocated.
- Perform active cleanup of pending segments in `TaskLockbox` once there are no
active tasks for the corresponding task allocator id.
- When committing APPEND segments, also commit all upgraded pending segments
corresponding to that task allocator id.
- When committing REPLACE segments, upgrade all overlapping pending segments in
the same transaction.
2024-04-17 09:06:31 +05:30
AmatyaAvadhanula ad6bd62140
Handle task location fetch from overlord during rolling upgrades (#16227)
Bug:
#15724 introduced a bug where a rolling upgrade would cause all task locations
returned by the Overlord on an older version to be unknown.

Fix:
If the new API fails, fall back to single task status API which always returns a valid task location.
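
A hedged sketch of that fallback; the client and method names here are assumptions, not the actual Overlord client API:

```
// During a rolling upgrade, the batch status API may not exist on an older
// Overlord, so fall back to the per-task status API, which always returns
// a valid task location.
TaskLocation fetchTaskLocation(String taskId)
{
  try {
    return fetchLocationFromBatchStatusApi(taskId); // new API; may fail on old Overlord
  }
  catch (Exception e) {
    return fetchLocationFromSingleStatusApi(taskId); // older, always-available API
  }
}
```
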
2024-04-16 21:01:37 +05:30
Kashif Faraz 81d7b6ebe1
Fix OverlordClient to read reports as a concrete `ReportMap` (#16226)
Follow up to #16217 

Changes:
- Update `OverlordClient.getReportAsMap()` to return `TaskReport.ReportMap`
- Move the following classes to `org.apache.druid.indexer.report` in the `druid-processing` module
  - `TaskReport`
  - `KillTaskReport`
  - `IngestionStatsAndErrorsTaskReport`
  - `TaskContextReport`
  - `TaskReportFileWriter`
  - `SingleFileTaskReportFileWriter`
  - `TaskReportSerdeTest`
- Remove `MsqOverlordResourceTestClient` as it had only one method
which is already present in `OverlordResourceTestClient` itself
2024-04-15 08:00:59 +05:30
Abhishek Radhakrishnan 041d0bff5e
Set default `KillUnusedSegments` duty to coordinator's indexing period & `killTaskSlotRatio` to 0.1 (#16247)
The default value for druid.coordinator.kill.period (if unspecified) has changed from P1D to the value of druid.coordinator.period.indexingPeriod. Operators can choose to override druid.coordinator.kill.period and that will take precedence over the default behavior.
The default value for the coordinator dynamic config killTaskSlotRatio is updated from 1.0 to 0.1. This ensures that kill tasks take up only 1 task slot right out-of-the-box instead of taking up all the task slots.
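
To illustrate the ratio, a small sketch of how available kill-task slots could be derived from total capacity; the floor of one slot is an assumption consistent with "take up only 1 task slot right out-of-the-box":

```
// With killTaskSlotsRatio = 0.1 and 40 total worker slots, kill tasks
// get at most 4 slots; a floor of 1 keeps kill tasks runnable on tiny clusters.
int availableKillTaskSlots(double killTaskSlotsRatio, int totalWorkerCapacity)
{
  return Math.max(1, (int) (totalWorkerCapacity * killTaskSlotsRatio));
}
```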

* Remove stale comment and inline canDutyRun()

* druid.coordinator.kill.period defaults to druid.coordinator.period.indexingPeriod if not set.

- Remove the default P1D value for druid.coordinator.kill.period. Instead default
  druid.coordinator.kill.period to whatever value druid.coordinator.period.indexingPeriod is set
  to if the former config isn't specified.
- If druid.coordinator.kill.period is set, the value will take precedence over
  druid.coordinator.period.indexingPeriod

* Update server/src/test/java/org/apache/druid/server/coordinator/DruidCoordinatorConfigTest.java

* Fix checkstyle error

* Clarify comment

* Update server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinatorConfig.java

* Put back canDutyRun()

* Default killTaskSlotsRatio to 0.1 instead of 1.0 (all slots)

* Fix typo DEFAULT_MAX_COMPACTION_TASK_SLOTS

* Remove unused test method.

* Update default value of killTaskSlotsRatio in docs and web-console default mock

* Move initDuty() after params and config setup.
2024-04-14 18:56:17 -07:00
Abhishek Radhakrishnan 75fb57ed6e
Update error messages when supervisor's checkpoint state is invalid (#16208)
* Update error message when the topic changes.

Suggest resetting the supervisor when the topic changes instead of changing
the supervisor name, which actually creates a new supervisor.

* Update server/src/main/java/org/apache/druid/metadata/IndexerSQLMetadataStorageCoordinator.java

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Cleanup

* Remove log and include oldCommitMetadataFromDb

* Fix test

---------

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2024-04-03 10:34:17 -07:00
Abhishek Radhakrishnan cf9a3bdc14
Fix up error handling in unusedSegments API. (#16206)
Changes:
- Handle exceptions in the API and map them to a `Response` object with the appropriate error code.
- Replace `AuthorizationUtils.filterAuthorizedResources()` with `DatasourceResourceFilter`.
The endpoint is annotated consistent with other usages.
- Update `DatasourceResourceFilter` to remove the lambda and update javadocs.
The usages information is self-evident with an IDE.
- Adjust the invalid interval exception message.
- Break up the large unit test `testGetUnusedSegmentsInDataSource()` into smaller unit tests
for each test case. Also, validate the error codes.
2024-03-27 12:31:21 +05:30
Abhishek Radhakrishnan 95595ba4f5
Fix handling an empty list of versions (#16198)
* Differentiate null and empty lists of segment IDs and versions.

Treat them differently: segment IDs and versions can be an empty list,
in which case the queries should just not return anything. Versions are optional, so
they can be null, which just indicates nothing, so the queries should return segments with
all possible versions. Segment IDs cannot be null, as indicated by the absence of the @Nullable
annotation.

* Update javadocs and add empty versions test to kill task.

* Add test for RetrieveSegmentsActions as well.
2024-03-25 17:51:24 -07:00
Abhishek Radhakrishnan a70e28a3c2
Parameterize segment IDs (#16174)
* Add parameterized segment IDs.

* Refactor into one common method.

* Refactor getConditionForIntervalsAndMatchMode - pass in only what's needed.

* Minor cleanup.
2024-03-22 08:20:59 -07:00
Arun Ramani c72e69a8c8
MetricsModule: inject DataSourceTaskIdHolder early (#16140)
* Explicitly bind ServiceStatusMonitor

* Correct fix
2024-03-21 16:14:41 -07:00
Kashif Faraz 352902156a
Fix mark segment unused when overshadowed by zero replica segment (#16181)
Bug:
In the `MarkOvershadowedSegmentsAsUnused` duty, the coordinator marks a segment
as unused if it is overshadowed by a segment currently being served by a historical or broker.
But it is possible to have segments that are eligible for a load rule but require zero replicas
to be loaded. (Such segments can be queried only using the MSQ engine).
If such a zero-replica segment overshadows any other segment, the overshadowed segment will
never be marked as unused and will continue to exist in the metadata store as a dangling segment.

Fix:
- In a coordinator run, keep track of segments that are eligible for a load rule but require zero replicas
- Allow the zero-replicas segments to overshadow old segments and hence mark the latter as unused

Other changes:
- Add simulation test to verify new behaviour. This test fails with the current code.
- Clean up javadocs
2024-03-21 12:56:59 +05:30
Rushikesh Bankar 3d8b0ffae8
Add indexer level task metrics to provide more visibility in the task distribution (#15991)
Changes:

Add the following indexer level task metrics:
- `worker/task/running/count`
- `worker/task/assigned/count`
- `worker/task/completed/count`

These metrics will provide more visibility into the task distribution across indexers.
(We often see task skew across indexers, and with these metrics it would be easier
to catch the imbalance.)
2024-03-21 11:08:01 +05:30
AmatyaAvadhanula 488d376209
Optimize isOvershadowed when there is a unique minor version for an interval (#15952)
* Optimize isOvershadowed for intervals with timechunk locking
2024-03-20 19:30:00 +05:30
Abhishek Radhakrishnan fa8e511492
Add versions to `markUsed` and `markUnused` APIs (#16141)
* Mark used and unused APIs by versions.

* remove the conditional invocations.

* isValid() and test updates.

* isValid() and tests.

* Remove warning logs for invalid user requests. Also, downgrade visibility.

* Update resp message, etc.

* tests and some cleanup.

* Docs draft

* Clarify docs

* Update server/src/main/java/org/apache/druid/server/http/DataSourcesResource.java

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Review comments

* Remove default interface methods only used in tests and update docs.

* Clarify javadocs and @Nullable.

* Add more tests.

* Parameterized versions.

---------

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2024-03-19 09:22:25 -07:00
Zoltan Haindrich 0a42342cef
Update Calcite*Test to use junit5 (#16106)
* Update Calcite*Test to use junit5

* change the way temp dirs are handled
* add openrewrite workflow to safeguard upgrade
* replace junitparamrunner with standard junit5 parametered tests
* update a few rules to junit5 api
* lots of boring changes

* cleanup QueryLogHook

* cleanup

* fix compile error: ARRAYS_DATASOURCE

* fix test

* remove enclosed

* empty

+TEST:TDigestSketchSqlAggregatorTest,HllSketchSqlAggregatorTest,DoublesSketchSqlAggregatorTest,ThetaSketchSqlAggregatorTest,ArrayOfDoublesSketchSqlAggregatorTest,BloomFilterSqlAggregatorTest,BloomDimFilterSqlTest,CatalogIngestionTest,CatalogQueryTest,FixedBucketsHistogramQuantileSqlAggregatorTest,QuantileSqlAggregatorTest,MSQArraysTest,MSQDataSketchesTest,MSQExportTest,MSQFaultsTest,MSQInsertTest,MSQLoadedSegmentTests,MSQParseExceptionsTest,MSQReplaceTest,MSQSelectTest,InsertLockPreemptedFaultTest,MSQWarningsTest,SqlMSQStatementResourcePostTest,SqlStatementResourceTest,CalciteSelectJoinQueryMSQTest,CalciteSelectQueryMSQTest,CalciteUnionQueryMSQTest,MSQTestBase,VarianceSqlAggregatorTest,SleepSqlTest,SqlRowTransformerTest,DruidAvaticaHandlerTest,DruidStatementTest,BaseCalciteQueryTest,CalciteArraysQueryTest,CalciteCorrelatedQueryTest,CalciteExplainQueryTest,CalciteExportTest,CalciteIngestionDmlTest,CalciteInsertDmlTest,CalciteJoinQueryTest,CalciteLookupFunctionQueryTest,CalciteMultiValueStringQueryTest,CalciteNestedDataQueryTest,CalciteParameterQueryTest,CalciteQueryTest,CalciteReplaceDmlTest,CalciteScanSignatureTest,CalciteSelectQueryTest,CalciteSimpleQueryTest,CalciteSubqueryTest,CalciteSysQueryTest,CalciteTableAppendTest,CalciteTimeBoundaryQueryTest,CalciteUnionQueryTest,CalciteWindowQueryTest,DecoupledPlanningCalciteJoinQueryTest,DecoupledPlanningCalciteQueryTest,DecoupledPlanningCalciteUnionQueryTest,DrillWindowQueryTest,DruidPlannerResourceAnalyzeTest,IngestTableFunctionTest,QueryTestRunner,SqlTestFrameworkConfig,SqlAggregationModuleTest,ExpressionsTest,GreatestExpressionTest,IPv4AddressMatchExpressionTest,IPv4AddressParseExpressionTest,IPv4AddressStringifyExpressionTest,LeastExpressionTest,TimeFormatOperatorConversionTest,CombineAndSimplifyBoundsTest,FiltrationTest,SqlQueryTest,CalcitePlannerModuleTest,CalcitesTest,DruidCalciteSchemaModuleTest,DruidSchemaNoDataInitTest,InformationSchemaTest,NamedDruidSchemaTest,NamedLookupSchemaTest,NamedSystemSchemaTest,RootSchemaProviderTest,SystemSchemaTest,CalciteTestBase,SqlResourceTest

* use @Nested

* add rule to remove enclosed; upgrade surefire

* remove enclosed

* cleanup

* add comment about surefire exclude
2024-03-19 04:05:12 -07:00
Abhishek Radhakrishnan 3b35fb768c
Bug fix: empty segment IDs cannot be both valid and invalid at the same time. (#16145)
Treat empty and null segment IDs as the same.
2024-03-18 00:47:32 -07:00
zachjsh f3d77fe684
Fix Cannot mark an unqueryable datasource's segments used / unused (#16127)
* * fix

* * address review comments

* * all remove the short-circuit for markUnused api

* * add test
2024-03-15 14:25:02 -07:00
Abhishek Radhakrishnan 3eefc47722
Refactor tests and code clean up (#16129)
* Add update() in TestDerbyConnectorRule

* use common function.

* fixup build.

* fixup indentations.

* Revert "fixup indentations."

This reverts commit a9d6b73e79.

* fixup indentataions.

* Remove Thread.sleep() by directly calling updateUsedStatusLastUpdated.

* another indentation slip.

* Move common segment initialization to setup().

* Fix for checkstyle.

* review comments: indentation fixes, type.

* Wrapper class for Segments table

* Add KillUnusedSegmentsTaskBuilder in test class

* Remove javadocs for self-explanatory methods.
2024-03-15 10:13:14 -07:00
Kashif Faraz 466057c61b
Remove deprecated DruidException, EntryExistsException (#14448)
Changes:
- Remove deprecated `DruidException` (old one) and `EntryExistsException`
- Use newly added comprehensive `DruidException` instead
- Update error message in `SqlMetadataStorageActionHandler` when max packet limit is violated.
- Factor out common code from several faults into `BaseFault`.
- Slightly update javadoc in `DruidException` to render it correctly
- Remove unused classes `SegmentToMove`, `SegmentToDrop`
- Move `ServletResourceUtils` from module `druid-processing` to `druid-server`
- Add utility method to build error Response from `DruidException`.
2024-03-15 21:29:11 +05:30
Kashif Faraz 82fced571b
Remove deprecated UnknownSegmentIdsException (#16112)
Changes
- Replace usages of `UnknownSegmentIdsException` with `DruidException`
- Add method `SqlMetadataQuery.retrieveSegments`
- Add new field `used` to `DataSegmentPlus`
2024-03-13 11:07:37 +05:30
Abhishek Radhakrishnan fb7bb0953d
Kill segments by versions (#15994)
* Kill task version support.

Kill tasks by default kill all versions of unused segments in the specified
interval. Users wanting to delete specific versions (for example, data compliance
reasons) and keep rest of the versions can specify the optional version in the
kill task payload.

* Formatting changes.

* Multi version tests in RetrieveSegmentsActionsTest

Sort of like method-level parameterized tests.

* Address review feedback

* Accept a list of versions instead of a single version.

Support multiple versions.

* Tests for multiple versions.

* Update docs

* Cleanup

* Address review comments.

Retain the old interface method and make it default and route it to
the method with nullable versions variant. Update usages to use the
default method where versions doesn't matter.

* Remove versions from retreive used segments action.

* Some updates.

* Apply suggestions from code review

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* /s/actual/observed/g

* minor test cleanup

* WIP: Test fixes and updates. Also add test for kill by version with used load spec.

Checkpoint.

---------

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2024-03-13 09:37:30 +05:30
George Shiqi Wu 94d2a28465
Add deep storage segment metric (#16072)
* Add new metric for deepStorage segments

* Add docs

* change metric name
2024-03-11 10:24:46 -04:00
Abhishek Radhakrishnan c7f1872bd1
Fixup KillUnusedSegmentsTest (#16094)
Changes:
- Use an actual SqlSegmentsMetadataManager instead of TestSqlSegmentsMetadataManager
- Simplify TestSegmentsMetadataManager
- Add a test for large interval segments.
2024-03-11 13:37:48 +05:30
Kashif Faraz 5f203725dd
Clean up SqlSegmentsMetadataManager and corresponding tests (#16044)
Changes:

Improve `SqlSegmentsMetadataManager`
- Break the loop in `populateUsedStatusLastUpdated` before going to sleep if there are no more segments to update
- Add comments and clean up logs

Refactor `SqlSegmentsMetadataManagerTest`
- Merge `SqlSegmentsMetadataManagerEmptyTest` into this test
- Add method `testPollEmpty`
- Shave a few seconds off of the tests by reducing poll duration
- Simplify creation of test segments
- Some renames here and there
- Remove unused methods
- Move `TestDerbyConnector.allowLastUsedFlagToBeNull` to this class

Other minor changes
- Add javadoc to `NoneShardSpec`
- Use lambda in `SqlSegmentMetadataPublisher`
2024-03-08 07:34:51 +05:30
AmatyaAvadhanula 5871b81a78
Fix race in BaseNodeRoleWatcher tests (#16064)
* Fix race in BaseNodeRoleWatcher tests

* Make non static
2024-03-07 13:41:16 -08:00
Laksh Singla 5f588fa45c
Fix bug while materializing scan's result to frames (#15987)
While converting Sequence<ScanResultValue> to Sequence<Frames>, when maxSubqueryBytes is enabled, we batch the results to prevent creating a single frame per ScanResultValue. Batching requires peeking into the actual value, and checking if the row signature of the scan result’s value matches that of the previous value.

Since we can do this indefinitely (in the worst case all of them have the same signature), we keep fetching them and accumulating them in a list (on the heap). We don’t really know how much to batch before we actually write the value as frames.

The PR modifies the batching logic to not accumulate the results in an intermediary list
2024-03-07 17:11:44 +05:30
Parth Agrawal bf39c71d2a
Update protocol for MemcachedCache (#16035) 2024-03-06 22:28:11 -08:00
AmatyaAvadhanula c2841425f4
Handle uninitialized cache in Node role watchers (#15726)
BaseNodeRoleWatcher counts down cacheInitialized after a timeout, but also sets a flag indicating that the initialization timed out, and calls nodeViewInitializationTimedOut (a new method on listeners) instead of nodeViewInitialized. Listeners can then do what is most appropriate with this information.
2024-03-06 16:00:24 +05:30
Gian Merlino 930655ff18
Move retries into DataSegmentPusher implementations. (#15938)
* Move retries into DataSegmentPusher implementations.

The individual implementations know better when they should and should
not retry. They can also generate better error messages.

The inspiration for this patch was a situation where EntityTooLarge was
generated by the S3DataSegmentPusher, and retried uselessly by the
retry harness in PartialSegmentMergeTask.

* Fix missing var.

* Adjust imports.

* Tests, comments, style.

* Remove unused import.
2024-03-04 10:36:21 -08:00
Sree Charan Manamala 820febf38c
Improved Connection Count server select strategy (#15975)
Updated the DirectDruidClient so as to make the Connection Count server selector strategy work more efficiently.
If creating a connection to a node is slow, that slowness wouldn't be accounted for if we counted the open connections after sending the request. So we increment the counter first and then send the request.
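
A sketch of that reordering; the names are assumptions, not the actual DirectDruidClient fields:

```
import java.util.concurrent.atomic.AtomicInteger;

final AtomicInteger openConnections = new AtomicInteger();

// Count the connection *before* sending, so slow connection setup is
// immediately visible to the connection-count balancing strategy.
void send(Runnable dispatchRequest)
{
  openConnections.incrementAndGet();
  try {
    dispatchRequest.run(); // async send; a completion callback decrements the counter
  }
  catch (RuntimeException e) {
    openConnections.decrementAndGet(); // undo on synchronous failure
    throw e;
  }
}
```
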
2024-03-04 15:02:32 +05:30
George Shiqi Wu ef48aceff8
Fix segment/unavailable/count (#16020) 2024-03-01 15:38:27 -05:00
Sensor e0bce0ef90
Add pre-check for heavy debug logs (#15706)
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
Co-authored-by: Benedict Jin <asdf2014@apache.org>
2024-02-29 12:58:14 +05:30
Adarsh Sanjeev d2c2036ea2
Optimize MSQ realtime queries (#15399)
Currently, while reading results from realtime tasks, requests are sent on a segment level. This is slightly wasteful, as when contacting a data server, it is possible to transfer results for all segments it is hosting, instead of only one segment at a time.

One change this PR makes is to group the segments on the basis of servers. This reduces the number of queries made to data servers. Since we don't have access to the number of rows for realtime segments, the grouping is done with a fixed estimated number of rows for each realtime segment.
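
The grouping itself is straightforward; a sketch with simplified types (segment and server IDs as strings):

```
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Group realtime segments by the data server hosting them, so a single
// request per server can fetch results for all of its segments.
Map<String, List<String>> groupSegmentsByServer(Map<String, String> segmentToServer)
{
  return segmentToServer.entrySet().stream().collect(
      Collectors.groupingBy(
          Map.Entry::getValue,                                        // server
          Collectors.mapping(Map.Entry::getKey, Collectors.toList()) // its segments
      )
  );
}
```
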
2024-02-28 11:32:14 +05:30
Abhishek Radhakrishnan beccc401e1
Segments created in the same batch have the same `created_date` entry & rename metric (#15977)
* All segments stored in the same batch have the same created_date entry.

In the absence of a group_id column, this metadata would allow us to easily
reason about and troubleshoot ingestion-related issues.

* Rename metric name and code references to eligibleUnusedSegments.

Address review comment from https://github.com/apache/druid/pull/15941#discussion_r1503631992
2024-02-27 17:28:43 +05:30
Abhishek Radhakrishnan 38ecf980d0
Refactor and add tests and metric to KillUnusedSegments duty (auto-kill) (#15941)
* Kill duty and test improvements.

Initial commit with:
- Bug fixes - auto-kill can throw NPE when there are no datasources present and defaults mismatch.
- Add new stat for candidate segment intervals killed.
- Move a couple of debug logs to info logs for improved visibility (should only log once per kill period).
- Remove redundant checks for code readability.
- Updated tests from using mocks (also the mocks weren't using last updated timestamp) and
  add more test coverage for different config parameters.
- Add a couple of unit tests that are ignored for the eternity case to prove that
  the kill duty doesn't clean up segments with ALL grain or that end in DateTimes.MAX.
- Migrate Druid exception from user to operator persona.

* Address review comments.

* Remove unused methods.

* fix up format specifier and validate bad config tests.

* Consolidate the helpers a bit more and add another test.

* Update test names. Add javadoc placeholders for slightly involved tests.

* Add docs for metric kill/candidateUnusedSegments/count.

Also, rename to disambiguate.

* Comments.

* Apply logging suggestions from code review

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Review comments

- Clarify docs on eligibility.
- Add test for multiple segments in the same interval. Clarify comment.
- Remove log line from test.
- Remove lastUpdatedDate = now.plus(10) from test.

* minor cleanup.

* Clarify javadocs for getUnusedSegmentIntervals().

---------

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2024-02-27 12:14:41 +05:30
Laksh Singla 17e4f3ac60
Refactor GroupBy and TopN code to relax the constraint of dimensions being comparable (#15559)
The code in the groupBy engine and the topN engine assume that the dimensions are comparable and can call dimA.compareTo(dimB) to sort the dimensions and group them together.
This works well for the primitive dimensions, because they are Comparable; however, it falls apart when the dimensions can be arrays (or, in future scenarios, complex columns). In cases where the dimensions are not comparable, Druid resorts to wrapper types ComparableStringArray and ComparableList, which are Comparable based on a list comparator.
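
For intuition, a plain list comparator along those lines (a sketch, not Druid's actual ComparableList implementation):

```
import java.util.Comparator;
import java.util.List;

// Compare two string lists element by element, nulls first,
// with a shorter list ordered before its longer extension.
static int compareStringLists(List<String> a, List<String> b)
{
  Comparator<String> element = Comparator.nullsFirst(Comparator.<String>naturalOrder());
  int n = Math.min(a.size(), b.size());
  for (int i = 0; i < n; i++) {
    int cmp = element.compare(a.get(i), b.get(i));
    if (cmp != 0) {
      return cmp;
    }
  }
  return Integer.compare(a.size(), b.size());
}
```
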
2024-02-27 11:39:29 +05:30
AmatyaAvadhanula e2b7289dea
Try to fetch the task status for an active task from memory (#15724)
* Reduce metadata calls to fetch the status for an active task
2024-02-26 13:53:05 +05:30
Bartosz Mikulski 45c26e8682
Fix Inspection Check in DirectDruidClientTest (#15857) 2024-02-07 02:56:26 -08:00
Bartosz Mikulski 43a1c96cd1
Fix query-cancellation-executor memory leak (#15754)
This PR fixes #15069 by resolving a memory leak caused by a thread leak in query-cancellation-executor.
2024-02-07 10:54:38 +05:30
Pramod Immaneni 59bca0951a
Parallelize storage of incremental segments (#13982)
During ingestion, incremental segments are created in memory for the different time chunks and persisted to disk when certain thresholds are reached (max number of rows, max memory, incremental persist period, etc). In the case where there are a lot of dimensions and metrics (1000+), it was observed that the creation/serialization of the incremental segment file format for persistence, and persisting the file, took a while and blocked ingestion of new data. This affected real-time ingestion. This serialization and persistence can be parallelized across the different time chunks. This update aims to do that.

The patch adds a simple configuration parameter to the ingestion tuning configuration to specify the number of persistence threads. The default value is 1 if it is not specified, which makes it the same as it is today.
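
Conceptually the change looks like the sketch below; the parameter and method names are placeholders, since this message does not name the actual tuning-config property:

```
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Persist each time chunk's incremental segment on a pool of N threads
// instead of serially; a pool size of 1 preserves today's behavior.
void persistAll(List<Runnable> persistJobsPerTimeChunk, int numPersistThreads)
{
  ExecutorService pool = Executors.newFixedThreadPool(numPersistThreads);
  persistJobsPerTimeChunk.forEach(pool::submit);
  pool.shutdown();
}
```
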
2024-02-07 10:43:05 +05:30
Rishabh Singh de959e513d
Add QueryLifecycle#authorize for grpc-query-extension (#15816)
Proposal #13469

Original PR #14024

A new method is being added in QueryLifecycle class to authorise a query based on authentication result.
This method is required since we authenticate the query by intercepting it in the grpc extension and passing down the authentication result.
2024-02-02 21:49:57 +05:30
Zoltan Haindrich f701197224
Enable ArrayListRowsAndColumns to StorageAdapter conversion (#15735) 2024-01-31 02:36:58 -05:00
AmatyaAvadhanula d9e8448c50
Close open segments when a newer segment with higher version is allocated (#15727) 2024-01-31 09:11:00 +05:30
zachjsh ae6afc0751
Extend unused segment metadata api response to include created date and last used updated time (#15738)
### Description

The unusedSegment api response was extended to include the original DataSegment object with the creation time and last used update time added to it. A new object `DataSegmentPlus` was created for this purpose, and the metadata queries used were updated as needed. 

example response:

```
[
  {
    "dataSegment": {
      "dataSource": "inline_data",
      "interval": "2023-01-02T00:00:00.000Z/2023-01-03T00:00:00.000Z",
      "version": "2024-01-25T16:06:42.835Z",
      "loadSpec": {
        "type": "local",
        "path": "/Users/zachsherman/projects/opensrc-druid/distribution/target/apache-druid-30.0.0-SNAPSHOT/var/druid/segments/inline_data/2023-01-02T00:00:00.000Z_2023-01-03T00:00:00.000Z/2024-01-25T16:06:42.835Z/0/index/"
      },
      "dimensions": "str_dim,double_measure1,double_measure2",
      "metrics": "",
      "shardSpec": {
        "type": "numbered",
        "partitionNum": 0,
        "partitions": 1
      },
      "binaryVersion": 9,
      "size": 1402,
      "identifier": "inline_data_2023-01-02T00:00:00.000Z_2023-01-03T00:00:00.000Z_2024-01-25T16:06:42.835Z"
    },
    "createdDate": "2024-01-25T16:06:44.196Z",
    "usedStatusLastUpdatedDate": "2024-01-25T16:07:34.909Z"
  }
]
```
2024-01-26 15:47:40 -05:00
Gian Merlino 01e9d963bd
Merge hydrant runners flatly for realtime queries. (#15757)
* Merge hydrant runners flatly for realtime queries.

Prior to this patch, we have two layers of mergeRunners for realtime
queries: one for each Sink (a logical segment) and one across all
Sinks. This is done to keep metrics and results grouped by Sink,
given that each FireHydrant within a Sink has its own separate storage
adapter.

However, it costs extra memory usage due to the extra layer of
materialization. This is especially pronounced for groupBy queries,
which only use their merge buffers at the top layer of merging. The
lower layer of merging materializes ResultRows directly into the heap,
which can cause heap exhaustion if there are enough ResultRows.

This patch changes to a single layer of merging when bySegment: false,
just like Historicals. To accommodate that, segment metrics like
query/segment/time are now per-FireHydrant instead of per-Sink.

Two layers of merging are retained when bySegment: true. This isn't
common, because it's typically only used when segment level caching
is enabled on the Broker, which is off by default.

* Use SinkQueryRunners.

* Remove unused method.
2024-01-25 19:07:57 +08:00
Abhishek Agarwal 0ab2781a7f
Disable eager initialization for non-query connection requests (#15751) 2024-01-25 14:38:50 +05:30
Abhishek Radhakrishnan 06b228ff7c
Return a 503 status code instead of 400 during transient errors (#15756)
* Fix up HTTP status error code

* Keep the unsupported proxy as 400

* Tests and rename
2024-01-24 18:12:24 -08:00
Karan Kumar c4990f56d6
Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
Zoltan Haindrich d6a12c4389
Add ability to enable ResultCache in tests (#15465) 2024-01-22 09:02:59 -05:00
Abhishek Radhakrishnan 38c1def95a
Kill tasks honor the buffer period of unused segments (#15710)
* Kill tasks should honor the buffer period of unused segments.

- The coordinator duty KillUnusedSegments determines an umbrella interval
 for each datasource to determine the kill interval. There can be multiple unused
segments in an umbrella interval with different used_status_last_updated timestamps.
For example, consider an unused segment that is 30 days old and one that is 1 hour old. Currently
the kill task after the 30-day mark would kill both the unused segments and not retain the 1-hour
old one.

- However, when a kill task is instantiated with this umbrella interval, it’d kill
all the unused segments regardless of the last updated timestamp. We need kill
tasks and RetrieveUnusedSegmentsAction to honor the bufferPeriod to avoid killing
unused segments in the kill interval prematurely.

* Clarify default behavior in docs.

* test comments

* fix canDutyRun()

* small updates.

* checkstyle

* forbidden api fix

* doc fix, unused import, codeql scan error, and cleanup logs.

* Address review comments

* Rename maxUsedFlagLastUpdatedTime to maxUsedStatusLastUpdatedTime

This is consistent with the column name `used_status_last_updated`.

* Apply suggestions from code review

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Make period Duration type

* Remove older variants of runKillTask() in OverlordClient interface

* Test can now run without waiting for canDutyRun().

* Remove previous variants of retrieveUnusedSegments from internal metadata storage coordinator interface.

Removes the following interface methods in favor of a new method added:
- retrieveUnusedSegmentsForInterval(String, Interval)
- retrieveUnusedSegmentsForInterval(String, Interval, Integer)

* Chain stream operations

* cleanup

* Pass in the lastUpdatedTime to markUnused test function and remove sleep.

---------

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2024-01-18 22:23:50 -08:00
AmatyaAvadhanula a26defd64b
Clean up stale entries from upgradeSegments table (#15637)
* Clean up stale entries from upgradeSegments table
2024-01-17 20:49:52 +05:30
AmatyaAvadhanula 67720b60ae
Skip compaction for intervals without data (#15676)
* Skip compaction for intervals with a single tombstone and no data
2024-01-16 12:31:36 +05:30
Sam Rash 072b16c6df
Fix SQL Interval.of() error message (#15454)
Better error message for poorly constructed intervals
2024-01-15 22:34:35 -06:00
Kashif Faraz 18d2a8957f
Refactor: Cleanup test impls of ServiceEmitter (#15683) 2024-01-15 17:37:00 +05:30
Gian Merlino cccf13ea82
Reverse, pull up lookups in the SQL planner. (#15626)
* Reverse, pull up lookups in the SQL planner.

Adds two new rules:

1) ReverseLookupRule, which eliminates calls to LOOKUP by doing
   reverse lookups.

2) AggregatePullUpLookupRule, which pulls up calls to LOOKUP above
   GROUP BY, when the lookup is injective.

Adds configs `sqlReverseLookup` and `sqlPullUpLookup` to control whether
these rules fire. Both are enabled by default.

To minimize the chance of performance problems due to many keys mapping to
the same value, ReverseLookupRule refrains from reversing a lookup if there
are more keys than `inSubQueryThreshold`. The rationale for using this setting
is that reversal works by generating an IN, and the `inSubQueryThreshold`
describes the largest IN the user wants the planner to create.
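
A sketch of that guard; the surrounding names are assumptions:

```
import java.util.Set;

// Reversal rewrites LOOKUP(col) = 'value' into col IN (keys), so only
// reverse when the key set fits within the user's inSubQueryThreshold.
boolean shouldReverseLookup(Set<String> matchingKeys, int inSubQueryThreshold)
{
  return matchingKeys != null && matchingKeys.size() <= inSubQueryThreshold;
}
```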

* Add additional line.

* Style.

* Remove commented-out lines.

* Fix tests.

* Add test.

* Fix doc link.

* Fix docs.

* Add one more test.

* Fix tests.

* Logic, test updates.

* - Make FilterDecomposeConcatRule more flexible.

- Make CalciteRulesManager apply reduction rules til fixpoint.

* Additional tests, simplify code.
2024-01-12 00:06:31 -08:00
Kashif Faraz f445ba4d6b
Audit API DELETE datasource (markAllSegmentsAsUnused) (#15653)
Changes:
- Add audit for `DELETE /druid/coordinator/v1/datasources/{datasourceName}`
- Minor refactor
2024-01-11 09:43:32 +05:30
PANKAJ KUMAR 047c7340ab
Adding retries to update the metadata store instead of failure (#15141)
Currently, if 2 tasks are consuming from the same partitions and try to publish the segment and update the metadata, the second task can fail because the end offset stored in the metadata store doesn't match the start offset of the second task. We can fix this by retrying instead of failing.

AFAIK apart from the above issue, the metadata mismatch can happen in 2 scenarios:

- when we update the input topic name for the data source
- when we run 2 replicas of ingestion tasks(1 replica will publish and 1 will fail as the first replica has already updated the metadata).

Implemented a comparison function to compare the last committed end offset and the new sequence start offset, and return a specific error message for this case.

Add retry logic on indexers to retry for this specific error message.
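
A generic sketch of the retry shape; everything here is illustrative, and the real change keys the retry off the specific mismatch error message:

```
// Retry the transactional metadata update instead of failing the task
// outright when the stored end offset doesn't match this task's start offset.
boolean publishWithRetries(int maxRetries) throws InterruptedException
{
  for (int attempt = 1; attempt <= maxRetries; attempt++) {
    if (tryPublishAndUpdateMetadata()) { // hypothetical helper for the commit
      return true;
    }
    Thread.sleep(1_000L * attempt); // simple linear backoff
  }
  return false;
}
```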

Updated the existing test case.
2024-01-10 12:30:54 +05:30
Rishabh Singh 71f5307277
Eliminate Periodic Realtime Segment Metadata Queries: Task Now Publish Schema for Seamless Coordinator Updates (#15475)
The initial step in optimizing segment metadata was to centralize the construction of datasource schema in the Coordinator (#14985). Subsequently, our goal is to eliminate the requirement for regularly executing queries to obtain segment schema information. This task encompasses addressing both realtime and finalized segments.

This modification specifically addresses the issue with realtime segments. Tasks will now routinely communicate the schema for realtime segments during the segment announcement process. The Coordinator will identify the schema alongside the segment announcement and subsequently update the schema for realtime segments in the metadata cache.
2024-01-10 08:55:56 +05:30
Pranav 747d973752
Skip waiting for first lookup version to get initialized (#15598) 2024-01-09 13:18:39 -08:00
Abhishek Agarwal 468b99e608
Enable query request queuing by default when total laning is turned on. (#15440)
This PR enables the flag by default to queue excess query requests in the jetty queue. Still keeping the flag so that it can be turned off if necessary. But the flag will be removed in the future.
2024-01-09 07:54:26 +05:30
Jonathan Wei 5d1e66b8f9
Allow broker to use catalog for datasource schemas for SQL queries (#15469)
* Allow broker to use catalog for datasource schemas

* More PR comments

* PR comments
2024-01-08 13:46:08 -06:00
AmatyaAvadhanula 63bfb3e6c9
Handle half-eternity intervals while fetching segments with created dates (#15608)
* Handle all intervals while fetching segments with created dates
2024-01-08 12:07:11 +05:30
George Shiqi Wu 9fe67958be
Increase ServerConnector accept queue size (#15596)
* Allow overwriting ServerConnector accept queue size

* Use a single config

* Fix spacing

* fix spacing

* fixed value

* read value from environment

* fix spacing

* Unpack value before reading

* check somaxconn on linux only
2024-01-05 12:04:15 -05:00
Gian Merlino e40b96e026
Reverse lookup fixes and enhancements. (#15611)
* Reverse lookup fixes and enhancements.

1) Add a "mayIncludeUnknown" parameter to DimFilter#optimize. This is important
   because otherwise the reverse-lookup optimization is done improperly when
   the "in" filter appears under a "not", and the lookup extractionFn may return
   null for some possible values of the filtered column. The "includeUnknown" test
   cases in InDimFilterTest illustrate the difference in behavior.

2) Enhance InDimFilter#optimizeLookup to handle "mayIncludeUnknown", and to be able
   to do a reverse lookup in a wider variety of cases.

3) Make "unapply" protected in LookupExtractor, and move callers to "unapplyAll".
   The main reason is that MapLookupExtractor, a common implementation, lacks a
   reverse mapping and therefore does a scan of the map for each call to "unapply".
   For performance sake these calls need to be batched.

* Remove optimize call from BloomDimFilter.

* Follow the law.

* Fix tests.

* Fix imports.

* Switch function.

* Fix tests.

* More tests.
2024-01-03 13:28:44 -08:00
Abhishek Radhakrishnan f0f428274a
Prometheus config property doc fixup (#15613)
* Minor fixes

* Update docs/development/extensions-contrib/prometheus.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

---------

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2024-01-02 16:28:42 -08:00
kaisun2000 a5e9b14be0
Add delay before the peon drops the segments after publishing them (#15373)
Currently in the realtime ingestion (Kafka/Kinesis) case, after publishing the segments, upon acknowledgement from the coordinator that the segments are already placed in some historicals, the peon would unannounce the segments (basically telling the whole cluster that the segments are no longer in this peon) and drop the segments from cache and sink timeline in one shot.

The in-transit queries from the brokers that still think the segments are in the peon can get a NullPointerException when the peon is unsetting the hydrants in the sinks.

The fix lets the peon wait for a configurable delay period before dropping segments, removing segments from cache, etc. after the peon unannounces the segments.

This delayed approach is similar to how the historicals handle segments moving out.
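
In spirit, the delayed drop might look like this sketch; the executor and names are assumptions:

```
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final ScheduledExecutorService dropExecutor = Executors.newSingleThreadScheduledExecutor();

// After unannouncing, wait a configurable delay before actually dropping the
// segment from cache and the sink timeline, so in-flight broker queries drain.
void scheduleDrop(Runnable dropSegment, long delayMillis)
{
  dropExecutor.schedule(dropSegment, delayMillis, TimeUnit.MILLISECONDS);
}
```
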
2024-01-02 11:08:28 +05:30
Kashif Faraz 9f568858ef
Add logging implementation for AuditManager and audit more endpoints (#15480)
Changes
- Add `log` implementation for `AuditManager` along with `SQLAuditManager`
- `LoggingAuditManager` simply logs the audit event. Thus, it returns empty for
all `fetchAuditHistory` calls.
- Add new config `druid.audit.manager.type` which can take values `log`, `sql` (default)
- Add new config `druid.audit.manager.logLevel` which can take values `DEBUG`, `INFO`, `WARN`.
This gets activated only if `type` is `log`.
- Remove usage of `ConfigSerde` from `AuditManager` as audit is not just limited to configs
- Add `AuditSerdeHelper` for a single implementation of serialization/deserialization of
audit payload and other utility methods.
2023-12-19 13:14:04 +05:30
Kashif Faraz feeb4f0fb0
Allocate pending segments at latest committed version (#15459)
The segment allocation algorithm reuses an already allocated pending segment if the new allocation request is made for the same parameters:

- datasource
- sequence name
- same interval
- same value of skipSegmentLineageCheck (false for batch append, true for streaming append)
- same previous segment id (used only when skipSegmentLineageCheck = false)

The above parameters can thus uniquely identify a pending segment (enforced by the UNIQUE constraint on the sequence_name_prev_id_sha1 column in the druid_pendingSegments metadata table).

This reuse is done in order to:

- allow replica tasks (in case of streaming ingestion) to use the same set of segment IDs.
- allow re-run of a failed batch task to use the same segment ID and prevent unnecessary allocations.
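
A sketch of the identity those parameters form; the JDK hashing shown here is illustrative, and only the column name `sequence_name_prev_id_sha1` comes from the message above:

```
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// The parameters above uniquely identify a pending segment; a SHA-1 over
// them can back the UNIQUE constraint on sequence_name_prev_id_sha1.
static byte[] pendingSegmentKey(
    String dataSource,
    String sequenceName,
    String interval,
    boolean skipSegmentLineageCheck,
    String previousSegmentId
) throws Exception
{
  String key = String.join(
      ":",
      dataSource,
      sequenceName,
      interval,
      String.valueOf(skipSegmentLineageCheck),
      String.valueOf(previousSegmentId)
  );
  return MessageDigest.getInstance("SHA-1").digest(key.getBytes(StandardCharsets.UTF_8));
}
```
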
2023-12-14 16:18:39 +05:30
AmatyaAvadhanula 91ca8e73d6
Skip compaction for datasources with partial-eternity segments (#15542)
This PR builds on #13304 to skip compaction for datasources with segments that have their interval start or end coinciding with Eternity interval end-points.

This is needed in order to prevent an issue similar to #13208 as the Coordinator tries to iterate over a large number of intervals when trying to compact an interval with infinite start or end.
2023-12-12 15:06:45 +05:30
zachjsh ab7d9bc6ec
Add api for Retrieving unused segments (#15415)
### Description

This PR adds an API for retrieving unused segments for a particular datasource. The API supports pagination via the `limit` and `lastSegmentId` parameters. The resulting unused segments are returned with optional `sortOrder`, `ASC` or `DESC` with respect to the matching segments' `id`, `start time`, and `end time`, or in no guaranteed order if `sortOrder` is not specified.

`GET /druid/coordinator/v1/datasources/{dataSourceName}/unusedSegments?interval={interval}&limit={limit}&lastSegmentId={lastSegmentId}&sortOrder={sortOrder}`

Returns a list of unused segments for a datasource in the cluster contained within an optionally specified interval.
Optional parameters for limit and lastSegmentId can be given as well, to limit results and enable paginated results.
The results may be sorted in either ASC or DESC order by specifying the sortOrder parameter.

`dataSourceName`: The name of the datasource.
`interval`: The specific interval in which to search for unused segments.
`limit`: The maximum number of unused segments to return information about. This property helps to support pagination.
`lastSegmentId`: The last segment ID from which to search for results. All segments returned are > this segment lexicographically if sortOrder is null or ASC, or < this segment lexicographically if sortOrder is DESC.
`sortOrder`: Specifies the order in which to return the matching segments, by start time then end time. A null value indicates that order does not matter.

2023-12-11 16:32:18 -05:00
Abhishek Radhakrishnan 96be82a3e6
Clean up duty for non-overlapping eternity tombstones (#15281)
* Add initial draft of MarkDanglingTombstonesAsUnused duty.

* Use overshadowed segments instead of all used segments.

* Add unit test for MarkDanglingSegmentsAsUnused duty.

* Add mock call

* Simplify code.

* Docs

* shorter lines formatting

* metric doc

* More tests, refactor and fix up some logic.

* update javadocs; other review comments.

* Make numCorePartitions as 0 in the TombstoneShardSpec.

* fix up test

* Add tombstone core partition tests

* Update docs/design/coordinator.md

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>

* review comment

* Minor cleanup

* Only consider tombstones with 0 core partitions

* Need to register the test shard type to make jackson happy

* test comments

* checkstyle

* fixup misc typos in comments

* Update logic to use overshadowed segments

* minor cleanup

* Rename duty to eternity tombstone instead of dangling. Add test for full eternity tombstone.

* Address review feedback.

---------

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
2023-12-11 08:57:15 -08:00
Clint Wylie 557f3f6f57
add array column type support to EXTEND operator (#15458) 2023-12-06 23:21:35 -08:00
Rishabh Singh 6a64f72c67
Lookup on incomplete partition set in SegmentMetadataQuerySegmentWalker (#15496)
Description
With the CentralizedDatasourceSchema (#14989) feature enabled, metadata for appended segments was not being refreshed. This caused numRows to be 0 for the new segments and would probably cause the datasource schema to not include columns from the new segments.

Analysis
The problem turned out to be in the new QuerySegmentWalker implementation in the Coordinator. It first finds the segments to be queried in the Coordinator timeline, then builds a new timeline of those segments.
The problem was that it looked up the complete partition set in the new timeline. Since the appended segments by themselves do not make a complete partition set, no SegmentMetadataQuery was executed.
2023-12-06 15:25:28 +05:30