Commit Graph

2256 Commits

Author SHA1 Message Date
AmatyaAvadhanula f3d69f30e6
Associate pending segments with the tasks that requested them (#16144)
Changes:
- Add column `task_allocator_id` to `pendingSegments` metadata table.
- Add column `upgraded_from_segment_id` to `pendingSegments` metadata table.
- Add interface `PendingSegmentAllocatingTask` and implement it by all tasks which
can allocate pending segments.
- Use `taskAllocatorId` to identify the task (and its sub-tasks or replicas) to which
a pending segment has been allocated.
- Perform active cleanup of pending segments in `TaskLockbox` once there are no
active tasks for the corresponding task allocator id.
- When committing APPEND segments, also commit all upgraded pending segments
corresponding to that task allocator id.
- When committing REPLACE segments, upgrade all overlapping pending segments in
the same transaction.
2024-04-17 09:06:31 +05:30
Kashif Faraz 81d7b6ebe1
Fix OverlordClient to read reports as a concrete `ReportMap` (#16226)
Follow up to #16217 

Changes:
- Update `OverlordClient.getReportAsMap()` to return `TaskReport.ReportMap`
- Move the following classes to `org.apache.druid.indexer.report` in the `druid-processing` module
  - `TaskReport`
  - `KillTaskReport`
  - `IngestionStatsAndErrorsTaskReport`
  - `TaskContextReport`
  - `TaskReportFileWriter`
  - `SingleFileTaskReportFileWriter`
  - `TaskReportSerdeTest`
- Remove `MsqOverlordResourceTestClient` as it had only one method
which is already present in `OverlordResourceTestClient` itself
2024-04-15 08:00:59 +05:30
YongGang da9feb4430
Introduce TaskContextReport for reporting task context (#16041)
Changes:
- Add `TaskContextEnricher` interface to improve task management and monitoring
- Invoke `enrichContext` in `TaskQueue.add()` whenever a new task is submitted to the Overlord
- Add `TaskContextReport` to write out task context information in reports
2024-04-12 08:57:49 +05:30
Vishesh Garg 3d595cfab1
Add storeCompactionState flag support to msq (#15965)
Compaction in the native engine by default records the state of compaction for each segment in the lastCompactionState segment field. This PR adds support for doing the same in the MSQ engine, targeted for future cases such as REPLACE and compaction done via MSQ.

Note that this PR doesn't implicitly store the compaction state for MSQ replace tasks; it is stored with flag "storeCompactionState": true in the query context.
2024-04-09 16:47:47 +05:30
Abhishek Radhakrishnan 75fb57ed6e
Update error messages when supervisor's checkpoint state is invalid (#16208)
* Update error message when topic messages.

Suggest resetting the supervisor when the topic changes instead of changing
the supervisor name which is actually making a new supervisor.

* Update server/src/main/java/org/apache/druid/metadata/IndexerSQLMetadataStorageCoordinator.java

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Cleanup

* Remove log and include oldCommitMetadataFromDb

* Fix test

---------

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2024-04-03 10:34:17 -07:00
AmatyaAvadhanula 218513ad55
Use created time from metadata store in list tasks (#16228) 2024-04-03 09:03:32 +05:30
Kashif Faraz 0de44d91f1
Cleanup serialiazation of TaskReportMap (#16217)
* Build task reports in AbstractBatchIndexTask

* Minor cleanup

* Apply suggestions from code review by @abhishekrb

Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>

* Cleanup IndexTaskTest

* Fix formatting

* Fix coverage

* Cleanup serialization of TaskReport map

* Replace occurrences of Map<String, TaskReport>

* Return TaskReport.ReportMap for live reports, fix test comparisons

* Address test failures

---------

Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>
2024-04-01 11:53:24 -07:00
Adithya Chakilam 463010bb29
Populate segment stats for non-parallel compaction jobs (#16171)
* Populate segment stats for non-parallel compaction jobs

* fix

* add-tests

* comments

* update-test

* comments
2024-03-29 09:40:55 -04:00
Kashif Faraz 4df4896674
Refactor: Add common method in AbstractBatchIndexTask to create ingestion stats report (#16202)
Changes
-  No functional changes
- Add method `AbstractBatchIndexTask.buildIngestionStatsReport()` used in several batch tasks
- Add utility method `AbstractBatchIndexTask.addBuildSegmentStatsToReport()`
- Use boolean argument to represent a full report instead of the String `full` 
in internal methods. (REST API remains unchanged.)
- Rename `IngestionStatsAndErrorsTaskReportData` to `IngestionStatsAndErrors`
- Clean up some of the methods
2024-03-28 23:07:00 +05:30
Adithya Chakilam a65b2d4f41
Visibility into LagBased AutoScaler desired task count (#16199)
* Visibility into skipped scale notices

* comments

* change to emit always instead of just skips

* fix failing test

* comments

* Add couple more tests
2024-03-27 13:08:00 -04:00
Gian Merlino 58a8a23243
Avoid conversion to String in JsonReader, JsonNodeReader. (#15693)
* Avoid conversion to String in JsonReader, JsonNodeReader.

These readers were running UTF-8 decode on the provided entity to
convert it to a String, then parsing the String as JSON. The patch
changes them to parse the provided entity's input stream directly.

In order to preserve the nice error messages that include parse errors,
the readers now need to open the entity again on the error path, to
re-read the data. To make this possible, the InputEntity#open contract
is tightened to require the ability to re-open entities, and existing
InputEntity implementations are updated to allow re-opening.

This patch also renames JsonLineReaderBenchmark to JsonInputFormatBenchmark,
updates it to benchmark all three JSON readers, and adds a case that reads
fields out of the parsed row (not just creates it).

* Fixes for static analysis.

* Implement intermediateRowAsString in JsonReader.

* Enhanced JsonInputFormatBenchmark.

Renames JsonLineReaderBenchmark to JsonInputFormatBenchmark, and enhances it to
test various readers (JsonReader, JsonLineReader, JsonNodeReader) as well as
to test with/without field discovery.
2024-03-26 08:16:05 -07:00
Abhishek Radhakrishnan 95595ba4f5
Fix handling an empty list of versions (#16198)
* Differentiate null and empty lists of segment IDs and versions.

Treat them differently so the. Segment IDs and versions can be An empty list,
in which case, the queries should just not return anything. Versions are optional, so
they can be null, which just indicates nothing, so the queries should return segments with
all possible versions. Segment IDs cannot be null as indicated by the absence of @Nullable
annotation.

* Update javadocs and add empty versions test to kill task.

* Add test for RetrieveSegmentsActions as well.
2024-03-25 17:51:24 -07:00
Kashif Faraz e7dc00b86d
Refactor: Simplify creation input row filter predicate in various batch tasks (#16196)
Changes:
- Simplify method `AbstractBatchIndexTask.defaultRowFilter()` and rename
- Add method `allowNonNullWithinInputIntervalsOf()`
- Add javadocs
2024-03-26 04:54:07 +05:30
Kashif Faraz 82f443340d
Clean up TaskQueueTest (#16187)
Changes:
- Remove redundant code from `TaskQueueTest`
- Use lambdas in `TaskQueue`
- Simplify error message when `TaskQueue` is full
2024-03-25 09:40:01 +05:30
AmatyaAvadhanula cfa2a901b3
Redact passwords from tasks fetched from the TaskQueue (#16182)
* Redact passwords from tasks fetched from the TaskQueue
2024-03-23 14:22:11 +05:30
Rushikesh Bankar 3d8b0ffae8
Add indexer level task metrics to provide more visibility in the task distribution (#15991)
Changes:

Add the following indexer level task metrics:
- `worker/task/running/count`
- `worker/task/assigned/count`
- `worker/task/completed/count`

These metrics will provide more visibility into the tasks distribution across indexers
(We often see a task skew issue across indexers and with this issue it would be easier
to catch the imbalance)
2024-03-21 11:08:01 +05:30
Abhishek Radhakrishnan fa8e511492
Add versions to `markUsed` and `markUnused` APIs (#16141)
* Mark used and unused APIs by versions.

* remove the conditional invocations.

* isValid() and test updates.

* isValid() and tests.

* Remove warning logs for invalid user requests. Also, downgrade visibility.

* Update resp message, etc.

* tests and some cleanup.

* Docs draft

* Clarify docs

* Update server/src/main/java/org/apache/druid/server/http/DataSourcesResource.java

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Review comments

* Remove default interface methods only used in tests and update docs.

* Clarify javadocs and @Nullable.

* Add more tests.

* Parameterized versions.

---------

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2024-03-19 09:22:25 -07:00
Zoltan Haindrich 0a42342cef
Update Calcite*Test to use junit5 (#16106)
* Update Calcite*Test to use junit5

* change the way temp dirs are handled
* add openrewrite workflow to safeguard upgrade
* replace junitparamrunner with standard junit5 parametered tests
* update a few rules to junit5 api
* lots of boring changes

* cleanup QueryLogHook

* cleanup

* fix compile error: ARRAYS_DATASOURCE

* fix test

* remove enclosed

* empty

+TEST:TDigestSketchSqlAggregatorTest,HllSketchSqlAggregatorTest,DoublesSketchSqlAggregatorTest,ThetaSketchSqlAggregatorTest,ArrayOfDoublesSketchSqlAggregatorTest,BloomFilterSqlAggregatorTest,BloomDimFilterSqlTest,CatalogIngestionTest,CatalogQueryTest,FixedBucketsHistogramQuantileSqlAggregatorTest,QuantileSqlAggregatorTest,MSQArraysTest,MSQDataSketchesTest,MSQExportTest,MSQFaultsTest,MSQInsertTest,MSQLoadedSegmentTests,MSQParseExceptionsTest,MSQReplaceTest,MSQSelectTest,InsertLockPreemptedFaultTest,MSQWarningsTest,SqlMSQStatementResourcePostTest,SqlStatementResourceTest,CalciteSelectJoinQueryMSQTest,CalciteSelectQueryMSQTest,CalciteUnionQueryMSQTest,MSQTestBase,VarianceSqlAggregatorTest,SleepSqlTest,SqlRowTransformerTest,DruidAvaticaHandlerTest,DruidStatementTest,BaseCalciteQueryTest,CalciteArraysQueryTest,CalciteCorrelatedQueryTest,CalciteExplainQueryTest,CalciteExportTest,CalciteIngestionDmlTest,CalciteInsertDmlTest,CalciteJoinQueryTest,CalciteLookupFunctionQueryTest,CalciteMultiValueStringQueryTest,CalciteNestedDataQueryTest,CalciteParameterQueryTest,CalciteQueryTest,CalciteReplaceDmlTest,CalciteScanSignatureTest,CalciteSelectQueryTest,CalciteSimpleQueryTest,CalciteSubqueryTest,CalciteSysQueryTest,CalciteTableAppendTest,CalciteTimeBoundaryQueryTest,CalciteUnionQueryTest,CalciteWindowQueryTest,DecoupledPlanningCalciteJoinQueryTest,DecoupledPlanningCalciteQueryTest,DecoupledPlanningCalciteUnionQueryTest,DrillWindowQueryTest,DruidPlannerResourceAnalyzeTest,IngestTableFunctionTest,QueryTestRunner,SqlTestFrameworkConfig,SqlAggregationModuleTest,ExpressionsTest,GreatestExpressionTest,IPv4AddressMatchExpressionTest,IPv4AddressParseExpressionTest,IPv4AddressStringifyExpressionTest,LeastExpressionTest,TimeFormatOperatorConversionTest,CombineAndSimplifyBoundsTest,FiltrationTest,SqlQueryTest,CalcitePlannerModuleTest,CalcitesTest,DruidCalciteSchemaModuleTest,DruidSchemaNoDataInitTest,InformationSchemaTest,NamedDruidSchemaTest,NamedLookupSchemaTest,NamedSystemSchemaTest,RootSchemaProviderTest,SystemSchemaTest,CalciteTestBase,SqlResourceTest

* use @Nested

* add rule to remove enclosed; upgrade surefire

* remove enclosed

* cleanup

* add comment about surefire exclude
2024-03-19 04:05:12 -07:00
Abhishek Agarwal 7d307df6e9
Fix metric emission in the segment generation phase (#16146)
Fix metric emission in the segment generation phase
2024-03-18 14:38:18 +05:30
Abhishek Radhakrishnan 3eefc47722
Refactor tests and code clean up (#16129)
* Add update() in TestDerbyConnectorRule

* use common function.

* fixup build.

* fixup indentations.

* Revert "fixup indentations."

This reverts commit a9d6b73e79.

* fixup indentataions.

* Remove Thread.sleep() by directly calling updateUsedStatusLastUpdated.

* another indentation slip.

* Move common segment initialization to setup().

* Fix for checkstyle.

* review comments: indentation fixes, type.

* Wrapper class for Segments table

* Add KillUnusedSegmentsTaskBuilder in test class

* Remove javadocs for self-explanatory methods.
2024-03-15 10:13:14 -07:00
Kashif Faraz 466057c61b
Remove deprecated DruidException, EntryExistsException (#14448)
Changes:
- Remove deprecated `DruidException` (old one) and `EntryExistsException`
- Use newly added comprehensive `DruidException` instead
- Update error message in `SqlMetadataStorageActionHandler` when max packet limit is violated.
- Factor out common code from several faults into `BaseFault`.
- Slightly update javadoc in `DruidException` to render it correctly
- Remove unused classes `SegmentToMove`, `SegmentToDrop`
- Move `ServletResourceUtils` from module `druid-processing` to `druid-server`
- Add utility method to build error Response from `DruidException`.
2024-03-15 21:29:11 +05:30
Abhishek Radhakrishnan fb7bb0953d
Kill segments by versions (#15994)
* Kill task version support.

Kill tasks by default kill all versions of unused segments in the specified
interval. Users wanting to delete specific versions (for example, data compliance
reasons) and keep rest of the versions can specify the optional version in the
kill task payload.

* Formatting changes.

* Multi version tests in RetrieveSegmentsActionsTest

Sort of like method-level parameterized tests.

* Address review feedback

* Accept a list of versions instead of a single version.

Support multiple versions.

* Tests for multiple versions.

* Update docs

* Cleanup

* Address review comments.

Retain the old interface method and make it default and route it to
the method with nullable versions variant. Update usages to use the
default method where versions doesn't matter.

* Remove versions from retreive used segments action.

* Some updates.

* Apply suggestions from code review

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* /s/actual/observed/g

* minor test cleanup

* WIP: Test fixes and updates. Also add test for kill by version with used load spec.

Checkpoint.

---------

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2024-03-13 09:37:30 +05:30
Abhishek Radhakrishnan 0a615f16de
Fix bug where numSegmentsKilled is reported incorrectly. Also, add a unit test. (#16103) 2024-03-12 10:02:54 +05:30
Abhishek Radhakrishnan 8084f2206b
Remove `@JsonIgnore` annotations for private members of `TaskAction` classes (#16099)
* Remove @JsonIgnore annotations for private members

* checkstyle fix - removed unused imports.
2024-03-12 00:12:36 +05:30
Vishesh Garg b1c1937e94
Change last update timestamp granularity of GCS objects from seconds to milliseconds (#16083)
The previously used GCS API client library returned last update time for objects directly in milliseconds. The new library returns it in OffsetDateTime format which was being converted to seconds and stored against the object. This fix converts the time back to ms before storing it.
2024-03-09 07:54:33 +05:30
George Shiqi Wu 40ebaf83c9
Fix bug with mmless ingestion and compaction tasks on azure (#16065)
* Update azure behavior to match s3

* Add test

* Cleanup logic

* fix checkstyle

* Add comment
2024-03-08 15:42:44 -05:00
Adithya Chakilam 564c44ed85
Add stats segmentsRead and segmentsPublished to compaction task reports (#15947)
Changes:
- Add visibility into number of segments read/published by each parallel compaction
- Add new fields `segmentsRead`, `segmentsPublished` to `IngestionStatsAndErrorsTaskReportData`
- Update `ParallelIndexSupervisorTask` to populate the new stats
2024-03-07 09:37:23 +05:30
Adithya Chakilam ae022cc0c9
fixup!: #15981 Missing completion reports on index_parallel tasks (#16042)
* initial commit

* comments

* typo

* comments

* comments

* remove var

* initialize global var early

* remove new line

* small test fix

* same fix another test
2024-03-06 13:58:34 -05:00
AmatyaAvadhanula c2841425f4
Handle uninitialized cache in Node role watchers (#15726)
BaseNodeRoleWatcher counts down cacheInitialized after a timeout, but also sets some flag that it was a timed-out initialization. and call nodeViewInitializationTimedOut (new method on listeners) instead of nodeViewInitialized. Then listeners can do what is most appropriate with this information.
2024-03-06 16:00:24 +05:30
Gian Merlino 930655ff18
Move retries into DataSegmentPusher implementations. (#15938)
* Move retries into DataSegmentPusher implementations.

The individual implementations know better when they should and should
not retry. They can also generate better error messages.

The inspiration for this patch was a situation where EntityTooLarge was
generated by the S3DataSegmentPusher, and retried uselessly by the
retry harness in PartialSegmentMergeTask.

* Fix missing var.

* Adjust imports.

* Tests, comments, style.

* Remove unused import.
2024-03-04 10:36:21 -08:00
Adithya Chakilam ec52f686c0
Fix compaction tasks reports getting overwritten (#15981)
* Fix compaction tasks reports geting overwrittened

* only skip for compactiont task

* address comments

* fix boolean

* move boolean flag to task rather than spec

* rename variable

* add docs, fix missing case

* Update docs/ingestion/tasks.md

* rename var

* add task report decode check in IT

* change assert
2024-03-04 10:10:17 -05:00
Sensor e0bce0ef90
Add pre-check for heavy debug logs (#15706)
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
Co-authored-by: Benedict Jin <asdf2014@apache.org>
2024-02-29 12:58:14 +05:30
AmatyaAvadhanula 7c42e87db9
Concurrent replace should work with supervisors using concurrent locks (#15995)
* Concurrent replace should work with supervisors using concurrent locks

* Ignore supervisors with useConcurrentLocks set to false

* Apply feedback
2024-02-29 12:06:47 +05:30
AmatyaAvadhanula e2b7289dea
Try to fetch the task status for an active from memory (#15724)
* Reduce metadata calls to fetch the status for an active task
2024-02-26 13:53:05 +05:30
Zoltan Haindrich 06deda9415
ScanAndSort query fails with NPE for simple queries (#15914)
* some stuff

* add dummy fields

* draft-fix

* rename test

* cleanup

* add null

* cleanup

* cleanup

* add test

* updates

* move check tp constructore

* cleanup

* updates/etc

* fix some more

* add rowSignatureMode

* checkstyle/etc

* override

* missing msqIncompat

* fix test

* fixes

* undo

* updates

* remove param
2024-02-24 15:33:50 -08:00
Adithya Chakilam 1f443d218c
Enable partition stats on streaming task completion report (#15930)
Changes:
- Add visibility into number of records processed by each streaming task per partition
- Add field `recordsProcessed` to `IngestionStatsAndErrorsTaskReportData`
- Populate number of records processed per partition in `SeekableStreamIndexTaskRunner`
2024-02-23 16:29:03 +05:30
Gian Merlino 9c41827dba
Globally disable AUTO_CLOSE_JSON_CONTENT. (#15880)
* Globally disable AUTO_CLOSE_JSON_CONTENT.

This JsonGenerator feature is on by default. It causes problems with code
like this:

  try (JsonGenerator jg = ...) {
    jg.writeStartArray();
    for (x : xs) {
      jg.writeObject(x);
    }
    jg.writeEndArray();
  }

If a jg.writeObject call fails due to some problem with the data it's
reading, the JsonGenerator will write the end array marker automatically
when closed as part of the try-with-resources. If the generator is writing
to a stream where the reader does not have some other mechanism to realize
that an exception was thrown, this leads the reader to believe that the
array is complete when it actually isn't.

Prior to this patch, we disabled AUTO_CLOSE_JSON_CONTENT for JSON-wrapped
SQL result formats in #11685, which fixed an issue where such results
could be erroneously interpreted as complete. This patch fixes a similar
issue with task reports, and all similar issues that may exist elsewhere,
by disabling the feature globally.

* Update test.
2024-02-16 08:52:48 -08:00
YongGang 19ed5c863f
Enhance rolling Supervisor restarts at taskDuration (#15859) 2024-02-14 15:44:34 -08:00
Adarsh Sanjeev 514b3b4d01
Add export capabilities to MSQ with SQL syntax (#15689)
* Add test

* Parser changes to support export statements

* Fix builds

* Address comments

* Add frame processor

* Address review comments

* Fix builds

* Update syntax

* Webconsole workaround

* Refactor

* Refactor

* Change export file path

* Update docs

* Remove webconsole changes

* Fix spelling mistake

* Parser changes, add tests

* Parser changes, resolve build warnings

* Fix failing test

* Fix failing test

* Fix IT tests

* Add tests

* Cleanup

* Fix unparse

* Fix forbidden API

* Update docs

* Update docs

* Address review comments

* Address review comments

* Fix tests

* Address review comments

* Fix insert unparse

* Add external write resource action

* Fix tests

* Add resource check to overlord resource

* Fix tests

* Add IT

* Update syntax

* Update tests

* Update permission

* Address review comments

* Address review comments

* Address review comments

* Add tests

* Add check for runtime parameter for bucket and path

* Add check for runtime parameter for bucket and path

* Add tests

* Update docs

* Fix NPE

* Update docs, remove deadcode

* Fix formatting
2024-02-07 22:08:50 +05:30
Pramod Immaneni 59bca0951a
Parallelize storage of incremental segments (#13982)
During ingestion, incremental segments are created in memory for the different time chunks and persisted to disk when certain thresholds are reached (max number of rows, max memory, incremental persist period etc). In the case where there are a lot of dimension and metrics (1000+) it was observed that the creation/serialization of incremental segment file format for persistence and persisting the file took a while and it was blocking ingestion of new data. This affected the real-time ingestion. This serialization and persistence can be parallelized across the different time chunks. This update aims to do that.

The patch adds a simple configuration parameter to the ingestion tuning configuration to specify number of persistence threads. The default value is 1 if it not specified which makes it the same as it is today.
2024-02-07 10:43:05 +05:30
Sam Wheating 4c58856f10
Fix incorrect ordering of args in log statement (#15846) 2024-02-06 16:12:04 -08:00
AmatyaAvadhanula ef46d88200
Release unneeded append locks after acquiring a new superseding append lock (#15682)
* Fix segment transactional append when publishing with multiple overlapping locks
2024-01-30 16:51:56 +05:30
AmatyaAvadhanula 54d0e482dc
Consolidate RetrieveSegmentsToReplaceAction into RetrieveUsedSegmentsAction (#15699)
Consolidate RetrieveSegmentsToReplaceAction into RetrieveUsedSegmentsAction
2024-01-29 19:18:43 +05:30
Gian Merlino 01e9d963bd
Merge hydrant runners flatly for realtime queries. (#15757)
* Merge hydrant runners flatly for realtime queries.

Prior to this patch, we have two layers of mergeRunners for realtime
queries: one for each Sink (a logical segment) and one across all
Sinks. This is done because to keep metrics and results grouped by Sink,
given that each FireHydrant within a Sink has its own separate storage
adapter.

However, it costs extra memory usage due to the extra layer of
materialization. This is especially pronounced for groupBy queries,
which only use their merge buffers at the top layer of merging. The
lower layer of merging materializes ResultRows directly into the heap,
which can cause heap exhaustion if there are enough ResultRows.

This patch changes to a single layer of merging when bySegment: false,
just like Historicals. To accommodate that, segment metrics like
query/segment/time are now per-FireHydrant instead of per-Sink.

Two layers of merging are retained when bySegment: true. This isn't
common, because it's typically only used when segment level caching
is enabled on the Broker, which is off by default.

* Use SinkQueryRunners.

* Remove unused method.
2024-01-25 19:07:57 +08:00
Karan Kumar c4990f56d6
Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
Abhishek Radhakrishnan 38c1def95a
Kill tasks honor the buffer period of unused segments (#15710)
* Kill tasks should honor the buffer period of unused segments.

- The coordinator duty KillUnusedSegments determines an umbrella interval
 for each datasource to determine the kill interval. There can be multiple unused
segments in an umbrella interval with different used_status_last_updated timestamps.
For example, consider an unused segment that is 30 days old and one that is 1 hour old. Currently
the kill task after the 30-day mark would kill both the unused segments and not retain the 1-hour
old one.

- However, when a kill task is instantiated with this umbrella interval, it’d kill
all the unused segments regardless of the last updated timestamp. We need kill
tasks and RetrieveUnusedSegmentsAction to honor the bufferPeriod to avoid killing
unused segments in the kill interval prematurely.

* Clarify default behavior in docs.

* test comments

* fix canDutyRun()

* small updates.

* checkstyle

* forbidden api fix

* doc fix, unused import, codeql scan error, and cleanup logs.

* Address review comments

* Rename maxUsedFlagLastUpdatedTime to maxUsedStatusLastUpdatedTime

This is consistent with the column name `used_status_last_updated`.

* Apply suggestions from code review

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Make period Duration type

* Remove older variants of runKilLTask() in OverlordClient interface

* Test can now run without waiting for canDutyRun().

* Remove previous variants of retrieveUnusedSegments from internal metadata storage coordinator interface.

Removes the following interface methods in favor of a new method added:
- retrieveUnusedSegmentsForInterval(String, Interval)
- retrieveUnusedSegmentsForInterval(String, Interval, Integer)

* Chain stream operations

* cleanup

* Pass in the lastUpdatedTime to markUnused test function and remove sleep.

---------

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2024-01-18 22:23:50 -08:00
Gian Merlino 764f41d959
Clear "lineSplittable" for JSON when using KafkaInputFormat. (#15692)
* Clear "lineSplittable" for JSON when using KafkaInputFormat.

JsonInputFormat has a "withLineSplittable" method that can be used to
control whether JSON is read line-by-line, or as a whole. The intent
is that in streaming ingestion, "lineSplittable" is false (although it
can be overridden by "assumeNewlineDelimited"), and in batch ingestion,
lineSplittable is true.

When a "json" format is wrapped by a "kafka" format, this isn't set
properly. This patch updates KafkaInputFormat to set this on an
underlying "json" format.

The tests for KafkaInputFormat were overriding the "lineSplittable"
parameter explicitly, which wasn't really fair, because that made them
unrealistic to what happens in production. Now they omit the parameter
and get the production behavior.

* Add test.

* Fix test coverage.
2024-01-18 03:22:41 -08:00
Gian Merlino d3d0c1c91e
Faster parsing: reduce String usage, list-based input rows. (#15681)
* Faster parsing: reduce String usage, list-based input rows.

Three changes:

1) Reworked FastLineIterator to optionally avoid generating Strings
   entirely, and reduce copying somewhat. Benefits the line-oriented
   JSON, CSV, delimited (TSV), and regex formats.

2) In the delimited (TSV) format, when the delimiter is a single byte,
   split on UTF-8 bytes directly.

3) In CSV and delimited (TSV) formats, use list-based input rows when
   the column list is provided upfront by the user.

* Fix style.

* Fix inspections.

* Restore validation.

* Remove fastutil-extra.

* Exception type.

* Fixes for error messages.

* Fixes for null handling.
2024-01-18 19:18:46 +08:00
AmatyaAvadhanula a26defd64b
Clean up stale entries from upgradeSegments table (#15637)
* Clean up stale entries from upgradeSegments table
2024-01-17 20:49:52 +05:30
AmatyaAvadhanula 6b951b94c0
Add new context parameter for using concurrent locks (#15684)
Changes:
- Add new task context flag useConcurrentLocks.
- This can be set for an individual task or at a cluster level using `druid.indexer.task.default.context`.
- When set to true, any appending task would use an APPEND lock and any other
ingestion task would use a REPLACE lock when using time chunk locking.
- If false (default), we fall back on the context flag taskLockType and then useSharedLock.
2024-01-16 12:43:39 +05:30
Kashif Faraz 18d2a8957f
Refactor: Cleanup test impls of ServiceEmitter (#15683) 2024-01-15 17:37:00 +05:30
Abhishek Radhakrishnan 08c01f1dae
Handle and map errors in delete pending segments API (#15673)
Changes:
- Handle exception in deletePendingSegments API and map to correct HTTP status code
- Clean up exception message using `DruidException`
- Add unit tests
2024-01-15 10:09:01 +05:30
PANKAJ KUMAR 047c7340ab
Adding retries to update the metadata store instead of failure (#15141)
Currently, If 2 tasks are consuming from the same partitions, try to publish the segment and update the metadata, the second task can fail because the end offset stored in the metadata store doesn't match with the start offset of the second task. We can fix this by retrying instead of failing.

AFAIK apart from the above issue, the metadata mismatch can happen in 2 scenarios:

- when we update the input topic name for the data source
- when we run 2 replicas of ingestion tasks(1 replica will publish and 1 will fail as the first replica has already updated the metadata).

Implemented the comparable function to compare the last committed end offset and new Sequence start offset. And return a specific error msg for this.

Add retry logic on indexers to retry for this specific error msg.

Updated the existing test case.
2024-01-10 12:30:54 +05:30
Rishabh Singh 71f5307277
Eliminate Periodic Realtime Segment Metadata Queries: Task Now Publish Schema for Seamless Coordinator Updates (#15475)
The initial step in optimizing segment metadata was to centralize the construction of datasource schema in the Coordinator (#14985). Subsequently, our goal is to eliminate the requirement for regularly executing queries to obtain segment schema information. This task encompasses addressing both realtime and finalized segments.

This modification specifically addresses the issue with realtime segments. Tasks will now routinely communicate the schema for realtime segments during the segment announcement process. The Coordinator will identify the schema alongside the segment announcement and subsequently update the schema for realtime segments in the metadata cache.
2024-01-10 08:55:56 +05:30
Kashif Faraz f7bd5ba4d3
Audit create/update of a supervisor spec (#15636)
Changes
- Audit create or update of a supervisor spec. The purpose of the audit is
to track which user made change to a supervisor and when.
- The audit entry does not contain the entire spec or even a diff of the changes
as this is already captured in the `druid_supervisors` metadata table.
2024-01-08 19:46:05 +05:30
AmatyaAvadhanula c41e99e10c
Do not allocate week granular segments unless requested (#15589)
* Do not allocate week granular segments unless explicitly requested
2024-01-05 12:14:52 +05:30
Kashif Faraz c937068625
Improve polling in segment allocation queue (#15590)
Description
When batchAllocationWaitTime is set to 0, the segment allocation queue is polled continuously even when it is empty. This would take up cpu cycles unnecessarily.

Some existing race conditions would also become more frequent when the batchAllocationWaitTime is 0. This PR tries to better address those race conditions as well.

Changes
Do not reschedule a poll if queue is empty
When a new batch is added to queue, schedule a poll
Simplify keyToBatch map
Handle race conditions better
As soon as a batch starts getting processed, do not add any more requests to it
2024-01-04 17:42:02 +05:30
Abhishek Radhakrishnan 9c7d7fc777
Allow empty inserts and replaces in MSQ. (#15495)
* Allow empty inserts and replace.

- Introduce a new query context failOnEmptyInsert which defaults to false.
- When this context is false (default), MSQE will now allow empty inserts and replaces.
- When this context is true, MSQE will throw the existing InsertCannotBeEmpty MSQ fault.
- For REPLACE ALL over an ALL grain segment, the query will generate a tombstone spanning eternity
which will be removed eventually be the coordinator.
- Add unit tests in MSQInsertTest, MSQReplaceTest to test the new default behavior (i.e., when failOnEmptyInsert = false)
- Update unit tests in MSQFaultsTest to test the non-default behavior (i.e., when failOnEmptyInsert = true)

* Ignore test to see if it's the culprit for OOM

* Add heap dump config

* Bump up -Xmx from 1500 MB to 2048 MB

* Add steps to tarball and collect hprof dump to GHA action

* put back mx to 1500MB to trigger the failure

* add the step to reusable unit test workflow as well

* Revert the temp heap dump & @Ignore changes since max heap size is increased

* Minor updates

* Review comments

1. Doc suggestions
2. Add tests for empty insert and replace queries with ALL grain and limit in the
   default failOnEmptyInsert mode (=false). Add similar tests to MSQFaultsTest with
   failOnEmptyInsert = true, so the query does fail with an InsertCannotBeEmpty fault.
3. Nullable annotation and javadocs

* Add comment
	replace_limit.patch
2024-01-02 13:05:51 -08:00
kaisun2000 a5e9b14be0
Add delay before the peon drops the segments after publishing them (#15373)
Currently in the realtime ingestion (Kafka/Kinesis) case, after publishing the segments, upon acknowledgement from the coordinator that the segments are already placed in some historicals, the peon would unannounce the segments (basically saying the segments are not in this peon anymore to the whole cluster) and drop the segments from cache and sink timeline in one shot.

The in transit queries from the brokers that still thinks the segments are in the peon can get a NullPointer exception when the peon is unsetting the hydrants in the sinks.

The fix would let the peon to wait for a configurable delay period before dropping segments, remove segments from cache etc after the peon unannounce the segments.

This delayed approach is similar to how the historicals handle segments moving out.
2024-01-02 11:08:28 +05:30
Kashif Faraz 9f568858ef
Add logging implementation for AuditManager and audit more endpoints (#15480)
Changes
- Add `log` implementation for `AuditManager` alongwith `SQLAuditManager`
- `LoggingAuditManager` simply logs the audit event. Thus, it returns empty for
all `fetchAuditHistory` calls.
- Add new config `druid.audit.manager.type` which can take values `log`, `sql` (default)
- Add new config `druid.audit.manager.logLevel` which can take values `DEBUG`, `INFO`, `WARN`.
This gets activated only if `type` is `log`.
- Remove usage of `ConfigSerde` from `AuditManager` as audit is not just limited to configs
- Add `AuditSerdeHelper` for a single implementation of serialization/deserialization of
audit payload and other utility methods.
2023-12-19 13:14:04 +05:30
Kashif Faraz feeb4f0fb0
Allocate pending segments at latest committed version (#15459)
The segment allocation algorithm reuses an already allocated pending segment if the new allocation request is made for the same parameters:

datasource
sequence name
same interval
same value of skipSegmentLineageCheck (false for batch append, true for streaming append)
same previous segment id (used only when skipSegmentLineageCheck = false)
The above parameters can thus uniquely identify a pending segment (enforced by the UNIQUE constraint on the sequence_name_prev_id_sha1 column in druid_pendingSegments metadata table).

This reuse is done in order to

allow replica tasks (in case of streaming ingestion) to use the same set of segment IDs.
allow re-run of a failed batch task to use the same segment ID and prevent unnecessary allocations
2023-12-14 16:18:39 +05:30
TestBoost 85af2c8340
only create used and unused segments once to make the test faster (#15533) 2023-12-12 09:31:04 +05:30
Clint Wylie 64fcb32bcf
add native 'array contains element' filter (#15366)
* add native arrayContainsElement filter to use array column element indexes
2023-11-29 03:33:00 -08:00
Clint Wylie 97623b408c
add optional 'castToType' parameter to 'auto' column schema (#15417)
* auto but.. with an expected type
2023-11-28 17:19:23 -08:00
George Shiqi Wu 3d1d26f824
Fix mmless ingestion and index tasks (#15372)
* Fix mmless ingestion and index tasks

* Move comment

* remove dup test
2023-11-28 10:06:07 -05:00
Karan Kumar a0188192de
Fixing failing compaction/parallel index jobs during upgrade due to new actions being available on the overlord. (#15430)
* Fixing failing compaction/parallel index jobs during upgrade due to new actions not available on the overlord.

* Fixing build

* Removing extra space.

* Fixing json getter.

* Review comments.
2023-11-25 13:50:29 +05:30
Kashif Faraz 67c7b6248c
Fix log typos, clean up some kill messages in SeekableStreamSupervisor (#15424)
Changes:
- Fix log `Got end of partition marker for partition [%s] from task [%s] in discoverTasks`
by fixing order of args
- Simplify in-line classes by using lambda
- Update kill task message from `Task [%s] failed to respond to [set end offsets]
 in a timely manner, killing task` to `Failed to set end offsets, killing task`
- Clean up tests
2023-11-24 16:09:10 +05:30
Vishesh Garg 4ab0b71513
Fix missing task failure error message on Overlord caused by MessageBodyWriter not found error on Middle Manager (#15412)
Fixes missing task failure error message on Overlord.

The error message was missing since TaskManagementResource#assignTask API wasn't annotated with @Produces(MediaType.APPLICATION_JSON) resulting in the response being treated as application/octet-stream, that in turn lead to MessageBodyWriter not found error on the middle manager. The exception is not logged on the middle manager itself since it happens even before entering the assignTask function -- while mapping arg Task -> MSQControllerTask.
2023-11-24 11:35:56 +05:30
Sachidananda Maharana 2f269fe065
Returning correct Response Code HTTP 429 when taskQueue reached maxSize (#15409)
Currently when we submit a task to druid and number of currently active tasks has already reached (druid.indexer.queue.maxSize) then 500 ISE is thrown as per shown in the screenshot in #15380.

This fix will return HTTP 429 Too Many Requests(with proper error message) instead of 500 ISE, when we submit a task and queueSize has reached.
2023-11-22 14:27:20 +05:30
Kashif Faraz 4ba3cf5221
Add test to verify sequence name of Kafka task (#15397)
* Add test to verify sequence name of Kafka and Kinesis tasks
2023-11-21 10:17:32 +05:30
AmatyaAvadhanula 77828bead4
Fetch active task payloads from memory (#15377)
The TaskQueue maintains a map of active task ids to tasks, which can be utilized to get active task payloads, before falling back to the metadata store.
2023-11-17 12:19:20 +05:30
AmatyaAvadhanula cdc192d38d
Prevent multiple attempts to publish segments for the same sequence (#14995)
* Prevent a race that may cause multiple attempts to publish segments for the same sequence
2023-11-16 14:21:26 +05:30
Krishna Anandan bedf246ed2
Fixing 1 flaky test in testAPIs() (#15375) 2023-11-15 18:38:20 +05:30
Abhishek Radhakrishnan 2e79fd56a7
MSQ generates tombstones honoring granularity specified in a `REPLACE` query. (#15243)
* MSQ generates tombstones honoring the query's granularity.

This change tweaks to only account for the infinite-interval tombstones.
For finite-interval tombstones, the MSQ query granualrity will be used
which is consistent with how MSQ works.

* more tests and some cleanup.

* checkstyle

* comment edits

* Throw TooManyBuckets fault based on review; add more tests.

* Add javadocs for both methods on reconciling the methods.

* review: Move testReplaceTombstonesWithTooManyBucketsThrowsException to MsqFaultsTest

* remove unused imports.

* Move TooManyBucketsException to indexing package for shared exception handling.

* lower max bucket for tests and fixup count

* Advance and count the iterator.

* checkstyle
2023-11-14 23:35:36 -08:00
George Shiqi Wu 130bfbfc6d
Revert "Separate task lifecycle from kubernetes/location lifecycle (#15133)" (#15346)
This reverts commit dc0b163e19.
2023-11-08 13:12:30 -05:00
George Shiqi Wu a8906b6ea0
Fix k8s task runner failure reporting (#15311)
* Fix k8s task runner failure reporting

* Fix reference

* add jsonignore

* PR changes
2023-11-03 21:28:46 -04:00
Clint Wylie 5d39b94149
allow compaction to work with spatial dimensions (#15321) 2023-11-03 11:27:50 -07:00
Gian Merlino d87d92bc43
Add system fields to input sources. (#15276)
* Add system fields to input sources.

Main changes:

1) The SystemField enum defines system fields "__file_uri", "__file_path",
   and "__file_bucket". They are associated with each input entity.

2) The SystemFieldInputSource interface can be added to any InputSource
   to make it system-field-capable. It sets up serialization of a list
   of configured "systemFields" in the JSON form of the input source, and
   provides a method getSystemFieldValue for computing the value of each
   system field. Cloud object, HDFS, HTTP, and Local now have this.

* Fix various LocalInputSource calls.

* Fix style stuff.

* Fixups.

* Fix tests and coverage.
2023-11-02 10:31:28 -07:00
AmatyaAvadhanula dc3213b05d
Fix used segment retrieval in Kill tasks (#15306)
Fix used segment retrieval in Kill tasks
2023-11-02 19:07:17 +05:30
Alexander Saydakov f1132d20c5
use datasketches-java 4.2.0 (#15257)
* use datasketches-java 4.2.0

* use exclusive mode

* fixed issues raised by CodeQL

* fixed issue raised by spotbugs

* fixed issues raised by intellij

* added missing import

* Update QuantilesSketchKeyCollector search mode and adjust tests.

* Update sizeOf functions and add unit tests

* Add unit tests

---------

Co-authored-by: AlexanderSaydakov <AlexanderSaydakov@users.noreply.github.com>
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
Co-authored-by: Adarsh Sanjeev <adarshsanjeev@gmail.com>
2023-10-26 16:28:33 -07:00
AmatyaAvadhanula 65b69cded4
Filter pending segments upgraded with transactional replace (#15169)
* Filter pending segments upgraded with transactional replace

* Push sequence name filter to metadata query
2023-10-23 21:18:47 +05:30
AmatyaAvadhanula 33fdd770f7
Consider only supervisors with append lock for concurrent transactional replace (#15220)
A SegmentTransactionReplaceAction must only update the mapping of tasks with append locks that are running concurrently. To ensure this, we return the supervisor id only if it has the taskLockType as APPEND in its context.
2023-10-22 14:12:36 +05:30
AmatyaAvadhanula a8febd457c
A Replacing task must read segments created before it acquired its lock (#15085)
* Replacing tasks must read segments created before they acquired their locks
2023-10-19 11:13:07 +05:30
Karan Kumar 953ce79439
Add undocumented taskLockType to MSQ. (#15168)
Patch adds an undocumented parameter taskLockType to MSQ so that we can start enabling this feature for users who are interested in testing the new lock types.
2023-10-17 21:44:04 +05:30
George Shiqi Wu dc0b163e19
Separate task lifecycle from kubernetes/location lifecycle (#15133)
* Separate k8s and druid task lifecycles

* Remove extra log lines

* Fix unit tests

* fix unit tests

* Fix unit tests

* notify listeners on task completion

* Fix unit test

* unused var

* PR changes

* Fix unit tests

* Fix checkstyle

* PR changes
2023-10-17 08:17:43 -07:00
AmatyaAvadhanula d25caaefa4
Add support for streaming ingestion with concurrent replace (#15039)
Add support for streaming ingestion with concurrent replace

---------

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2023-10-13 09:09:03 +05:30
Laksh Singla 5f86072456
Prepare master for Druid 29 (#15121)
Prepare master for Druid 29
2023-10-11 10:33:45 +05:30
AmatyaAvadhanula 40a6dc4631
Optimize used segment fetching in Kill tasks (#15107)
* Optimize used segment fetching in Kill tasks
2023-10-09 17:54:13 +05:30
Xavier Léauté adef2069b1
Make unit tests pass with Java 21 (#15014)
This change updates dependencies as needed and fixes tests to remove code incompatible with Java 21
As a result all unit tests now pass with Java 21.

* update maven-shade-plugin to 3.5.0 and follow-up to #15042
  * explain why we need to override configuration when specifying outputFile
  * remove configuration from dependency management in favor of explicit overrides in each module.
* update to mockito to 5.5.0 for Java 21 support when running with Java 11+
  * continue using latest mockito 4.x (4.11.0) when running with Java 8  
  * remove need to mock private fields
* exclude incorrectly declared mockito dependency from pac4j-oidc
* remove mocking of ByteBuffer, since sealed classes can no longer be mocked in Java 21
* add JVM options workaround for system-rules junit plugin not supporting Java 18+
* exclude older versions of byte-buddy from assertj-core
* fix for Java 19 changes in floating point string representation
* fix missing InitializedNullHandlingTest
* update easymock to 5.2.0 for Java 21 compatibility
* update animal-sniffer-plugin to 1.23
* update nl.jqno.equalsverifier to 3.15.1
* update exec-maven-plugin to 3.1.0
2023-10-03 22:41:21 -07:00
George Shiqi Wu 64754b6799
Allow users to pass task payload via deep storage instead of environment variable (#14887)
This change is meant to fix a issue where passing too large of a task payload to the mm-less task runner will cause the peon to fail to startup because the payload is passed (compressed) as a environment variable (TASK_JSON). In linux systems the limit for a environment variable is commonly 128KB, for windows systems less than this. Setting a env variable longer than this results in a bunch of "Argument list too long" errors.
2023-10-03 14:08:59 +05:30
YongGang 86087cee0a
Fix Peon not fail gracefully (#14880)
* fix Peon not fail gracefully

* move methods to Task interface

* fix checkstyle

* extract to interface

* check runThread nullability

* fix merge conflict

* minor refine

* minor refine

* fix unit test

* increase latch waiting time
2023-09-29 12:39:59 -07:00
AmatyaAvadhanula f7a549123b
Commit segments only when they are covered by active locks (#15027)
* Commit segments only when they are covered by active locks
2023-09-25 13:45:42 +05:30
AmatyaAvadhanula c62193c4d7
Add support for concurrent batch Append and Replace (#14407)
Changes:
- Add task context parameter `taskLockType`. This determines the type of lock used by a batch task.
- Add new task actions for transactional replace and append of segments
- Add methods StorageCoordinator.commitAppendSegments and commitReplaceSegments
- Upgrade segments to appropriate versions when performing replace and append
- Add new metadata table `upgradeSegments` to track segments that need to be upgraded
- Add tests
2023-09-25 07:06:37 +05:30
Kashif Faraz d7c152c82c
Add a TaskReport for "kill" tasks (#15023)
- Add `KillTaskReport` that contains stats for `numSegmentsKilled`,
`numBatchesProcessed`, `numSegmentsMarkedAsUnused`
- Fix bug where exception message had no formatter but was was still being passed some args.
- Add some comments regarding deprecation of `markAsUnused` flag.
2023-09-23 07:44:27 +05:30
Kashif Faraz 409bffe7f2
Rename IMSC.announceHistoricalSegments to commitSegments (#15021)
This commit pulls out some changes from #14407 to simplify that PR.

Changes:
- Rename `IndexerMetadataStorageCoordinator.announceHistoricalSegments` to `commitSegments`
- Rename the overloaded method to `commitSegmentsAndMetadata`
- Fix some typos
2023-09-21 16:19:03 +05:30
AmatyaAvadhanula 0e3df2d2e9
Clean up stale locks if segment allocation fails (#14966)
* Clean up stale locks if segment allocation fails due to an exception
2023-09-14 14:58:02 +05:30
Clint Wylie 891f0a3fe9
longer compatibility window for nested column format v4 (#14955)
changes:
* add back nested column v4 serializers
* 'json' schema by default still uses the newer 'nested common format' used by 'auto', but now has an optional 'formatVersion' property which can be specified to override format versions on native ingest jobs
* add system config to specify default column format stuff, 'druid.indexing.formats', and property 'druid.indexing.formats.nestedColumnFormatVersion' to specify system level preferred nested column format for friendly rolling upgrades from versions which do not support the newer 'nested common format' used by 'auto'
2023-09-12 14:07:53 -07:00
George Shiqi Wu f773d83914
Mixed task runner for migration to mm-less ingestion (#14918)
* save work

* Working

* Fix runner constructor

* Working runner

* extra log lines

* try using lifecycle for everything

* clean up configs

* cleanup /workers call

* Use a single config

* Allow selecting runner

* debug changes

* Work on composite task runner

* Unit tests running

* Add documentation

* Add some javadocs

* Fix spelling

* Use standard libraries

* code review

* fix

* fix

* use taskRunner as string

* checkstyl

---------

Co-authored-by: Suneet Saldanha <suneet@apache.org>
2023-09-11 18:09:46 -07:00
Kashif Faraz 289ee1e011
Refactor: Cleanup NoopTask (#14938)
Changes:
- Simplify static `create` methods for `NoopTask`
- Remove `FirehoseFactory`, `IsReadyResult`, `readyTime` from `NoopTask`
as these fields were not being used anywhere
- Update tests
2023-09-05 09:15:41 +05:30
panhongan d4e972e1e4
Add checking for new checkpoint (#14353)
Check that a checkpoint is non-empty before adding it to the checkpoint sequence 
in a SeekableStreamSupervisor
2023-09-04 13:18:55 +05:30