Description:
Task action audit logging was first deprecated and disabled by default in Druid 0.13, #6368.
As called out in the original discussion #5859, there are several drawbacks to persisting task action audit logs.
- Only usage of the task audit logs is to serve the API `/indexer/v1/task/{taskId}/segments`
which returns the list of segments created by a task.
- The use case is really narrow and no prod clusters really use this information.
- There can be better ways of obtaining this information, such as the metric
`segment/added/bytes` which reports both the segment ID and task ID
when a segment is committed by a task. We could also include committed segment IDs in task reports.
- A task persisting several segments would bloat up the audit logs table putting unnecessary strain
on metadata storage.
Changes:
- Remove `TaskAuditLogConfig`
- Remove method `TaskAction.isAudited()`. No task action is audited anymore.
- Remove `SegmentInsertAction` as it is not used anymore. `SegmentTransactionalInsertAction`
is the new incarnation which has been in use for a while.
- Deprecate `MetadataStorageActionHandler.addLog()` and `getLogs()`. These are not used anymore
but need to be retained for backward compatibility of extensions.
- Do not create `druid_taskLog` metadata table anymore.
* Initial support for bootstrap segments.
- Adds a new API in the coordinator.
- All processes that have storage locations configured (including tasks)
talk to the coordinator if they can, and fetch bootstrap segments from it.
- Then load the segments onto the segment cache as part of startup.
- This addresses the segment bootstrapping logic required by processes before
they can start serving queries or ingesting.
This patch also lays the foundation to speed up upgrades.
* Fail open by default if there are any errors talking to the coordinator.
* Add test for failure scenario and cleanup logs.
* Cleanup and add debug log
* Assert the events so we know the list exactly.
* Revert RunRules test.
The rules aren't evaluated if there are no clusters.
* Revert RunRulesTest too.
* Remove debug info.
* Make the API POST and update log.
* Fix up UTs.
* Throw 503 from MetadataResource; clean up exception handling and DruidException.
* Remove unused logger, add verification of metrics and docs.
* Update error message
* Update server/src/main/java/org/apache/druid/server/coordination/SegmentLoadDropHandler.java
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
* Apply suggestions from code review
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
* Adjust test metric expectations with the rename.
* Add BootstrapSegmentResponse container in the response for future extensibility.
* Rename to BootstrapSegmentsInfo for internal consistency.
* Remove unused log.
* Use a member variable for broadcast segments instead of segmentAssigner.
* Minor cleanup
* Add test for loadable bootstrap segments and clarify comment.
* Review suggestions.
---------
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
* Simplify serialized form of JsonInputFormat.
Use JsonInclude for keepNullColumns, assumeNewlineDelimited, and
useJsonNodeReader. Because the default value of keepNullColumns is
variable, we store the original configured value rather than the
derived value, and include if the original value is nonnull.
* Fix test.
This PR logs the segment type and reason chosen. It also adds it to the query report, to be displayed in the UI.
This PR adds a new section to the reports, segmentReport. This contains the segment type created, if the query is an ingestion, and null otherwise.
* Mark used and unused APIs by versions.
* remove the conditional invocations.
* isValid() and test updates.
* isValid() and tests.
* Remove warning logs for invalid user requests. Also, downgrade visibility.
* Update resp message, etc.
* tests and some cleanup.
* Docs draft
* Clarify docs
* Update server/src/main/java/org/apache/druid/server/http/DataSourcesResource.java
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
* Review comments
* Remove default interface methods only used in tests and update docs.
* Clarify javadocs and @Nullable.
* Add more tests.
* Parameterized versions.
---------
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
Merging the work so far. @ektravel , @vogievetsky if there are additional improvements, let's track them & make another pr.
* Refactor streaming ingestion docs
* Update property definition
* Update after review
* Update known issues
* Move kinesis and kafka topics to ingestion, add redirects
* Saving changes
* Saving
* Add input format text
* Update after review
* Minor text edit
* Update example syntax
* Revert back to colon
* Fix merge conflicts
* Fix broken links
* Fix spelling error
* Undocument unused segments retrieval API.
* Mark API deprecated and unstable. Note that it'll be removed.
* Cleanup .spelling entries
* Remove the Unstable annotation
### Description
This pr adds an api for retrieving unused segments for a particular datasource. The api supports pagination by the addition of `limit` and `lastSegmentId` parameters. The resulting unused segments are returned with optional `sortOrder`, `ASC` or `DESC` with respect to the matching segments `id`, `start time`, and `end time`, or not returned in any guarenteed order if `sortOrder` is not specified
`GET /druid/coordinator/v1/datasources/{dataSourceName}/unusedSegments?interval={interval}&limit={limit}&lastSegmentId={lastSegmentId}&sortOrder={sortOrder}`
Returns a list of unused segments for a datasource in the cluster contained within an optionally specified interval.
Optional parameters for limit and lastSegmentId can be given as well, to limit results and enable paginated results.
The results may be sorted in either ASC, or DESC order depending on specifying the sortOrder parameter.
`dataSourceName`: The name of the datasource
`interval`: the specific interval to search for unused segments for.
`limit`: the maximum number of unused segments to return information about. This property helps to
support pagination
`lastSegmentId`: the last segment id from which to search for results. All segments returned are > this segment
lexigraphically if sortOrder is null or ASC, or < this segment lexigraphically if sortOrder is DESC.
`sortOrder`: Specifies the order with which to return the matching segments by start time, end time. A null
value indicates that order does not matter.
This PR has:
- [x] been self-reviewed.
- [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
- [x] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
- [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [ ] added integration tests.
- [x] been tested in a test Druid cluster.
In the current design, brokers query both data nodes and tasks to fetch the schema of the segments they serve. The table schema is then constructed by combining the schemas of all segments within a datasource. However, this approach leads to a high number of segment metadata queries during broker startup, resulting in slow startup times and various issues outlined in the design proposal.
To address these challenges, we propose centralizing the table schema management process within the coordinator. This change is the first step in that direction. In the new arrangement, the coordinator will take on the responsibility of querying both data nodes and tasks to fetch segment schema and subsequently building the table schema. Brokers will now simply query the Coordinator to fetch table schema. Importantly, brokers will still retain the capability to build table schemas if the need arises, ensuring both flexibility and resilience.
Currently, after an MSQ query, the web console is responsible for waiting for the segments to load. It does so by checking if there are any segments loading into the datasource ingested into, which can cause some issues, like in cases where the segments would never be loaded, or would end up waiting for other ingests as well.
This PR shifts this responsibility to the controller, which would have the list of segments created.
There is a current issue due to inconsistent metadata between worker and controller in MSQ. A controller can receive one set of segments, which are then marked as unused by, say, a compaction job. The worker would be unable to get the segment information as MetadataResource.
* Add supervisor /resetOffsets API.
- Add a new endpoint /druid/indexer/v1/supervisor/<supervisorId>/resetOffsets
which accepts DataSourceMetadata as a body parameter.
- Update logs, unit tests and docs.
* Add a new interface method for backwards compatibility.
* Rename
* Adjust tests and javadocs.
* Use CoreInjectorBuilder instead of deprecated makeInjectorWithModules
* UT fix
* Doc updates.
* remove extraneous debugging logs.
* Remove the boolean setting; only ResetHandle() and resetInternal()
* Relax constraints and add a new ResetOffsetsNotice; cleanup old logic.
* A separate ResetOffsetsNotice and some cleanup.
* Minor cleanup
* Add a check & test to verify that sequence numbers are only of type SeekableStreamEndSequenceNumbers
* Add unit tests for the no op implementations for test coverage
* CodeQL fix
* checkstyle from merge conflict
* Doc changes
* DOCUSAURUS code tabs fix. Thanks, Brian!
* Update to Calcite 1.35.0
* Update from.ftl for Calcite 1.35.0.
* Fixed tests in Calcite upgrade by doing the following:
1. Added a new rule, CoreRules.PROJECT_FILTER_TRANSPOSE_WHOLE_PROJECT_EXPRESSIONS, to Base rules
2. Refactored the CorrelateUnnestRule
3. Updated CorrelateUnnestRel accordingly
4. Fixed a case with selector filters on the left where Calcite was eliding the virtual column
5. Additional test cases for fixes in 2,3,4
6. Update to StringListAggregator to fail a query if separators are not propagated appropriately
* Refactored for testcases to pass after the upgrade, introduced 2 new data sources for handling filters and select projects
* Added a literalSqlAggregator as the upgraded Calcite involved changes to subquery remove rule. This corrected plans for 2 queries with joins and subqueries by replacing an useless literal dimension with a post agg. Additionally a test with COUNT DISTINCT and FILTER which was failing with Calcite 1.21 is added here which passes with 1.35
* Updated to latest avatica and updated code as SqlUnknownTimeStamp is now used in Calcite which needs to be resolved to a timestamp literal
* Added a wrapper segment ref to use for unnest and filter segment reference
* Remove chatAsync parameter, so chat is always async.
chatAsync has been made default in Druid 26. I have seen good
battle-testing of it in production, and am comfortable removing the
older sync client.
This was the last remaining usage of IndexTaskClient, so this patch
deletes all that stuff too.
* Remove unthrown exception.
* Remove unthrown exception.
* No more TimeoutException.
* Claim full support for Java 17.
No production code has changed, except the startup scripts.
Changes:
1) Allow Java 17 without DRUID_SKIP_JAVA_CHECK.
2) Include the full list of opens and exports on both Java 11 and 17.
3) Document that Java 17 is both supported and preferred.
4) Switch some tests from Java 11 to 17 to get better coverage on the
preferred version.
* Doc update.
* Update errorprone.
* Update docker_build_containers.sh.
* Update errorprone in licenses.yaml.
* Add some more run-javas.
* Additional run-javas.
* Update errorprone.
* Suppress new errorprone error.
* Add exports and opens in ForkingTaskRunner for Java 11+.
Test, doc changes.
* Additional errorprone updates.
* Update for errorprone.
* Restore old fomatting in LdapCredentialsValidator.
* Copy bin/ too.
* Fix Java 15, 17 build line in docker_build_containers.sh.
* Update busybox image.
* One more java command.
* Fix interpolation.
* IT commandline refinements.
* Switch to busybox 1.34.1-glibc.
* POM adjustments, build and test one IT on 17.
* Additional debugging.
* Fix silly thing.
* Adjust command line.
* Add exports and opens one more place.
* Additional harmonization of strong encapsulation parameters.