Commit Graph

13679 Commits

Author SHA1 Message Date
Abhishek Radhakrishnan 050b515355
Upgrade CodeQL from v2 to latest v3. (#15619) 2024-01-03 11:31:53 -08:00
Gian Merlino 01eec4a55e
New handling for COALESCE, SEARCH, and filter optimization. (#15609)
* New handling for COALESCE, SEARCH, and filter optimization.

COALESCE is converted by Calcite's parser to CASE, which is largely
counterproductive for us, because it ends up duplicating expressions.
In the current code we end up un-doing it in our CaseOperatorConversion.
This patch has a different approach:

1) Add CaseToCoalesceRule to convert CASE back to COALESCE earlier, before
   the Volcano planner runs, using CaseToCoalesceRule.

2) Add FilterDecomposeCoalesceRule to decompose calls like
   "f(COALESCE(x, y))" into "(x IS NOT NULL AND f(x)) OR (x IS NULL AND f(y))".
   This helps use indexes when available on x and y.

3) Add CoalesceLookupRule to push COALESCE into the third arg of LOOKUP.

4) Add a native "coalesce" function so we can convert 3+ arg COALESCE.

The advantage of this approach is that by un-doing the CASE to COALESCE
conversion earlier, we have flexibility to do more stuff with
COALESCE (like decomposition and pushing into LOOKUP).

SEARCH is an operator used internally by Calcite to represent matching
an argument against some set of ranges. This patch improves our handling
of SEARCH in two ways:

1) Expand NOT points (point "holes" in the range set) from SEARCH as
   `!(a || b)` rather than `!a && !b`, which makes it possible to convert
   them to a "not" of "in" filter later.

2) Generate those nice conversions for NOT points even if the SEARCH
   is not composed of 100% NOT points. Without this change, a SEARCH
   for "x NOT IN ('a', 'b') AND x < 'm'" would get converted like
   "x < 'a' OR (x > 'a' AND x < 'b') OR (x > 'b' AND x < 'm')".

One of the steps we take when generating Druid queries from Calcite
plans is to optimize native filters. This patch improves this step:

1) Extract common ANDed predicates in ConvertSelectorsToIns, so we can
   convert "(a && x = 'b') || (a && x = 'c')" into "a && x IN ('b', 'c')".

2) Speed up CombineAndSimplifyBounds and ConvertSelectorsToIns on
   ORs with lots of children by adjusting the logic to avoid calling
   "indexOf" and "remove" on an ArrayList.

3) Refactor ConvertSelectorsToIns to reduce duplicated code between the
   handling for "selector" and "equals" filters.

* Not so final.

* Fixes.

* Fix test.

* Fix test.
2024-01-03 08:56:22 -08:00
Gian Merlino b0e52c99bb
Fix ColumnSelectorColumnIndexSelector#getColumnCapabilities. (#15614)
* Fix ColumnSelectorColumnIndexSelector#getColumnCapabilities.

It was using virtualColumns.getColumnCapabilities, which only returns
capabilities for virtual columns, not regular columns. The effect of this
is that expression filters (and in some cases, arrayContainsElement filters)
would build value matchers rather than use indexes.

I think this has been like this since #12315, which added the
getColumnCapabilities method to BitmapIndexSelector, and included the same
implementation as exists in the code today.

This error is easy to make due to the design of virtualColumns.getColumnCapabilities,
so to help avoid it in the future, this patch renames the method to
getColumnCapabilitiesWithoutFallback to emphasize that it does not return
capabilities for regular columns.

* Make getColumnCapabilitiesWithoutFallback package-private.

* Fix expression filter bitmap usage.
2024-01-02 21:09:18 -08:00
sensor cfdea06857
Fix `used_flag_last_updated` to `used_status_last_updated` in upgrade-notes.md (#15601)
* Fix `used_flag_last_updated` to `used_status_last_updated` in upgrade-notes.md

* Update docs/release-info/upgrade-notes.md

Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>

---------

Co-authored-by: Benedict Jin <asdf2014@apache.org>
Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>
2024-01-03 11:48:07 +08:00
Abhishek Radhakrishnan f0f428274a
Prometheus config property doc fixup (#15613)
* Minor fixes

* Update docs/development/extensions-contrib/prometheus.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

---------

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2024-01-02 16:28:42 -08:00
Abhishek Radhakrishnan 9c7d7fc777
Allow empty inserts and replaces in MSQ. (#15495)
* Allow empty inserts and replace.

- Introduce a new query context failOnEmptyInsert which defaults to false.
- When this context is false (default), MSQE will now allow empty inserts and replaces.
- When this context is true, MSQE will throw the existing InsertCannotBeEmpty MSQ fault.
- For REPLACE ALL over an ALL grain segment, the query will generate a tombstone spanning eternity
which will be removed eventually be the coordinator.
- Add unit tests in MSQInsertTest, MSQReplaceTest to test the new default behavior (i.e., when failOnEmptyInsert = false)
- Update unit tests in MSQFaultsTest to test the non-default behavior (i.e., when failOnEmptyInsert = true)

* Ignore test to see if it's the culprit for OOM

* Add heap dump config

* Bump up -Xmx from 1500 MB to 2048 MB

* Add steps to tarball and collect hprof dump to GHA action

* put back mx to 1500MB to trigger the failure

* add the step to reusable unit test workflow as well

* Revert the temp heap dump & @Ignore changes since max heap size is increased

* Minor updates

* Review comments

1. Doc suggestions
2. Add tests for empty insert and replace queries with ALL grain and limit in the
   default failOnEmptyInsert mode (=false). Add similar tests to MSQFaultsTest with
   failOnEmptyInsert = true, so the query does fail with an InsertCannotBeEmpty fault.
3. Nullable annotation and javadocs

* Add comment
	replace_limit.patch
2024-01-02 13:05:51 -08:00
Parth Agrawal 8505e8a909
Provide default implementation for RowFunction evalDimension method (#15452)
The PR: #13947 introduced a function evalDimension() in the interface RowFunction.
There was no default implementation added for this interface which causes all the implementations and custom transforms to fail and require to implement their own version of evalDimension method. This PR adds a default implementation in the interface which allows the evalDimension to return value as a Singleton array of eval result.
2024-01-02 11:14:23 +05:30
kaisun2000 a5e9b14be0
Add delay before the peon drops the segments after publishing them (#15373)
Currently in the realtime ingestion (Kafka/Kinesis) case, after publishing the segments, upon acknowledgement from the coordinator that the segments are already placed in some historicals, the peon would unannounce the segments (basically saying the segments are not in this peon anymore to the whole cluster) and drop the segments from cache and sink timeline in one shot.

The in transit queries from the brokers that still thinks the segments are in the peon can get a NullPointer exception when the peon is unsetting the hydrants in the sinks.

The fix would let the peon to wait for a configurable delay period before dropping segments, remove segments from cache etc after the peon unannounce the segments.

This delayed approach is similar to how the historicals handle segments moving out.
2024-01-02 11:08:28 +05:30
Kashif Faraz cce539495d
[Flaky test] Fix basic auth integration test (#15561)
Database slowness while doing audits seems to be causing flakiness in auth ITs.

The failing test is almost always
`ITBasicAuthConfigurationTest.test_avaticaQuery_datasourceAndContextParamsUser`
but in some rare cases, other tests fail too. Alternately, this failing test has been seen to pass too.

It is most likely because the auth changes are not able to propagate in time from
the coordinator to other services.

Fix: Just log the audits rather than persisting them to database.
Most audits have been newly added and it is okay to not have them persisted.
Moreover, logging audits can also be more beneficial while debugging an IT.
2023-12-23 12:11:12 +05:30
AlbericByte a2e65e6a89
Support to pass dynamic values to timestamp Extract function (#15586)
Fixes #15072

Before this modification , the third parameter (timezone) require to be a Literal, it will throw a error when this parameter is column Identifier.
2023-12-21 11:57:52 +05:30
Clint Wylie 8a45efbf65
fix some null handling bugs with vector expression processors (#15587) 2023-12-19 08:14:17 -08:00
Kashif Faraz 9f568858ef
Add logging implementation for AuditManager and audit more endpoints (#15480)
Changes
- Add `log` implementation for `AuditManager` alongwith `SQLAuditManager`
- `LoggingAuditManager` simply logs the audit event. Thus, it returns empty for
all `fetchAuditHistory` calls.
- Add new config `druid.audit.manager.type` which can take values `log`, `sql` (default)
- Add new config `druid.audit.manager.logLevel` which can take values `DEBUG`, `INFO`, `WARN`.
This gets activated only if `type` is `log`.
- Remove usage of `ConfigSerde` from `AuditManager` as audit is not just limited to configs
- Add `AuditSerdeHelper` for a single implementation of serialization/deserialization of
audit payload and other utility methods.
2023-12-19 13:14:04 +05:30
Clint Wylie e373f62692
fix expression post aggregator array handling when grouping wrapper types leak (#15543)
* fix expression post aggregator array handling when grouping wrapper types leak
* more consistent expression function error messaging
2023-12-15 21:43:27 -08:00
Alexander T 90af71b371
router.sh is missing in Druid Distribution (#15547)
Every time we roll out a new version of Druid on our cluster, I recognize that the script for starting the router process is missing. So I added it =)
2023-12-15 10:42:04 -08:00
Jan Werner fa2c8edb5d
unpin snakeyaml, add suppressions and licenses (#15549)
* unpin snakeyaml globally, add suppressions and licenses
* pin snakeyaml in the specific modules that require version 1.x, update licenses and owasp suppression

This removes the pin of the Snakeyaml introduced in:  https://github.com/apache/druid/pull/14519
After the updates of io.kubernetes.java-client and io.confluent.kafka-clients, the only uses of the Snakeyaml 1.x are:
- in test scope, transitive dependency of jackson-dataformat-yaml🫙2.12.7
- in compile scope in contrib extension druid-cassandra-storage
- in compile scope in it-tests. 

With the dependency version un-pinned, io.kubernetes.java-client and io.confluent.kafka-clients bring Snakeyaml versions 2.0 and 2.2, consequently allowing to build a Druid distribution without the contrib-extension and free of vulnerable Snakeyaml versions.
2023-12-15 10:33:14 -08:00
Tom 901ebbb744
Allow for kafka emitter producer secrets to be masked in logs (#15485)
* Allow for kafka emitter producer secrets to be masked in logs instead of being visible

This change will allow for kafka producer config values that should be secrets to not show up in the logs.
This will enhance the security of the people who use the kafka emitter to use this if they want to.
This is opt in and will not affect prior configs for this emitter

* fix checkstyle issue

* change property name
2023-12-15 12:21:21 -05:00
Abhishek Radhakrishnan da6b3cbc51
Detect EXPLAIN PLAN queries in web-console (#15570) 2023-12-15 12:12:03 -05:00
Zoltan Haindrich 7552dc49fb
Reduce amount of expression objects created during evaluations (#15552)
I was looking into a query which was performing a bit poorly because the case_searched was touching more than 1 columns (if there is only 1 column there is a cache based evaluator).
While I was doing that I've noticed that there are a few simple things which could help a bit:

use a static TRUE/FALSE instead of creating a new object every time
create the ExprEval early for ConstantExpr -s (except the one for BigInteger which seem to have some odd contract)
return early from type autodetection
these changes mostly reduce the amount of garbage the query creates during case_searched evaluation; although ExpressionSelectorBenchmark shows some improvements ~15% - but my manual trials on the taxi dataset with 60M rows showed more improvements - probably due to the fact that these changes mostly only reduce gc pressure.
2023-12-15 16:11:59 +05:30
sensor c9be1cb4e8
Clean useless InterruptedException warn in ingestion task log (#15519)
* Clean useless InterruptedException warn in ingestion task log

* test coverage for the code change, manually close the scheduler thread to trigger Interrupt signal

---------

Co-authored-by: Qiong Chen <qiong.chen@shopee.com>
2023-12-15 11:18:53 +08:00
Abhishek Radhakrishnan 9deeb288c5
Update labeler config per v5 spec. (#15564) 2023-12-14 14:00:21 -05:00
Abhishek Radhakrishnan 7fa987dae9
Update labeler to v5 that includes fix where bot doesn't remove labels added by maintainers. (#15558) 2023-12-14 12:10:26 -05:00
Kashif Faraz feeb4f0fb0
Allocate pending segments at latest committed version (#15459)
The segment allocation algorithm reuses an already allocated pending segment if the new allocation request is made for the same parameters:

datasource
sequence name
same interval
same value of skipSegmentLineageCheck (false for batch append, true for streaming append)
same previous segment id (used only when skipSegmentLineageCheck = false)
The above parameters can thus uniquely identify a pending segment (enforced by the UNIQUE constraint on the sequence_name_prev_id_sha1 column in druid_pendingSegments metadata table).

This reuse is done in order to

allow replica tasks (in case of streaming ingestion) to use the same set of segment IDs.
allow re-run of a failed batch task to use the same segment ID and prevent unnecessary allocations
2023-12-14 16:18:39 +05:30
Vishesh Garg e43bb74c3a
Add MSQ Durable Storage Connector for Google Cloud Storage and change current Google Cloud Storage client library (#15398)
The PR addresses 2 things:

    Add MSQ durable storage connector for GCS
    Change GCS client library from the old Google API Client Library to the recommended Google Cloud Client Library. Ref: https://cloud.google.com/apis/docs/client-libraries-explained
2023-12-14 07:34:49 +05:30
AlbericByte 0436edae0c
fix rat and checkstyle issue (#15530)
* fix rat and checkstyle issue

* remove all checks for generated-sources and generated-test-sources
2023-12-14 09:33:01 +08:00
Soumyava 3e15522d6b
Round works correctly on system metadata columns (#15554) 2023-12-13 17:23:14 -08:00
Pranav 81fe855b6f
Update com.github.eirslett to fix bad zip issue (#15556) 2023-12-13 17:22:54 -08:00
Clint Wylie e55f6b6202
remove search auto strategy, estimateSelectivity of BitmapColumnIndex (#15550)
* remove search auto strategy, estimateSelectivity of BitmapColumnIndex

* more cleanup
2023-12-13 16:30:01 -08:00
Vadim Ogievetsky f770eeb8be
Web console: Update webpack-dev-server v3 to v4 (#15555)
* init

* update usage

* revert licenses.yaml

* move the audience-annotations outside of the web console block
2023-12-13 16:16:54 -08:00
zachjsh 857693f5cf
Decorate sampling response with system fields if specified (#15536)
* * decorate sampling response with system fields if specified

* * add unit test
2023-12-13 12:16:59 -08:00
Keerthana Srikanth f32dbd4131
Upgrade pac4j-oidc to 4.5.7 to address CVE-2021-44878 (#15522)
* Upgrade org.pac4j:pac4j-oidc to 4.5.5 to address CVE-2021-44878
* add CVE suppression and notes, since vulnerability scan still shows this CVE
* Add tests to improve coverage
2023-12-13 10:44:05 -08:00
Bartosz Mikulski 4670a7650f
Optional removal of metrics from Prometheus PushGateway on shutdown (#14935)
* Optional removal of metrics from Prometheus PushGateway on shutdown

* Make pushGatewayDeleteOnShutdown property nullable

* Add waitForShutdownDelay property

* Fix unit test

* Address PR comments

* Address PR comments

* Add explanation on why it is useful to have deletePushGatewayMetricsOnShutdown

* Fix spelling error

* Fix spelling error
2023-12-13 11:58:53 -05:00
Zoltan Haindrich 8bc7a5f3ac
Move codeql-config.yml out of the workflows folder (#15553)
Move codeql config file out of the workflows folder so github doesn't try
to run it and fail the github workflow run every time a branch is updated.
2023-12-13 08:37:01 -08:00
AmatyaAvadhanula 48a96f5d06
Better automatic offset reset for Kinesis ingestion (#15338)
Better automatic offset reset for Kinesis ingestion
2023-12-13 12:03:17 +05:30
Parth Agrawal 4ec9a0a7f7
Update Druid version in Tag in pom.xml (#15545)
This PR updates the tag present in pom.xml to match the druid version in pom.xml
This was last updated in 0da8ffc
It seems to me like this was missed in further Druid version upgrades.
2023-12-12 20:18:30 -08:00
Jan Werner 3c7dec56ca
update kubernetes java client to 19.0.0 and docker-java to 3.3.4 (#15449)
Update of direct dependencies:
* kubernetes java-client to 19.0.0
* docker-java-bom to 3.3.4

In order to update transitive dependencies:
* okio to 3.6.0
* bcjava to 1.76

To address CVES:
- CVE-2023-3635 in okio
- CVE-2023-33201 in bcjava

---------

Co-authored-by: Xavier Léauté <xvrl@apache.org>
2023-12-12 14:27:57 -08:00
Xavier Léauté debb6b401c
update core Apache Kafka dependencies to 3.6.1 (#15539)
Release notes: https://downloads.apache.org/kafka/3.6.1/RELEASE_NOTES.html
2023-12-12 14:24:57 -08:00
Soumyava 38f3cf9e65
Fixing a case where datatype mismatch was happenning in join (#15541) 2023-12-12 12:50:32 -08:00
AmatyaAvadhanula 91ca8e73d6
Skip compaction for datasources with partial-eternity segments (#15542)
This PR builds on #13304 to skip compaction for datasources with segments that have their interval start or end coinciding with Eternity interval end-points.

This is needed in order to prevent an issue similar to #13208 as the Coordinator tries to iterate over a large number of intervals when trying to compact an interval with infinite start or end.
2023-12-12 15:06:45 +05:30
Ankit Kothari 8735d023a1
Add experimental support for first/last for double/float/long #10702 (#14462)
Add experimental support for doubleLast, doubleFirst, FloatLast, FloatFirst, longLast and longFirst.
2023-12-12 11:36:51 +05:30
TestBoost 85af2c8340
only create used and unused segments once to make the test faster (#15533) 2023-12-12 09:31:04 +05:30
Clint Wylie e8fcf2cac8
minor doc adjustments (#15531) 2023-12-11 18:22:44 -08:00
Xavier Léauté 6f78049760
remove references to non-existant website maven module (#15540)
The website pom was removed as part of
https://github.com/apache/druid/pull/14411 so we no longer need to
reference it as a module and the profile can be removed.

Dependabot is currently failing trying to look for this module, so
removing it should also fix that.
2023-12-11 16:58:35 -08:00
zachjsh ab7d9bc6ec
Add api for Retrieving unused segments (#15415)
### Description

This pr adds an api for retrieving unused segments for a particular datasource. The api supports pagination by the addition of `limit` and `lastSegmentId` parameters. The resulting unused segments are returned with optional `sortOrder`, `ASC` or `DESC` with respect to the matching segments `id`, `start time`, and `end time`, or not returned in any guarenteed order if `sortOrder` is not specified

`GET /druid/coordinator/v1/datasources/{dataSourceName}/unusedSegments?interval={interval}&limit={limit}&lastSegmentId={lastSegmentId}&sortOrder={sortOrder}`

Returns a list of unused segments for a datasource in the cluster contained within an optionally specified interval.
Optional parameters for limit and lastSegmentId can be given as well, to limit results and enable paginated results.
The results may be sorted in either ASC, or DESC order depending on specifying the sortOrder parameter.

`dataSourceName`: The name of the datasource
`interval`:                 the specific interval to search for unused segments for.
`limit`:                      the maximum number of unused segments to return information about. This property helps to
                                 support pagination
`lastSegmentId`:     the last segment id from which to search for results. All segments returned are > this segment 
                                 lexigraphically if sortOrder is null or ASC, or < this segment lexigraphically if sortOrder is DESC.
`sortOrder`:            Specifies the order with which to return the matching segments by start time, end time. A null
                                value indicates that order does not matter.

This PR has:

- [x] been self-reviewed.
   - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
- [x] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
- [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [ ] added integration tests.
- [x] been tested in a test Druid cluster.
2023-12-11 16:32:18 -05:00
George Shiqi Wu 4152f1d147
Fix empty logs and status messages for mmless ingestion (#15527)
* Fix empty logs and status messages for mmless ingestion

* Add tests
2023-12-11 13:20:45 -05:00
Katya Macedo fc222377ae
[Docs] Document decode_base64_complex and decode_base64_utf8 functions (#15444) 2023-12-11 09:12:06 -08:00
Abhishek Radhakrishnan 96be82a3e6
Clean up duty for non-overlapping eternity tombstones (#15281)
* Add initial draft of MarkDanglingTombstonesAsUnused duty.

* Use overshadowed segments instead of all used segments.

* Add unit test for MarkDanglingSegmentsAsUnused duty.

* Add mock call

* Simplify code.

* Docs

* shorter lines formatting

* metric doc

* More tests, refactor and fix up some logic.

* update javadocs; other review comments.

* Make numCorePartitions as 0 in the TombstoneShardSpec.

* fix up test

* Add tombstone core partition tests

* Update docs/design/coordinator.md

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>

* review comment

* Minor cleanup

* Only consider tombstones with 0 core partitions

* Need to register the test shard type to make jackson happy

* test comments

* checkstyle

* fixup misc typos in comments

* Update logic to use overshadowed segments

* minor cleanup

* Rename duty to eternity tombstone instead of dangling. Add test for full eternity tombstone.

* Address review feedback.

---------

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
2023-12-11 08:57:15 -08:00
Katya Macedo 099a9825d1
[Docs] Add a release notes template (#15333)
* Add release notes template

* Update spellcheck
2023-12-11 11:35:16 +05:30
Clint Wylie 42f2496b7d
fix bug with nested empty array fields (#15532) 2023-12-09 12:20:21 -08:00
Rishabh Singh 54df235026
Lazily build Filter in FilteredAggregatorFactory to avoid parsing exceptions in Router (#15526)
Query with lookups in FilteredAggregator fails with this exception in router,

Cannot construct instance of `org.apache.druid.query.aggregation.FilteredAggregatorFactory`, problem: Lookup [campaigns_lookup[campaignId][is_sold][autodsp]] not found at [Source: (org.eclipse.jetty.server.HttpInputOverHTTP); line: 1, column: 913] (through reference chain: org.apache.druid.query.groupby.GroupByQuery["aggregations"]->java.util.ArrayList[1])
T
he problem is that constructor of FilteredAggregatorFactory is actually validating if the lookup exists in this statement dimFilter.toFilter().
This is failing on the router, which is to be expected, because, the router isn’t assigned any lookups.
The fix is to move to a lazy initialisation of the filter object in the constructor.
2023-12-09 12:18:37 +05:30
Clint Wylie e7c8f2e208
lift restriction of array_to_mv to only support direct column access (#15528) 2023-12-08 16:27:17 -08:00