Commit Graph

13897 Commits

Author SHA1 Message Date
Vadim Ogievetsky 84adb9255e
Web console: Fix spec conversion, expose failOnEmptyInsert (#15627)
* Spec converter should dedupe the columns

* Add "Fail on empty insert" setting to QueryContext toggles
2024-01-08 12:05:12 -08:00
Jonathan Wei 5d1e66b8f9
Allow broker to use catalog for datasource schemas for SQL queries (#15469)
* Allow broker to use catalog for datasource schemas

* More PR comments

* PR comments
2024-01-08 13:46:08 -06:00
317brian 141c214b46
docs: add note about finalizeaggregations for sql-based ingestion (#15631) 2024-01-08 10:06:10 -08:00
Kashif Faraz f7bd5ba4d3
Audit create/update of a supervisor spec (#15636)
Changes
- Audit create or update of a supervisor spec. The purpose of the audit is
to track which user made change to a supervisor and when.
- The audit entry does not contain the entire spec or even a diff of the changes
as this is already captured in the `druid_supervisors` metadata table.
2024-01-08 19:46:05 +05:30
AmatyaAvadhanula 63bfb3e6c9
Handle half-eternity intervals while fetching segments with created dates (#15608)
* Handle all intervals while fetching segments with created dates
2024-01-08 12:07:11 +05:30
Clint Wylie c221a2634b
overhaul DruidPredicateFactory to better handle 3VL (#15629)
* overhaul DruidPredicateFactory to better handle 3VL

fixes some bugs caused by some limitations of the original design of how DruidPredicateFactory interacts with 3-value logic. The primary impacted area was with how filters on values transformed with expressions or extractionFn which turn non-null values into nulls, which were not possible to be modelled with the 'isNullInputUnknown' method

changes:
* adds DruidObjectPredicate to specialize string, array, and object based predicates instead of using guava Predicate
* DruidPredicateFactory now uses DruidObjectPredicate
* introduces DruidPredicateMatch enum, which all predicates returned from DruidPredicateFactory now use instead of booleans to indicate match. This means DruidLongPredicate, DruidFloatPredicate, DruidDoublePredicate, and the newly added DruidObjectPredicate apply methods all now return DruidPredicateMatch. This allows matchers and indexes
* isNullInputUnknown has been removed from DruidPredicateFactory

* rename, fix test

* adjust

* style

* npe

* more test

* fix default value mode to not match new test
2024-01-05 19:08:02 -08:00
sensor 62964e99b1
optimize CI workflow for doc updates (#15617)
* optimize CI workflow for doc updates

* Update .github/workflows/codeql.yml

Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>

* Update .github/workflows/codeql.yml

Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>

---------

Co-authored-by: Benedict Jin <asdf2014@apache.org>
Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>
2024-01-05 17:18:38 -08:00
Gian Merlino 0422d9d507
Fix redundant expansion in SearchOperatorConversion. (#15625)
This logic error causes sarg expansion to happen twice for IN or NOT IN points.
It doesn't affect the final generated native query, because the
redundant expansions gets combined. But it slows down planning, especially
for large NOT IN.
2024-01-05 12:42:12 -08:00
George Shiqi Wu 9fe67958be
Increase ServerConnector accept queue size (#15596)
* Allow overwriting ServerConnector accept queue size

* Use a single config

* Fix spacing

* fix spacing

* fixed value

* read value from environment

* fix spacing

* Unpack value before reading

* check somaxconn on linux only
2024-01-05 12:04:15 -05:00
AmatyaAvadhanula c41e99e10c
Do not allocate week granular segments unless requested (#15589)
* Do not allocate week granular segments unless explicitly requested
2024-01-05 12:14:52 +05:30
Charles Smith d8830b64fc
add style for table formatting to docs contribution (#15612)
Co-authored-by: Benedict Jin <asdf2014@apache.org>
Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
2024-01-04 14:02:32 -08:00
Kashif Faraz c937068625
Improve polling in segment allocation queue (#15590)
Description
When batchAllocationWaitTime is set to 0, the segment allocation queue is polled continuously even when it is empty. This would take up cpu cycles unnecessarily.

Some existing race conditions would also become more frequent when the batchAllocationWaitTime is 0. This PR tries to better address those race conditions as well.

Changes
Do not reschedule a poll if queue is empty
When a new batch is added to queue, schedule a poll
Simplify keyToBatch map
Handle race conditions better
As soon as a batch starts getting processed, do not add any more requests to it
2024-01-04 17:42:02 +05:30
Zoltan Haindrich b9679d0884
Run filter-into-join rule early for subqueries and disable project-filter rule (#15511)
FILTER_INTO_JOIN is mainly run along with the other rules with the Volcano planner; however if the query starts highly underdefined (join conditions in the where clauses) that generic query could give a lot of room for the other rules to play around with only enabled it for when the join uses subqueries for its inputs. 

PROJECT_FILTER rule is not that useful. and could increase planning times by providing new plans. This problem worsened after we started supporting inner joins with arbitrary join conditions in https://github.com/apache/druid/pull/15302
2024-01-04 15:33:45 +05:30
Gian Merlino 5c3391a084
Follow-ups to SEARCH and IN from #15609. (#15623)
- Rename ExprType to BaseType in CollectComparisons, since ExprType is a thing
  that exists elsewhere.
- Remove unused "notInRexNodes" from SearchOperatorConversion.
2024-01-03 22:38:12 -08:00
Clint Wylie f19ece146f
expression virtual column indexes (#15585)
* ExpressionVirtualColumn + indexes = bff. Expression virtual columns can now use indexes of the underlying columns similar to how expression filters
2024-01-03 21:00:39 -08:00
George Shiqi Wu 8e95cea8e5
Azure client upgrade to allow identity options (#15287)
* Include new dependencies

* Mostly implemented

* More azure fixes

* Tests passing

* Unit tests running

* Test running after removing storage exception

* Happy with coverage now

* Add more tests

* fix client factory

* cleanup from testing

* Remove old client

* update docs

* Exclude from spellcheck

* Add licenses

* Fix identity version

* Save work

* Add azure clients

* add licenses

* typos

* Add dependencies

* Exception is not thrown

* Fix intellij check

* Don't need to override

* specify length

* urldecode

* encode path

* Fix checks

* Revert urlencode changes

* Urlencode with azure library

* Update docs/development/extensions-core/azure.md

Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>

* PR changes

* Update docs/development/extensions-core/azure.md

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>

* Deprecate AzureTaskLogsConfig.maxRetries

* Clean up azure retry block

* logic update to reuse clients

* fix comments

* Create container conditionally

* Fix key auth

* Remove container client logic

* Add some more testing

* Update comments

* Add a comment explaining client reuse

* Move logic to factory class

* use bom for dependency management

* fix license versions

---------

Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
2024-01-03 18:36:05 -05:00
Victoria Lim b8060fc93f
docs: Fix broken anchor links (#15621) 2024-01-03 15:28:27 -08:00
Gian Merlino e40b96e026
Reverse lookup fixes and enhancements. (#15611)
* Reverse lookup fixes and enhancements.

1) Add a "mayIncludeUnknown" parameter to DimFilter#optimize. This is important
   because otherwise the reverse-lookup optimization is done improperly when
   the "in" filter appears under a "not", and the lookup extractionFn may return
   null for some possible values of the filtered column. The "includeUnknown" test
   cases in InDimFilterTest illustrate the difference in behavior.

2) Enhance InDimFilter#optimizeLookup to handle "mayIncludeUnknown", and to be able
   to do a reverse lookup in a wider variety of cases.

3) Make "unapply" protected in LookupExtractor, and move callers to "unapplyAll".
   The main reason is that MapLookupExtractor, a common implementation, lacks a
   reverse mapping and therefore does a scan of the map for each call to "unapply".
   For performance sake these calls need to be batched.

* Remove optimize call from BloomDimFilter.

* Follow the law.

* Fix tests.

* Fix imports.

* Switch function.

* Fix tests.

* More tests.
2024-01-03 13:28:44 -08:00
Abhishek Radhakrishnan 050b515355
Upgrade CodeQL from v2 to latest v3. (#15619) 2024-01-03 11:31:53 -08:00
Gian Merlino 01eec4a55e
New handling for COALESCE, SEARCH, and filter optimization. (#15609)
* New handling for COALESCE, SEARCH, and filter optimization.

COALESCE is converted by Calcite's parser to CASE, which is largely
counterproductive for us, because it ends up duplicating expressions.
In the current code we end up un-doing it in our CaseOperatorConversion.
This patch has a different approach:

1) Add CaseToCoalesceRule to convert CASE back to COALESCE earlier, before
   the Volcano planner runs, using CaseToCoalesceRule.

2) Add FilterDecomposeCoalesceRule to decompose calls like
   "f(COALESCE(x, y))" into "(x IS NOT NULL AND f(x)) OR (x IS NULL AND f(y))".
   This helps use indexes when available on x and y.

3) Add CoalesceLookupRule to push COALESCE into the third arg of LOOKUP.

4) Add a native "coalesce" function so we can convert 3+ arg COALESCE.

The advantage of this approach is that by un-doing the CASE to COALESCE
conversion earlier, we have flexibility to do more stuff with
COALESCE (like decomposition and pushing into LOOKUP).

SEARCH is an operator used internally by Calcite to represent matching
an argument against some set of ranges. This patch improves our handling
of SEARCH in two ways:

1) Expand NOT points (point "holes" in the range set) from SEARCH as
   `!(a || b)` rather than `!a && !b`, which makes it possible to convert
   them to a "not" of "in" filter later.

2) Generate those nice conversions for NOT points even if the SEARCH
   is not composed of 100% NOT points. Without this change, a SEARCH
   for "x NOT IN ('a', 'b') AND x < 'm'" would get converted like
   "x < 'a' OR (x > 'a' AND x < 'b') OR (x > 'b' AND x < 'm')".

One of the steps we take when generating Druid queries from Calcite
plans is to optimize native filters. This patch improves this step:

1) Extract common ANDed predicates in ConvertSelectorsToIns, so we can
   convert "(a && x = 'b') || (a && x = 'c')" into "a && x IN ('b', 'c')".

2) Speed up CombineAndSimplifyBounds and ConvertSelectorsToIns on
   ORs with lots of children by adjusting the logic to avoid calling
   "indexOf" and "remove" on an ArrayList.

3) Refactor ConvertSelectorsToIns to reduce duplicated code between the
   handling for "selector" and "equals" filters.

* Not so final.

* Fixes.

* Fix test.

* Fix test.
2024-01-03 08:56:22 -08:00
Gian Merlino b0e52c99bb
Fix ColumnSelectorColumnIndexSelector#getColumnCapabilities. (#15614)
* Fix ColumnSelectorColumnIndexSelector#getColumnCapabilities.

It was using virtualColumns.getColumnCapabilities, which only returns
capabilities for virtual columns, not regular columns. The effect of this
is that expression filters (and in some cases, arrayContainsElement filters)
would build value matchers rather than use indexes.

I think this has been like this since #12315, which added the
getColumnCapabilities method to BitmapIndexSelector, and included the same
implementation as exists in the code today.

This error is easy to make due to the design of virtualColumns.getColumnCapabilities,
so to help avoid it in the future, this patch renames the method to
getColumnCapabilitiesWithoutFallback to emphasize that it does not return
capabilities for regular columns.

* Make getColumnCapabilitiesWithoutFallback package-private.

* Fix expression filter bitmap usage.
2024-01-02 21:09:18 -08:00
sensor cfdea06857
Fix `used_flag_last_updated` to `used_status_last_updated` in upgrade-notes.md (#15601)
* Fix `used_flag_last_updated` to `used_status_last_updated` in upgrade-notes.md

* Update docs/release-info/upgrade-notes.md

Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>

---------

Co-authored-by: Benedict Jin <asdf2014@apache.org>
Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>
2024-01-03 11:48:07 +08:00
Abhishek Radhakrishnan f0f428274a
Prometheus config property doc fixup (#15613)
* Minor fixes

* Update docs/development/extensions-contrib/prometheus.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

---------

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2024-01-02 16:28:42 -08:00
Abhishek Radhakrishnan 9c7d7fc777
Allow empty inserts and replaces in MSQ. (#15495)
* Allow empty inserts and replace.

- Introduce a new query context failOnEmptyInsert which defaults to false.
- When this context is false (default), MSQE will now allow empty inserts and replaces.
- When this context is true, MSQE will throw the existing InsertCannotBeEmpty MSQ fault.
- For REPLACE ALL over an ALL grain segment, the query will generate a tombstone spanning eternity
which will be removed eventually be the coordinator.
- Add unit tests in MSQInsertTest, MSQReplaceTest to test the new default behavior (i.e., when failOnEmptyInsert = false)
- Update unit tests in MSQFaultsTest to test the non-default behavior (i.e., when failOnEmptyInsert = true)

* Ignore test to see if it's the culprit for OOM

* Add heap dump config

* Bump up -Xmx from 1500 MB to 2048 MB

* Add steps to tarball and collect hprof dump to GHA action

* put back mx to 1500MB to trigger the failure

* add the step to reusable unit test workflow as well

* Revert the temp heap dump & @Ignore changes since max heap size is increased

* Minor updates

* Review comments

1. Doc suggestions
2. Add tests for empty insert and replace queries with ALL grain and limit in the
   default failOnEmptyInsert mode (=false). Add similar tests to MSQFaultsTest with
   failOnEmptyInsert = true, so the query does fail with an InsertCannotBeEmpty fault.
3. Nullable annotation and javadocs

* Add comment
	replace_limit.patch
2024-01-02 13:05:51 -08:00
Parth Agrawal 8505e8a909
Provide default implementation for RowFunction evalDimension method (#15452)
The PR: #13947 introduced a function evalDimension() in the interface RowFunction.
There was no default implementation added for this interface which causes all the implementations and custom transforms to fail and require to implement their own version of evalDimension method. This PR adds a default implementation in the interface which allows the evalDimension to return value as a Singleton array of eval result.
2024-01-02 11:14:23 +05:30
kaisun2000 a5e9b14be0
Add delay before the peon drops the segments after publishing them (#15373)
Currently in the realtime ingestion (Kafka/Kinesis) case, after publishing the segments, upon acknowledgement from the coordinator that the segments are already placed in some historicals, the peon would unannounce the segments (basically saying the segments are not in this peon anymore to the whole cluster) and drop the segments from cache and sink timeline in one shot.

The in transit queries from the brokers that still thinks the segments are in the peon can get a NullPointer exception when the peon is unsetting the hydrants in the sinks.

The fix would let the peon to wait for a configurable delay period before dropping segments, remove segments from cache etc after the peon unannounce the segments.

This delayed approach is similar to how the historicals handle segments moving out.
2024-01-02 11:08:28 +05:30
Kashif Faraz cce539495d
[Flaky test] Fix basic auth integration test (#15561)
Database slowness while doing audits seems to be causing flakiness in auth ITs.

The failing test is almost always
`ITBasicAuthConfigurationTest.test_avaticaQuery_datasourceAndContextParamsUser`
but in some rare cases, other tests fail too. Alternately, this failing test has been seen to pass too.

It is most likely because the auth changes are not able to propagate in time from
the coordinator to other services.

Fix: Just log the audits rather than persisting them to database.
Most audits have been newly added and it is okay to not have them persisted.
Moreover, logging audits can also be more beneficial while debugging an IT.
2023-12-23 12:11:12 +05:30
AlbericByte a2e65e6a89
Support to pass dynamic values to timestamp Extract function (#15586)
Fixes #15072

Before this modification , the third parameter (timezone) require to be a Literal, it will throw a error when this parameter is column Identifier.
2023-12-21 11:57:52 +05:30
Clint Wylie 8a45efbf65
fix some null handling bugs with vector expression processors (#15587) 2023-12-19 08:14:17 -08:00
Kashif Faraz 9f568858ef
Add logging implementation for AuditManager and audit more endpoints (#15480)
Changes
- Add `log` implementation for `AuditManager` alongwith `SQLAuditManager`
- `LoggingAuditManager` simply logs the audit event. Thus, it returns empty for
all `fetchAuditHistory` calls.
- Add new config `druid.audit.manager.type` which can take values `log`, `sql` (default)
- Add new config `druid.audit.manager.logLevel` which can take values `DEBUG`, `INFO`, `WARN`.
This gets activated only if `type` is `log`.
- Remove usage of `ConfigSerde` from `AuditManager` as audit is not just limited to configs
- Add `AuditSerdeHelper` for a single implementation of serialization/deserialization of
audit payload and other utility methods.
2023-12-19 13:14:04 +05:30
Clint Wylie e373f62692
fix expression post aggregator array handling when grouping wrapper types leak (#15543)
* fix expression post aggregator array handling when grouping wrapper types leak
* more consistent expression function error messaging
2023-12-15 21:43:27 -08:00
Alexander T 90af71b371
router.sh is missing in Druid Distribution (#15547)
Every time we roll out a new version of Druid on our cluster, I recognize that the script for starting the router process is missing. So I added it =)
2023-12-15 10:42:04 -08:00
Jan Werner fa2c8edb5d
unpin snakeyaml, add suppressions and licenses (#15549)
* unpin snakeyaml globally, add suppressions and licenses
* pin snakeyaml in the specific modules that require version 1.x, update licenses and owasp suppression

This removes the pin of the Snakeyaml introduced in:  https://github.com/apache/druid/pull/14519
After the updates of io.kubernetes.java-client and io.confluent.kafka-clients, the only uses of the Snakeyaml 1.x are:
- in test scope, transitive dependency of jackson-dataformat-yaml🫙2.12.7
- in compile scope in contrib extension druid-cassandra-storage
- in compile scope in it-tests. 

With the dependency version un-pinned, io.kubernetes.java-client and io.confluent.kafka-clients bring Snakeyaml versions 2.0 and 2.2, consequently allowing to build a Druid distribution without the contrib-extension and free of vulnerable Snakeyaml versions.
2023-12-15 10:33:14 -08:00
Tom 901ebbb744
Allow for kafka emitter producer secrets to be masked in logs (#15485)
* Allow for kafka emitter producer secrets to be masked in logs instead of being visible

This change will allow for kafka producer config values that should be secrets to not show up in the logs.
This will enhance the security of the people who use the kafka emitter to use this if they want to.
This is opt in and will not affect prior configs for this emitter

* fix checkstyle issue

* change property name
2023-12-15 12:21:21 -05:00
Abhishek Radhakrishnan da6b3cbc51
Detect EXPLAIN PLAN queries in web-console (#15570) 2023-12-15 12:12:03 -05:00
Zoltan Haindrich 7552dc49fb
Reduce amount of expression objects created during evaluations (#15552)
I was looking into a query which was performing a bit poorly because the case_searched was touching more than 1 columns (if there is only 1 column there is a cache based evaluator).
While I was doing that I've noticed that there are a few simple things which could help a bit:

use a static TRUE/FALSE instead of creating a new object every time
create the ExprEval early for ConstantExpr -s (except the one for BigInteger which seem to have some odd contract)
return early from type autodetection
these changes mostly reduce the amount of garbage the query creates during case_searched evaluation; although ExpressionSelectorBenchmark shows some improvements ~15% - but my manual trials on the taxi dataset with 60M rows showed more improvements - probably due to the fact that these changes mostly only reduce gc pressure.
2023-12-15 16:11:59 +05:30
sensor c9be1cb4e8
Clean useless InterruptedException warn in ingestion task log (#15519)
* Clean useless InterruptedException warn in ingestion task log

* test coverage for the code change, manually close the scheduler thread to trigger Interrupt signal

---------

Co-authored-by: Qiong Chen <qiong.chen@shopee.com>
2023-12-15 11:18:53 +08:00
Abhishek Radhakrishnan 9deeb288c5
Update labeler config per v5 spec. (#15564) 2023-12-14 14:00:21 -05:00
Abhishek Radhakrishnan 7fa987dae9
Update labeler to v5 that includes fix where bot doesn't remove labels added by maintainers. (#15558) 2023-12-14 12:10:26 -05:00
Kashif Faraz feeb4f0fb0
Allocate pending segments at latest committed version (#15459)
The segment allocation algorithm reuses an already allocated pending segment if the new allocation request is made for the same parameters:

datasource
sequence name
same interval
same value of skipSegmentLineageCheck (false for batch append, true for streaming append)
same previous segment id (used only when skipSegmentLineageCheck = false)
The above parameters can thus uniquely identify a pending segment (enforced by the UNIQUE constraint on the sequence_name_prev_id_sha1 column in druid_pendingSegments metadata table).

This reuse is done in order to

allow replica tasks (in case of streaming ingestion) to use the same set of segment IDs.
allow re-run of a failed batch task to use the same segment ID and prevent unnecessary allocations
2023-12-14 16:18:39 +05:30
Vishesh Garg e43bb74c3a
Add MSQ Durable Storage Connector for Google Cloud Storage and change current Google Cloud Storage client library (#15398)
The PR addresses 2 things:

    Add MSQ durable storage connector for GCS
    Change GCS client library from the old Google API Client Library to the recommended Google Cloud Client Library. Ref: https://cloud.google.com/apis/docs/client-libraries-explained
2023-12-14 07:34:49 +05:30
AlbericByte 0436edae0c
fix rat and checkstyle issue (#15530)
* fix rat and checkstyle issue

* remove all checks for generated-sources and generated-test-sources
2023-12-14 09:33:01 +08:00
Soumyava 3e15522d6b
Round works correctly on system metadata columns (#15554) 2023-12-13 17:23:14 -08:00
Pranav 81fe855b6f
Update com.github.eirslett to fix bad zip issue (#15556) 2023-12-13 17:22:54 -08:00
Clint Wylie e55f6b6202
remove search auto strategy, estimateSelectivity of BitmapColumnIndex (#15550)
* remove search auto strategy, estimateSelectivity of BitmapColumnIndex

* more cleanup
2023-12-13 16:30:01 -08:00
Vadim Ogievetsky f770eeb8be
Web console: Update webpack-dev-server v3 to v4 (#15555)
* init

* update usage

* revert licenses.yaml

* move the audience-annotations outside of the web console block
2023-12-13 16:16:54 -08:00
zachjsh 857693f5cf
Decorate sampling response with system fields if specified (#15536)
* * decorate sampling response with system fields if specified

* * add unit test
2023-12-13 12:16:59 -08:00
Keerthana Srikanth f32dbd4131
Upgrade pac4j-oidc to 4.5.7 to address CVE-2021-44878 (#15522)
* Upgrade org.pac4j:pac4j-oidc to 4.5.5 to address CVE-2021-44878
* add CVE suppression and notes, since vulnerability scan still shows this CVE
* Add tests to improve coverage
2023-12-13 10:44:05 -08:00
Bartosz Mikulski 4670a7650f
Optional removal of metrics from Prometheus PushGateway on shutdown (#14935)
* Optional removal of metrics from Prometheus PushGateway on shutdown

* Make pushGatewayDeleteOnShutdown property nullable

* Add waitForShutdownDelay property

* Fix unit test

* Address PR comments

* Address PR comments

* Add explanation on why it is useful to have deletePushGatewayMetricsOnShutdown

* Fix spelling error

* Fix spelling error
2023-12-13 11:58:53 -05:00
Zoltan Haindrich 8bc7a5f3ac
Move codeql-config.yml out of the workflows folder (#15553)
Move codeql config file out of the workflows folder so github doesn't try
to run it and fail the github workflow run every time a branch is updated.
2023-12-13 08:37:01 -08:00