13368 Commits

Author SHA1 Message Date
Charles Smith
0403e48266
window functions docs (#14739)
* draft window functions

* Apply suggestions from code review

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* address comments

* remove default column

* Update docs/querying/sql-window-functions.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/querying/sql-window-functions.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* fix ntile

* remove default header column

* code tics to remove spelling errors

* add known issues, add SUM example

* Apply suggestions from code review

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* address spelling

* remove extra chars

* add to sidebar, fix admonition

* Update sql-window-functions.md

accept suggestion, change admonition style

* update sidebar

* Delete Untitled.ipynb

rm unwanted file

* Update docs/querying/sql-window-functions.md

* Update docs/querying/sql-window-functions.md

* update context param, accept suggestions

* accept suggestions

* Apply suggestions from code review

* Fix known issues

* require GROUP BY, explain order of operation

* accept suggestions

* fix spelling

---------

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2023-11-06 11:34:42 -08:00
Abhishek Radhakrishnan
2136dc3591
Batch segment retrieval from the metadata store (#15305)
* Add a unit test that fails when used segments with too many intervals are retrieved.

- This is a failing test case that needs to be ignored.

* Batch the intervals (use 100 as it's consistent with batching in other places).

* move the filtering inside the batch

* Account for the limit across batch splits.

* Adjustments

* Fixup and add tests

* small refactor

* add more tests.

* remove wrapper.

* Minor edits

* assert out of range
2023-11-06 11:30:24 -08:00
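A minimal Java sketch of the batching idea in #15305, not the actual Druid code. It assumes a hypothetical `fetchBatch` function that stands in for one metadata store query, and it treats a negative limit as "no limit". The batch size of 100 comes from the commit message; the class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class BatchedRetrieval {
    // 100 is the batch size named in the commit message.
    static final int BATCH_SIZE = 100;

    /**
     * Hypothetical helper: calls fetchBatch once per batch of intervals and
     * stops once `limit` results are collected. A negative limit means no limit.
     */
    static <I, S> List<S> retrieve(
            List<I> intervals, int limit, Function<List<I>, List<S>> fetchBatch) {
        List<S> results = new ArrayList<>();
        for (int start = 0; start < intervals.size(); start += BATCH_SIZE) {
            if (limit >= 0 && results.size() >= limit) {
                break; // limit already satisfied by earlier batches
            }
            int end = Math.min(start + BATCH_SIZE, intervals.size());
            for (S segment : fetchBatch.apply(intervals.subList(start, end))) {
                if (limit >= 0 && results.size() >= limit) {
                    break; // limit reached in the middle of a batch
                }
                results.add(segment);
            }
        }
        return results;
    }
}
```

Checking the limit both before each batch and inside it is what lets the limit carry across batch boundaries without issuing queries that are no longer needed.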
Abhishek Agarwal
4b64a5693b
Move service specific JVM parameters to the right in tests (#15325)
Historical OOMs were not getting dumped into /shared/logs because common JVM flags will override service-specific JVM flags. This PR fixes that and also removes unnecessary overrides in historical.
2023-11-06 15:45:59 +05:30
Atul Mohan
ff7de49015
Consolidate and reduce dependency footprint for iceberg extension (#15280)
* Consolidate and reduce dependency footprint

* Fix dependency analysis
2023-11-06 12:17:32 +05:30
Rishabh Singh
8c802e4c9b
Relocating Table Schema Building: Shifting from Brokers to Coordinator for Improved Efficiency (#14985)
In the current design, brokers query both data nodes and tasks to fetch the schema of the segments they serve. The table schema is then constructed by combining the schemas of all segments within a datasource. However, this approach leads to a high number of segment metadata queries during broker startup, resulting in slow startup times and various issues outlined in the design proposal.

To address these challenges, we propose centralizing the table schema management process within the coordinator. This change is the first step in that direction. In the new arrangement, the coordinator will take on the responsibility of querying both data nodes and tasks to fetch segment schema and subsequently building the table schema. Brokers will now simply query the Coordinator to fetch table schema. Importantly, brokers will still retain the capability to build table schemas if the need arises, ensuring both flexibility and resilience.
2023-11-04 19:33:25 +05:30
George Shiqi Wu
a8906b6ea0
Fix k8s task runner failure reporting (#15311)
* Fix k8s task runner failure reporting

* Fix reference

* add jsonignore

* PR changes
2023-11-03 21:28:46 -04:00
Clint Wylie
5d39b94149
allow compaction to work with spatial dimensions (#15321) 2023-11-03 11:27:50 -07:00
Laksh Singla
0cc8839a60
Allow casted literal values in SQL functions accepting literals (Part 2) (#15316) 2023-11-03 21:22:19 +05:30
Tts-233
f39a778f7d
Fix 404 URL about native query (#15324) 2023-11-03 08:39:59 -07:00
Gian Merlino
98f1eb8ede
Use filters for pruning properly for hash-joins. (#15299)
* Use filters for pruning properly for hash-joins.

Native used them too aggressively: it might use filters for the RHS
to prune the LHS. MSQ used them not at all. Now, both use them properly,
pruning based on base (LHS) columns only.

* Fix tests.

* Fix style.

* Clear filterFields too.

* Update.
2023-11-03 07:29:16 -07:00
Karan Kumar
5036af6fb3
Doc fixes for query from deep storage and MSQ (#15313)
Minor updates to the documentation:

* Added prerequisites.
* Removed a known issue in MSQ since it's no longer valid.

---------

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
2023-11-03 10:52:20 +05:30
Adarsh Sanjeev
9576fd3141
HllSketch Merge Aggregator optimizations (#15162)
* Null byte serde for empty sketches

* Cache for HllSketchMerge

* Check for empty sketches

* Address review comments

* Revert changes to HllSketchHolder

* Handle null sketch holders instead of null sketches

* Add unit test for MSQ HllSketch

* Add comments

* Fix style
2023-11-03 11:01:22 +08:00
cristian-popa
fb260f3e41
docs: LDAP trust store property clarification (#15028)
Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
2023-11-02 13:00:08 -07:00
Gian Merlino
d87d92bc43
Add system fields to input sources. (#15276)
* Add system fields to input sources.

Main changes:

1) The SystemField enum defines system fields "__file_uri", "__file_path",
   and "__file_bucket". They are associated with each input entity.

2) The SystemFieldInputSource interface can be added to any InputSource
   to make it system-field-capable. It sets up serialization of a list
   of configured "systemFields" in the JSON form of the input source, and
   provides a method getSystemFieldValue for computing the value of each
   system field. Cloud object, HDFS, HTTP, and Local now have this.

* Fix various LocalInputSource calls.

* Fix style stuff.

* Fixups.

* Fix tests and coverage.
2023-11-02 10:31:28 -07:00
AmatyaAvadhanula
dc3213b05d
Fix used segment retrieval in Kill tasks (#15306)
Fix used segment retrieval in Kill tasks
2023-11-02 19:07:17 +05:30
Clint Wylie
d261587f4a
explicit outputType for ExpressionPostAggregator, better documentation for the differences between arrays and mvds (#15245)
* better documentation for the differences between arrays and mvds
* add outputType to ExpressionPostAggregator to make docs true
* add output coercion if outputType is defined on ExpressionPostAgg
* updated post-aggregations.md to be consistent with aggregations.md and filters.md and use tables
2023-11-02 00:31:37 -07:00
Adarsh Sanjeev
22443ab87e
Fix an issue with passing order by and limit to realtime tasks (#15301)
When MSQ runs queries against real-time tasks, queries that order by certain columns fail.

If the query orders by a non-time column, it is planned successfully because MSQ supports that ordering. However, it throws an exception once it is passed to real-time tasks, because the native query stack does not support it. This PR resolves the issue by removing the ordering from the query before contacting real-time tasks.

* Fixes a bug in MSQ when reading data from real-time tasks with non-time ordering
2023-11-02 11:38:26 +05:30
Laksh Singla
b82ad59dfe
Better logging in ServiceClientImpl (#15269)
ServiceClientImpl logs the cause of every retry, even though we are retrying the connection attempt. This pollutes the logs slightly because the reason for retrying is often the same. It is seen primarily in MSQ, when the controller attempts to connect to a worker task that hasn't launched yet, which can lead to scary-looking messages (at INFO log level) even though they are normal.
This PR changes the logging logic to log every 10 (arbitrary number) retries instead of every retry, to reduce the pollution of the logs.
Note: If there are no retries left, the client returns an exception, which is thrown up by the caller, so this change doesn't hide any important information.
2023-11-02 11:32:49 +05:30
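The throttled retry logging described in #15269 can be sketched roughly as follows. This is an illustration, not the ServiceClientImpl code: `QuietRetrier`, its `logLines` list (a stand-in for a real logger), and the choice to count retries from 1 are all assumptions. Only the interval of 10 comes from the commit message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

final class QuietRetrier {
    // 10 is the (arbitrary) interval named in the commit message.
    static final int LOG_EVERY_N_RETRIES = 10;

    // Stand-in for a real logger so the behavior is easy to inspect.
    final List<String> logLines = new ArrayList<>();

    <T> T call(Callable<T> request, int maxRetries) throws Exception {
        for (int retry = 0; ; retry++) {
            try {
                return request.call();
            } catch (Exception e) {
                if (retry >= maxRetries) {
                    throw e; // no retries left: the caller still sees the failure
                }
                int retryNumber = retry + 1;
                if (retryNumber % LOG_EVERY_N_RETRIES == 0) {
                    logLines.add("retry " + retryNumber + " after: " + e.getMessage());
                }
            }
        }
    }
}
```

Because the final failure is rethrown rather than logged and swallowed, reducing the per-retry log output does not hide the terminal error.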
Gian Merlino
6b6d73b5d4
Use min of scheduler threads and server threads for subquery guardrails. (#15295)
* Use min of scheduler threads and server threads for subquery guardrails.

This allows more memory to be used for subqueries when the query scheduler
is configured to limit queries below the number of server threads. The patch
also refactors the code so SubqueryGuardrailHelper is provided by a Guice
Provider rather than being created by ClientQuerySegmentWalker, to achieve
better separation of concerns.

* Exclude provider from coverage.
2023-11-01 22:34:53 -07:00
Gian Merlino
37e158c2c4
Frames: consider writing singly-valued column when input column hasMultipleValues is UNKNOWN. (#15300)
* Frames: consider writing singly-valued column when input column hasMultipleValues is UNKNOWN.

Prior to this patch, columnar frames would always write multi-valued columns if
the input column had hasMultipleValues = UNKNOWN. This had the effect of flipping
UNKNOWN to TRUE when copying data into frames, which is problematic because TRUE
causes expressions to assume that string inputs must be treated as arrays.

We now avoid this by flipping UNKNOWN to FALSE if no multi-valuedness
is encountered, and flipping it to TRUE if multi-valuedness is encountered.

* Add regression test case.
2023-11-01 22:05:53 -07:00
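The UNKNOWN handling described in #15300 amounts to resolving a tri-state flag from the data actually seen. A minimal Java sketch, assuming a hypothetical `Capable` enum and rows modeled as lists of strings (the real frame writers work on column selectors, not lists):

```java
import java.util.List;

final class MultiValueResolution {
    // Hypothetical tri-state, mirroring hasMultipleValues = TRUE / FALSE / UNKNOWN.
    enum Capable { TRUE, FALSE, UNKNOWN }

    /**
     * Keeps a declared TRUE or FALSE as-is. For UNKNOWN, returns TRUE only if
     * some row actually holds more than one value, and FALSE otherwise.
     */
    static Capable resolve(Capable declared, List<List<String>> rows) {
        if (declared != Capable.UNKNOWN) {
            return declared;
        }
        for (List<String> row : rows) {
            if (row.size() > 1) {
                return Capable.TRUE; // multi-valuedness encountered
            }
        }
        return Capable.FALSE; // nothing multi-valued was seen
    }
}
```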
Charles Smith
de557a62ad
Suggest adoption of Google Style guide (#14905)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2023-11-01 13:31:03 -07:00
Charles Smith
3860052de0
remove references to Jupyter notebooks within the Druid repo (#15143)
Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>
2023-11-01 13:17:06 -07:00
Katya Macedo
935050bf43
docs: Dynamic config cleanup (#15265)
Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
2023-11-01 11:22:33 -07:00
Sergio Ferragut
c9c3df204e
Redirect to new jupyter notebook project (#15136) 2023-11-01 08:38:40 -07:00
Laksh Singla
2ea7177f15
Allow casted literal values in SQL functions accepting literals (#15282)
Functions that accept literals also allow casted literals. This shouldn't have an impact on the queries that the user writes. It enables the SQL functions to accept explicit casts, which are required with JDBC.
2023-11-01 10:38:48 +05:30
George Shiqi Wu
49e0cba7ba
Fix dockerfile for druid image (#15264)
Fixes docker image build issues with apache/druid.
2023-11-01 09:55:54 +05:30
317brian
436ded3d78
docs: durable storage azure cleanup (#15120)
Co-authored-by: Laksh Singla <lakshsingla@gmail.com>
2023-10-31 15:20:38 -07:00
Katya Macedo
a43ffbdf2b
[Docs] Improvements to JSON-based batch Ingestion page (#15286) 2023-10-31 14:50:45 -07:00
317brian
87695410ac
docs: blurb about msq union all (#15223) 2023-10-31 14:15:38 -07:00
Suneet Saldanha
e6b7c36e74
LoadRules with 0 replicas should be treated as handoff complete (#15274)
* LoadRules with 0 replicas should be treated as handoff complete

* fix it

* pr feedback

* fixit
2023-10-30 10:42:58 -07:00
George Shiqi Wu
3173093415
Handle status failures for streaming supervisors (#15174)
* Cleanup logic

* newline

* remove whitespace

* Fix log message

* Add test class

* PR changes
2023-10-30 10:21:23 -07:00
Vishesh Garg
a27598a487
Segregate advance and advanceUninterruptibly flow in postJoinCursor to allow for interrupts in advance (#15222)
Currently, the advance function in postJoinCursor calls advanceUninterruptibly, which in turn keeps calling baseCursor.advanceUninterruptibly until the post-join condition matches, without checking for interrupts. This causes the CPU to hit 100% without giving the query a chance to be cancelled.

With this change, the call flows of advance and advanceUninterruptibly are separated so that they call baseCursor.advance and baseCursor.advanceUninterruptibly, respectively. In the former case, this gives interrupts a chance to be handled between successive calls to baseCursor.advance.
2023-10-30 14:39:15 +05:30
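The separation described in #15222 can be illustrated with a toy cursor. This is a sketch under assumptions, not the postJoinCursor code: `FilteringCursor` is hypothetical, an `int[]` stands in for the base cursor, an `IntPredicate` stands in for the post-join condition, and a plain `InterruptedException` stands in for whatever exception the query engine actually raises.

```java
import java.util.function.IntPredicate;

final class FilteringCursor {
    private final int[] values;          // stand-in for the base cursor's rows
    private final IntPredicate matches;  // stand-in for the post-join condition
    private int position = -1;

    FilteringCursor(int[] values, IntPredicate matches) {
        this.values = values;
        this.matches = matches;
    }

    boolean isDone() {
        return position >= values.length;
    }

    int current() {
        return values[position];
    }

    /** Interruptible path: checks the interrupt flag before every base step. */
    void advance() throws InterruptedException {
        do {
            if (Thread.interrupted()) {
                throw new InterruptedException("advance interrupted");
            }
            position++;
        } while (!isDone() && !matches.test(values[position]));
    }

    /** Uninterruptible path: never checks for interrupts. */
    void advanceUninterruptibly() {
        do {
            position++;
        } while (!isDone() && !matches.test(values[position]));
    }
}
```

With a highly selective condition, the loop in `advance` may run for a long time, so the per-step interrupt check is what gives a cancelled query a chance to stop.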
Ben Sykes
275c1ec64c
Fix error assuming a Complex Type that is a Number is a double (#15272)
* Fix error assuming a Complex Type that is a Number is a double
In the case where a complex type is a number, it may not be castable to double. It can safely be cast to Number first to get the doubleValue.
2023-10-30 09:52:52 +05:30
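The pitfall behind #15272 is plain Java behavior: casting an `Object` directly to `double` only works when the object is a `Double`, while going through `Number` works for any numeric type. A small sketch, with `NumericCoercion` as a hypothetical helper name:

```java
final class NumericCoercion {
    /**
     * Converts any Number (Long, BigDecimal, a numeric complex type, ...) to a
     * double. A direct cast such as (double) value would throw
     * ClassCastException for anything that is not a Double.
     */
    static double toDouble(Object value) {
        if (value instanceof Number) {
            return ((Number) value).doubleValue();
        }
        throw new IllegalArgumentException("not a number: " + value);
    }
}
```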
Vishesh Garg
039b05585c
Add worker status and duration metrics in live and task reports (#15180)
Add worker status and duration metrics in live and task reports for tracking.
2023-10-30 09:43:22 +05:30
Zoltan Haindrich
f4a74710e6
Process pure ordering changes with windowing operators (#15241)
- adds a new query build path, DruidQuery#toScanAndSortQuery, which:
  - builds a ScanQuery without considering the current ordering
  - builds an operator to execute the sort
- fixes a null string to "null" literal string conversion in the frame serializer code
- fixes some DrillWindowQueryTest cases
- fixes an NPE in NaiveSortOperator in case there was no input
- re-enables CoreRules.AGGREGATE_REMOVE
- adds a processing-level OffsetLimit class and uses that instead of just the limit in the rac parts
- earlier, window expressions on top of a subquery with an offset may have ignored the offset
2023-10-29 16:40:49 +05:30
317brian
737947754d
docs: add concurrent compaction docs (#15218)
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2023-10-27 10:29:34 -07:00
kaisun2000
60c2ad597a
Enhance json parser error logging to better track Istio Proxy error message (#15176)
Currently, communication between Druid services via REST endpoints uses JSON-formatted payloads. On a parsing error, there is only a generic exception stating the expected JSON token type and the current JSON token type. There is no detailed error log about the payload content that caused the violation.

In the microservice world, the trend is to deploy Druid servers in k8s with a mesh network. Often the Istio proxy or another proxy is used to intercept the network connections between Druid servers. The proxy may return error messages for various reasons, and the JSON parser does not expect these messages. The generic error message from Druid can be very misleading, as the user may think it is based on the response from the other Druid server.

For example, this is a mysterious error message:

QueryInterruptedException{msg=Next token wasn't a START_ARRAY, was[VALUE_STRING] from url[http://xxxxx:8088/druid/v2/], code=Unknown exception, class=org.apache.druid.java.util.common.IAE, host=xxxxx:8088}

The actual message, sent by the proxy when it can't tunnel the network connection, is the following:

upstream connect error or disconnect/reset before header

So this very simple PR just enhances the logging so that the real underlying message is printed. This would save a lot of head-scratching time when Druid is deployed with a mesh network.

Co-authored-by: Kai Sun <kai.sun@salesforce.com>
2023-10-27 14:20:19 +05:30
Laksh Singla
7c8e841362
Suppress CVE's in master (#15231) 2023-10-27 09:29:18 +05:30
Simon Hofbauer
e9b7e4a0eb
fix JSON flaky tests (#15261)
Co-authored-by: simonh5 <simonh5@illinois.edu>
2023-10-26 20:27:09 -07:00
Alexander Saydakov
f1132d20c5
use datasketches-java 4.2.0 (#15257)
* use datasketches-java 4.2.0

* use exclusive mode

* fixed issues raised by CodeQL

* fixed issue raised by spotbugs

* fixed issues raised by intellij

* added missing import

* Update QuantilesSketchKeyCollector search mode and adjust tests.

* Update sizeOf functions and add unit tests

* Add unit tests

---------

Co-authored-by: AlexanderSaydakov <AlexanderSaydakov@users.noreply.github.com>
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
Co-authored-by: Adarsh Sanjeev <adarshsanjeev@gmail.com>
2023-10-26 16:28:33 -07:00
David Christle
fc0b940f78
Document the allowed range of announcer maxBytesPerNode (#15063) 2023-10-26 14:51:01 -07:00
Pranav
e7b8e6569b
Updating plugin which has fix for corrupt nodejs pkg (#15259) 2023-10-25 21:49:58 -07:00
Zoltan Haindrich
f48263bbb3
Report function name for unknown exceptions during execution (#14987)
* provide function name when unknown exceptions are encountered

* fix keywords/etc

* fix keyword order - regex exercise

* add test

* add check&fix keywords

* decoupledIgnore

* Revert "decoupledIgnore"

This reverts commit e922c820a7d563ca49c9c686644bed967c42cb4b.

* unpatch Function

* move to a different location

* checkstyle
2023-10-25 13:37:30 -07:00
YongGang
7a25ee4fd9
Ability to send task types to k8s or worker task runner (#15196)
* Ability to send task types to k8s or worker task runner

* add more tests

* use runnerStrategy to determine task runner

* minor refine

* refine runner strategy config

* move workerType config to upper level

* validate config when application start
2023-10-25 09:55:56 -07:00
Laksh Singla
207398a47d
Initialize null handling in CompressedBigDecimalAggregatorTimeseriesTestBase to fix failing test (#15252) 2023-10-25 20:26:46 +05:30
Adarsh Sanjeev
c5fa649ea5
Rename segment load wait parameter (#15251) 2023-10-25 18:08:37 +05:30
Zoltan Haindrich
6784e9c507
Fix summary row issues in case postaggregations are happening (#15232)
* fix-1/2

* add message v1

* extend test to cover for IOB issue

* move stuff around

* change message

* fix testcase string

* compute postaggs (thank you Clint!)

* enable feature for test

* ignore tests in msq

---------

Co-authored-by: Soumyava Das <soumyava@users.noreply.github.com>
2023-10-24 20:33:59 -07:00
Soumyava
06f40a0019
remove calcite AggregateRemoveRule to fix nested group by query with order by in outer query (#15237)
* Fixing nested group by query with order by in outer query

* Adding examples
2023-10-24 15:30:13 -07:00
Clint Wylie
4149c9422c
cleanup temp files for nested column serializer (#15236)
* cleanup temp files for nested column serializer

* fix style

* fix tests in default value mode
2023-10-24 15:30:00 -07:00
Abhishek Radhakrishnan
63e3e9531d
Update S3 retry logic to account for the underlying cause in case of IOException (#15238)
* Update S3 retry logic based on the underlying cause in case of IOException.

For instance, 4xx and other errors wrapped in an IOException aren't retriable.

* Fix CI
2023-10-24 15:04:42 -07:00
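A rough Java sketch of the idea in #15238: when an IOException wraps another error, decide retriability from the underlying cause. This is not the Druid S3 code. `ServiceException` is a hypothetical stand-in for an SDK exception that carries an HTTP status, and treating only 5xx as retriable is an assumption made for the example.

```java
import java.io.IOException;

final class RetryPredicate {
    /** Hypothetical stand-in for an SDK exception carrying an HTTP status code. */
    static final class ServiceException extends RuntimeException {
        final int statusCode;

        ServiceException(int statusCode) {
            super("status " + statusCode);
            this.statusCode = statusCode;
        }
    }

    static boolean isRetriable(Throwable t) {
        if (t instanceof IOException && t.getCause() != null) {
            // Judge the wrapped cause rather than the IOException wrapper.
            return isRetriable(t.getCause());
        }
        if (t instanceof ServiceException) {
            // Assumption for this sketch: server errors retry, 4xx does not.
            return ((ServiceException) t).statusCode >= 500;
        }
        // A bare IOException (for example a connection reset) is retried.
        return t instanceof IOException;
    }
}
```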