Commit Graph

14474 Commits

Author SHA1 Message Date
Adarsh Sanjeev 73ff9f9047
Convert MSQTerminalStageSpecFactory into an interface (#16996)
* Convert MSQTerminalStageSpecFactory into an interface

* Rename class and remove useless constructor
2024-09-06 09:56:35 +05:30
Virushade 476b205efa
Docs: Fix language in Schema Design docs (#17010) 2024-09-06 08:48:00 +05:30
Gian Merlino 175636b28f
Frame writers: Coerce numeric and array types in certain cases. (#16994)
This patch adds "TypeCastSelectors", which is used when writing frames to
perform two coercions:

- When a numeric type is desired and the underlying type is non-numeric or
  unknown, the underlying selector is wrapped, "getObject" is called and the
  result is coerced using "ExprEval.ofType". This differs from the prior
  behavior where the primitive methods like "getLong", "getDouble", etc, would
  be called directly. This fixes an issue where a column would be read as
  all-zeroes when its SQL type is numeric and its physical type is string, which
  can happen when evolving a column's type from string to number.

-  When an array type is desired, the underlying selector is wrapped,
   "getObject" is called, and the result is coerced to Object[]. This coercion
   replaces some earlier logic from #15917.
2024-09-05 17:20:00 -07:00
Edgar Melendrez c49dc83b22
[docs] batch 12: reduction functions (#16930)
* [docs] batch 12: reduction functions

* Update docs/querying/sql-functions.md

* Update docs/querying/sql-functions.md

* Update docs/querying/sql-functions.md

* applying suggestions

* Apply suggestions from code review

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

---------

Co-authored-by: Benedict Jin <asdf2014@apache.org>
Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>
2024-09-05 17:02:45 -07:00
Vadim Ogievetsky dc5c55a836
Web console: better tooltip when no size is available (#17008)
* better tooltip when no size is available

* better labels for columns

* fix label in segments view
2024-09-05 13:51:03 -07:00
Kashif Faraz ba6f804f48
Fix compaction status API response (#17006)
Description:
#16768 introduces new compaction APIs on the Overlord `/compact/status` and `/compact/progress`.
But the corresponding `OverlordClient` methods do not return an object compatible with the actual
endpoints defined in `OverlordCompactionResource`.

This patch ensures that the objects are compatible.

Changes:
- Add `CompactionStatusResponse` and `CompactionProgressResponse`
- Use these as the return type in `OverlordClient` methods and as the response entity in `OverlordCompactionResource`
- Add `SupervisorCleanupModule` bound on the Coordinator to perform cleanup of supervisors.
Without this module, Coordinator cannot deserialize compaction supervisors.
2024-09-05 23:22:01 +05:30
Jill Osborne b4d83a86c2
Middle Manager wording update in docs (#17005) 2024-09-05 10:25:30 -07:00
Rishabh Singh 40f38f0191
Remove migrated deep storage standard ITs (#16933) 2024-09-05 16:07:33 +05:30
Rishabh Singh 18a9a7570a
Log a small subset of segments to refresh for debugging Coordinator refresh logic (#16998)
* Log a small number of segments to refresh per datasource in the Coordinator

* review comments

* Update log message
2024-09-05 11:00:25 +05:30
Rishabh Singh 39161b0b23
Use vault.centos.org to build Hadoop docker image (#16999)
The Dockerfile for building hadoop image is broken due to Centos 7 EOL.
Fixed it as per https://serverfault.com/a/1161847.
2024-09-05 10:36:55 +05:30
Rishabh Singh 4e02e5b856
Remove alert for pre-existing new columns while merging realtime schema (#16989)
Currently, an alert is thrown while merging datasource schema with realtime
segment schema when the datasource schema already has update columns from the delta schema.

This isn't an error condition since the datasource schema can have those columns from a different segment.

One scenario in which this can occur is when multiple replicas for a task is run.
2024-09-05 07:58:24 +05:30
Hugh Evans 9162339fa8
Replace dsql instructions in example (#16977) 2024-09-04 12:45:58 -07:00
AmatyaAvadhanula bfbd21bce0
Revert "Add integration tests for concurrent append and replace (#16755)" (#17000)
This reverts commit 70bad948e3.
2024-09-04 23:36:49 +05:30
Katya Macedo 03c37b3143
Fix spelling (#17001) 2024-09-04 13:33:17 -04:00
Laksh Singla b698440bfe
suppress cve (#16997) 2024-09-04 19:37:23 +05:30
Vishesh Garg e28424ea25
Enable rollup on multi-value dimensions for compaction with MSQ engine (#16937)
Currently compaction with MSQ engine doesn't work for rollup on multi-value dimensions (MVDs), the reason being the default behaviour of grouping on MVD dimensions to unnest the dimension values; for instance grouping on `[s1,s2]` with aggregate `a` will result in two rows: `<s1,a>` and `<s2,a>`. 

This change enables rollup on MVDs (without unnest) by converting MVDs to Arrays before rollup using virtual columns, and then converting them back to MVDs using post aggregators. If segment schema is available to the compaction task (when it ends up downloading segments to get existing dimensions/metrics/granularity), it selectively does the MVD-Array conversion only for known multi-valued columns; else it conservatively performs this conversion for all `string` columns.
2024-09-04 16:28:04 +05:30
Gian Merlino 76b8c20f4d
Create fewer temporary maps when querying sys.segments. (#16981)
Eliminates two map creations (availableSegmentMetadata, partialSegmentDataMap).
The segmentsAlreadySeen set remains.
2024-09-03 20:04:44 -07:00
Clint Wylie 57bf053dc9
remove compiler warnings about unqualified calls to yield() (#16995) 2024-09-03 20:04:30 -07:00
Gian Merlino 57c4b552d9
Fix logical merge conflict in SuperSorterTest. (#16993)
Logical merge conflict between #16911 and #16914.
2024-09-03 16:14:59 -04:00
Hardik Bajaj 2ef936be40
Update Documentation on meregeBuffer/pendingRequests for Real-time nodes (#16992)
#15025 adds mergeBuffer/pendingRequests metric in QueryCountStatsMonitor. Since real-time nodes also use the same merge buffers for queries and have QueryCountStatsMonitor , the documentation is being updated to include this metric.
2024-09-04 00:25:09 +05:30
Gian Merlino 786c959e9e
MSQ: Add limitHint to global-sort shuffles. (#16911)
* MSQ: Add limitHint to global-sort shuffles.

This allows pushing down limits into the SuperSorter.

* Test fixes.

* Add limitSpec to ScanQueryKit. Fix SuperSorter tracking.
2024-09-03 09:05:29 -07:00
AmatyaAvadhanula 70bad948e3
Add integration tests for concurrent append and replace (#16755)
IT for streaming tasks with concurrent compaction
2024-09-03 14:58:15 +05:30
Sree Charan Manamala 619d8ef964
Window Functions : Numeric Arrays Frame Column Writers - fix class cast exception (#16983)
Fix ClassCastException in ArrayFrameCoulmnWriters
2024-09-03 11:44:52 +05:30
Akshat Jain ca17beb77b
Fix issues with window functions wrt virtual columns when using ARRAY_CONCAT_AGG aggregator (#16971)
* Fix issues with window functions wrt virtual columns when using ARRAY_CONCAT_AGG aggregator

* Address review comment

* Address review comment
2024-09-03 11:41:40 +05:30
Kashif Faraz a7349442e4
Fix computed value of maxSegmentsToMove when coordinator period is very low (#16984)
Bug: When coordinator period is less than 30s, `maxSegmentsToMove` is always computed as 0
irrespective of number of available threads.

Changes:
- Fix lower bound condition and set minimum value to 100.
- Add new test which fails without this fix
2024-09-02 14:35:29 +05:30
Zoltan Haindrich 32e8e074ae
Planning could have failed if UNION ALL operator was completely removed (#16946) 2024-09-02 04:37:10 -04:00
Jonathan Wei 088c33759d
Allow druid.azure.account to be nullable (#16960) 2024-09-02 12:05:51 +05:30
Kashif Faraz fe3d589ff9
Run compaction as a supervisor on Overlord (#16768)
Description
-----------
Auto-compaction currently poses several challenges as it:
1. may get stuck on a failing interval.
2. may get stuck on the latest interval if more data keeps coming into it.
3. always picks the latest interval regardless of the level of compaction in it.
4. may never pick a datasource if its intervals are not very recent.
5. requires setting an explicit period which does not cater to the changing needs of a Druid cluster.

This PR introduces various improvements to compaction scheduling to tackle the above problems.

Change Summary
--------------
1. Run compaction for a datasource as a supervisor of type `autocompact` on Overlord.
2. Make compaction policy extensible and configurable.
3. Track status of recently submitted compaction tasks and pass this info to policy.
4. Add `/simulate` API on both Coordinator and Overlord to run compaction simulations.
5. Redirect compaction status APIs to the Overlord when compaction supervisors are enabled.
2024-09-02 07:53:13 +05:30
Parag Jain 6eb42e8d5a
fix extraction of timeseries results from result level cache (#16895)
* fix extraction of timeseries results from result level cache

* remove unneded import

* add test
2024-09-01 00:25:55 +05:30
Virushade 0217c8c541
Change Inspection Profile to set "Method is identical to its super method" as error (#16976)
* Make IntelliJ's MethodIsIdenticalToSuperMethod an error

* Change codebase to follow new IntelliJ inspection

* Restore non-short-circuit boolean expressions to pass tests
2024-08-31 09:37:34 +05:30
Gian Merlino caf8ce3e0b
MSQ: Add CPU and thread usage counters. (#16914)
* MSQ: Add CPU and thread usage counters.

The main change adds "cpu" and "wall" counters. The "cpu" counter measures
CPU time (using JvmUtils.getCurrentThreadCpuTime) taken up by processors
in processing threads. The "wall" counter measures the amount of wall time
taken up by processors in those same processing threads. Both counters are
broken down by type of processor.

This patch also includes changes to support adding new counters. Due to an
oversight in the original design, older deserializers are not forwards-compatible;
they throw errors when encountering an unknown counter type. To manage this,
the following changes are made:

1) The defaultImpl NilQueryCounterSnapshot is added to QueryCounterSnapshot's
   deserialization configuration. This means that any unrecognized counter types
   will be read as "nil" by deserializers. Going forward, once all servers are
   on the latest code, this is enough to enable easily adding new counters.

2) A new context parameter "includeAllCounters" is added, which defaults to "false".
   When this parameter is set "false", only legacy counters are included. When set
   to "true", all counters are included. This is currently undocumented. In a future
   version, we should set the default to "true", and at that time, include a release
   note that people updating from versions prior to Druid 31 should set this to
   "false" until their upgrade is complete.

* Style, coverage.

* Fix.
2024-08-30 20:02:30 -07:00
Kashif Faraz d5b64ba2e3
Improve exception handling in extension druid-pac4j (#16979)
Changes:
- Simplify exception handling in `CryptoService` by just catching a `Exception`
- Throw a `DruidException` as the exception is user facing
- Log the exception for easier debugging
- Add a test to verify thrown exception
2024-08-30 12:32:49 +05:30
Vadim Ogievetsky 358d06abc1
Web console: expose forceSegmentSortByTime (#16967)
* no force time

* time UI

* update menus

* tweaks

* dont use bp5

* nicer values

* update snapshots

* similar engine lables

* update snaps
2024-08-29 09:58:15 -07:00
317brian 1d292c5a59
docs: fix the case of the s3 examples for msq (#16969) 2024-08-28 11:13:15 -07:00
Akshat Jain fbd305af0f
MSQ WF: Batch multiple PARTITION BY keys for processing (#16823)
Currently, if we have a query with window function having PARTITION BY xyz, and we have a million unique values for xyz each having 1 row, we'd end up creating a million individual RACs for processing, each having a single row. This is unnecessary, and we can batch the PARTITION BY keys together for processing, and process them only when we can't batch further rows to adhere to maxRowsMaterialized config.

The previous iteration of this PR was simplifying WindowOperatorQueryFrameProcessor to run all operators on all the rows instead of creating smaller RACs per partition by key. That approach was discarded in favor of the batching approach, and the details are summarized here: #16823 (comment).
2024-08-28 11:32:47 +05:30
Virushade 862ccda59b
Disable automatic search refresh every for keystroke (#16963)
* Disable refreshing page everytime keystroke is detected

* Lengthen input debounce time to 1s

* Run Prettier to pass stylecheck
2024-08-27 20:28:43 -07:00
Charles Smith e562dd3ac6
Docs: note on iceberg (#16955)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2024-08-27 14:27:23 -07:00
Jill Osborne 3e031b9dc2
Add dynamic query params example (#16964)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2024-08-27 14:27:13 -07:00
Vadim Ogievetsky 21dcf804eb
Web console: add ability to issue auxiliary queries to speed up data views (#16952)
* Add ability to issue auxiliary queries

* readonly supervisor

* return

* update snapshot

* fix classes
2024-08-27 13:38:30 -07:00
Pranav 0caf383102
Fix buffer capacity race condition in spatial (#16931) 2024-08-27 00:36:29 -07:00
Clint Wylie f8301a314f
generic block compressed complex columns (#16863)
changes:
* Adds new `CompressedComplexColumn`, `CompressedComplexColumnSerializer`, `CompressedComplexColumnSupplier` based on `CompressedVariableSizedBlobColumn` used by JSON columns
* Adds `IndexSpec.complexMetricCompression` which can be used to specify compression for the generic compressed complex column. Defaults to uncompressed because compressed columns are not backwards compatible.
* Adds new definition of `ComplexMetricSerde.getSerializer` which accepts an `IndexSpec` argument when creating a serializer. The old signature has been marked `@Deprecated` and has a default implementation that returns `null`, but it will be used by the default implementation of the new version if it is implemented to return a non-null value. The default implementation of the new method will use a `CompressedComplexColumnSerializer` if `IndexSpec.complexMetricCompression` is not null/none/uncompressed, or will use `LargeColumnSupportedComplexColumnSerializer` otherwise.
* Removed all duplicate generic implementations of `ComplexMetricSerde.getSerializer` and `ComplexMetricSerde.deserializeColumn` into default implementations `ComplexMetricSerde` instead of being copied all over the place. The default implementation of `deserializeColumn` will check if the first byte indicates that the new compression was used, otherwise will use the `GenericIndexed` based supplier.
* Complex columns with custom serializers/deserializers are unaffected and may continue doing whatever it is they do, either with specialized compression or whatever else, this new stuff is just to provide generic implementations built around `ObjectStrategy`.
* add ObjectStrategy.readRetainsBufferReference so CompressedComplexColumn only copies on read if required
* add copyValueOnRead flag down to CompressedBlockReader to avoid buffer duplicate if the value needs copied anyway
2024-08-27 00:34:41 -07:00
Gian Merlino ed3dbd6242
MSQ: Fix validation of time position in collations. (#16961)
* MSQ: Fix validation of time position in collations.

It is possible for the collation to refer to a field that isn't mapped,
such as when the DML includes "CLUSTERED BY some_function(some_field)".
In this case, the collation refers to a projected column that is not
part of the field mappings. Prior to this patch, that would lead to an
out of bounds list access on fieldMappings.

This patch fixes the problem by identifying the position of __time in
the fieldMappings first, rather than retrieving each collation field
from fieldMappings.

Fixes a bug introduced in #16849.

* Fix test. Better warning message.
2024-08-27 00:02:32 -07:00
Adarsh Sanjeev 3b88b57d70
Add configurable final stages to MSQ ingestion queries (#16699)
* Add a segmentMorphFactory to MSQ.

* Add test

* Make argument nullable

* Fix Guice issues

* Merge with master

* Remove extra information

* Fix tests

* Create a utils class

* Refactor segment generation

* Fix javadoc

* Refactor

* Refactor

* Fix injection
2024-08-27 11:35:48 +05:30
Adarsh Sanjeev 16517e348e
Suppress CVE (#16851) 2024-08-27 10:47:57 +05:30
Gian Merlino 5d2ed33b89
Place __time in signatures according to sort order. (#16958)
* Place __time in signatures according to sort order.

Updates a variety of places to put __time in row signatures according
to its position in the sort order, rather than always first, including:

- InputSourceSampler.
- ScanQueryEngine (in the default signature when "columns" is empty).
- Various StorageAdapters, which also have the effect of reordering
  the column order in segmentMetadata queries, and therefore in SQL
  schemas as well.

Follow-up to #16849.

* Fix compilation.

* Additional fixes.

* Fix.

* Fix style.

* Omit nonexistent columns from the row signature.

* Fix tests.
2024-08-26 21:45:51 -07:00
George Shiqi Wu 7ee7e194c4
Add supervisor log when task count is greater than partitions (#16948)
* Add log message when task count is higher than partitions

* newline

* fix ordering

* Add supervisor id

* Update indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisor.java

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Update indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisor.java

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Update indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisor.java

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

---------

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2024-08-26 07:40:02 -07:00
Akshat Jain 72f8e79a42
Use multiple workers in MSQ WF drill test suite (#16949) 2024-08-26 11:34:40 +05:30
317brian 418da92228
docs: update query from deepstorage segment requirement (#16842)
Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>
Co-authored-by: Rishabh Singh <6513075+findingrish@users.noreply.github.com>
Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>
2024-08-23 11:59:29 -07:00
Hugh Evans 60d4317968
Linked back to query granularity docs (#16883)
* Linked back to query granularity docs

* Update ingestion-spec.md

clairfy about query granularities in the spec.

* Update docs/design/storage.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/ingestion/ingestion-spec.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/granularities.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Apply suggestions from code review

---------

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2024-08-23 08:44:19 -07:00
Gian Merlino 0603d5153d
Segments sorted by non-time columns. (#16849)
* Segments primarily sorted by non-time columns.

Currently, segments are always sorted by __time, followed by the sort
order provided by the user via dimensionsSpec or CLUSTERED BY. Sorting
by __time enables efficient execution of queries involving time-ordering
or granularity. Time-ordering is a simple matter of reading the rows in
stored order, and granular cursors can be generated in streaming fashion.

However, for various workloads, it's better for storage footprint and
query performance to sort by arbitrary orders that do not start with __time.
With this patch, users can sort segments by such orders.

For spec-based ingestion, users add "useExplicitSegmentSortOrder: true" to
dimensionsSpec. The "dimensions" list determines the sort order. To
define a sort order that includes "__time", users explicitly
include a dimension named "__time".

For SQL-based ingestion, users set the context parameter
"useExplicitSegmentSortOrder: true". The CLUSTERED BY clause is then
used as the explicit segment sort order.

In both cases, when the new "useExplicitSegmentSortOrder" parameter is
false (the default), __time is implicitly prepended to the sort order,
as it always was prior to this patch.

The new parameter is experimental for two main reasons. First, such
segments can cause errors when loaded by older servers, due to violating
their expectations that timestamps are always monotonically increasing.
Second, even on newer servers, not all queries can run on non-time-sorted
segments. Scan queries involving time-ordering and any query involving
granularity will not run. (To partially mitigate this, a currently-undocumented
SQL feature "sqlUseGranularity" is provided. When set to false the SQL planner
avoids using "granularity".)

Changes on the write path:

1) DimensionsSpec can now optionally contain a __time dimension, which
   controls the placement of __time in the sort order. If not present,
   __time is considered to be first in the sort order, as it has always
   been.

2) IncrementalIndex and IndexMerger are updated to sort facts more
   flexibly; not always by time first.

3) Metadata (stored in metadata.drd) gains a "sortOrder" field.

4) MSQ can generate range-based shard specs even when not all columns are
   singly-valued strings. It merely stops accepting new clustering key
   fields when it encounters the first one that isn't a singly-valued
   string. This is useful because it enables range shard specs on
   "someDim" to be created for clauses like "CLUSTERED BY someDim, __time".

Changes on the read path:

1) Add StorageAdapter#getSortOrder so query engines can tell how a
   segment is sorted.

2) Update QueryableIndexStorageAdapter, IncrementalIndexStorageAdapter,
   and VectorCursorGranularizer to throw errors when using granularities
   on non-time-ordered segments.

3) Update ScanQueryEngine to throw an error when using the time-ordering
  "order" parameter on non-time-ordered segments.

4) Update TimeBoundaryQueryRunnerFactory to perform a segment scan when
   running on a non-time-ordered segment.

5) Add "sqlUseGranularity" context parameter that causes the SQL planner
   to avoid using granularities other than ALL.

Other changes:

1) Rename DimensionsSpec "hasCustomDimensions" to "hasFixedDimensions"
   and change the meaning subtly: it now returns true if the DimensionsSpec
   represents an unchanging list of dimensions, or false if there is
   some discovery happening. This is what call sites had expected anyway.

* Fixups from CI.

* Fixes.

* Fix missing arg.

* Additional changes.

* Fix logic.

* Fixes.

* Fix test.

* Adjust test.

* Remove throws.

* Fix styles.

* Fix javadocs.

* Cleanup.

* Smoother handling of null ordering.

* Fix tests.

* Missed a spot on the merge.

* Fixups.

* Avoid needless Filters.and.

* Add timeBoundaryInspector to test.

* Fix tests.

* Fix FrameStorageAdapterTest.

* Fix various tests.

* Use forceSegmentSortByTime instead of useExplicitSegmentSortOrder.

* Pom fix.

* Fix doc.
2024-08-23 08:24:43 -07:00