Commit Graph

3156 Commits

Author SHA1 Message Date
Clint Wylie 45c020060c
better javadoc for ColumnIndexSupplier (#16663)
Updated javadoc for `ColumnIndexSupplier.as` to elaborate on the types of indexes callers might want to ask for from the method, as well as help implementors know what kinds of indexes they should implement to participate in filtering
2024-06-27 17:53:20 -07:00
Clint Wylie d86f25c74a
fix vector grouping expression deferred evaluation to only consider dictionary encoded strings as fixed width (#16666) 2024-06-27 16:19:16 -07:00
Gian Merlino dbed1b0f50
Defer more expressions in vectorized groupBy. (#16338)
* Defer more expressions in vectorized groupBy.

This patch adds a way for columns to provide GroupByVectorColumnSelectors,
which controls how the groupBy engine operates on them. This mechanism is used
by ExpressionVirtualColumn to provide an ExpressionDeferredGroupByVectorColumnSelector
that uses the inputs of an expression as the grouping key. The actual expression
evaluation is deferred until the grouped ResultRow is created.

A new context parameter "deferExpressionDimensions" allows users to control when
this deferred selector is used. The default is "fixedWidthNonNumeric", which is a
behavioral change from the prior behavior. Users can get the prior behavior by setting
this to "singleString".

* Fix style.

* Add deferExpressionDimensions to SqlExpressionBenchmark.

* Fix style.

* Fix inspections.

* Add more testing.

* Use valueOrDefault.

* Compute exprKeyBytes a bit lighter-weight.
2024-06-26 17:28:36 -07:00
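The deferred evaluation described in #16338 above can be illustrated with a minimal sketch. It is not Druid's implementation: `DeferredExpressionGrouping` and `countByExpression` are hypothetical names, the "expression" is a plain `Function`, and the aggregation is a simple row count. The sketch groups on the raw expression input first and applies the expression only when result rows are built, so the expression runs once per distinct input rather than once per row.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; none of these names exist in Druid.
final class DeferredExpressionGrouping
{
  private DeferredExpressionGrouping() {}

  // Counts rows per expression output while evaluating the expression
  // once per distinct input value instead of once per row.
  static Map<String, Long> countByExpression(List<String> inputs, Function<String, String> expr)
  {
    // Phase 1: group on the raw expression input (the "grouping key").
    Map<String, Long> countsByInput = new LinkedHashMap<>();
    for (String input : inputs) {
      countsByInput.merge(input, 1L, Long::sum);
    }

    // Phase 2: evaluate the expression while building result rows, merging
    // groups whose distinct inputs happen to map to the same output.
    Map<String, Long> countsByOutput = new LinkedHashMap<>();
    for (Map.Entry<String, Long> entry : countsByInput.entrySet()) {
      countsByOutput.merge(expr.apply(entry.getKey()), entry.getValue(), Long::sum);
    }
    return countsByOutput;
  }
}
```

With inputs `a, A, b, a` and an upper-casing expression, phase 1 produces three groups and the expression is applied three times instead of four. How Druid merges groups that collapse to the same output is not stated in the commit message, so the merge in phase 2 is an assumption of this sketch.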
Clint Wylie d4f2636325
fix greatest/least function non-vectorized processing to ignore null argument types (#16649) 2024-06-26 12:59:42 -07:00
Laksh Singla 71b3b5ab5d
Add query context parameter to remove null bytes when writing frames (#16579)
MSQ cannot process null bytes in string fields, and the current workaround is to remove them using the REPLACE function. This PR adds a 'removeNullBytes' context parameter, which sanitizes input string fields by removing these null bytes.
2024-06-26 15:00:30 +05:30
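As a rough picture of what the sanitization in #16579 above amounts to, the sketch below strips NUL characters from a string. `NullByteSanitizer` is a hypothetical helper written for this illustration; it is not the code Druid uses when writing frames.

```java
// Hypothetical helper; Druid's actual frame-writing code is not shown here.
final class NullByteSanitizer
{
  private NullByteSanitizer() {}

  // Returns the input with every NUL character (U+0000) removed.
  // A null input is passed through unchanged.
  static String removeNullBytes(String value)
  {
    return value == null ? null : value.replace("\u0000", "");
  }
}
```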
Kashif Faraz d9bd02256a
Refactor: Rename UsedSegmentChecker and cleanup task actions (#16644)
Changes:
- Rename `UsedSegmentChecker` to `PublishedSegmentsRetriever`
- Remove deprecated single `Interval` argument from `RetrieveUsedSegmentsAction`
  as it is now unused and has been deprecated since #1988
- Return `Set` of segments instead of a `Collection` from `IndexerMetadataStorageCoordinator.retrieveUsedSegments()`
2024-06-26 10:48:59 +05:30
Tom 52c9929019
Column name in parse exceptions (#16529)
* first pass

* more changes

* fix tests and formatting

* fix kinesis failing tests

* fix kafka tests

* add dimension name to float parse errors

* double and convertToType handling of dimensionName can report parse errors with dimension name

* fix checkstyle issue

* fix tests

* more cases to have better parse exception messages

* fix test

* fix tests

* partially address comments

* annotate method parameter with nullable

* address comments

* fix tests

* let float, double, long dimensionIndexer pass dimensionName down to dimensionHandlerUtils

* fix compilation error and clean up formatting

* clean up whitespace

* address feedback. undo change, pass down report parse exception for convertToType

* fix test
2024-06-25 13:42:52 -07:00
Clint Wylie 37a50e6803
Remove index_realtime and index_realtime_appenderator tasks (#16602)
index_realtime tasks were removed from the documentation in #13107. Even
at that time, they weren't really documented per se, just mentioned. They
existed solely to support Tranquility, which is an obsolete ingestion
method that predates migration of Druid to ASF and is no longer being
maintained. Tranquility docs were also de-linked from the sidebars and
the other doc pages in #11134. Only a stub remains, so people with
links to the page can see that it's no longer recommended.

index_realtime_appenderator tasks existed in the code base, but were
never documented, nor as far as I am aware were they used for any purpose.

This patch removes both task types completely, as well as removes all
supporting code that was otherwise unused. It also updates the stub
doc for Tranquility to be firmer that it is not compatible. (Previously,
the stub doc said it wasn't recommended, and pointed out that it is
built against an ancient 0.9.2 version of Druid.)

ITUnionQueryTest has been migrated to the new integration tests framework and updated to use Kafka ingestion.

Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
2024-06-24 20:13:33 -07:00
Abhishek Radhakrishnan 7463589b07
Support for bootstrap segments (#16609)
* Initial support for bootstrap segments.

  - Adds a new API in the coordinator.
  - All processes that have storage locations configured (including tasks)
    talk to the coordinator if they can, and fetch bootstrap segments from it.
  - Then load the segments onto the segment cache as part of startup.
  - This addresses the segment bootstrapping logic required by processes before
    they can start serving queries or ingesting.

    This patch also lays the foundation to speed up upgrades.

* Fail open by default if there are any errors talking to the coordinator.

* Add test for failure scenario and cleanup logs.

* Cleanup and add debug log

* Assert the events so we know the list exactly.

* Revert RunRules test.

The rules aren't evaluated if there are no clusters.

* Revert RunRulesTest too.

* Remove debug info.

* Make the API POST and update log.

* Fix up UTs.

* Throw 503 from MetadataResource; clean up exception handling and DruidException.

* Remove unused logger, add verification of metrics and docs.

* Update error message

* Update server/src/main/java/org/apache/druid/server/coordination/SegmentLoadDropHandler.java

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Apply suggestions from code review

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Adjust test metric expectations with the rename.

* Add BootstrapSegmentResponse container in the response for future extensibility.

* Rename to BootstrapSegmentsInfo for internal consistency.

* Remove unused log.

* Use a member variable for broadcast segments instead of segmentAssigner.

* Minor cleanup

* Add test for loadable bootstrap segments and clarify comment.

* Review suggestions.

---------

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2024-06-24 09:27:17 -07:00
Sree Charan Manamala 990fd5f5fb
Make use of group iterator for all window frames & support for same bound kinds (#16603)
Fixes apache/druid#15739
2024-06-24 15:52:41 +02:00
Laksh Singla 00c96432af
Materialize scan results correctly when columns are not present in the segments (#16619)
Fixes a bug causing maxSubqueryBytes not to work when segments have missing columns.
2024-06-23 23:15:45 +05:30
Akshat Jain cd438b1918
Emit metrics for S3UploadThreadPool (#16616)
* Emit metrics for S3UploadThreadPool

* Address review comments

* Revert unnecessary formatting change

* Revert unnecessary formatting change in metrics.md file

* Address review comments

* Add metric for task duration

* Minor fix in metrics.md

* Add s3Key and uploadId in the log message

* Address review comments

* Create new instance of ServiceMetricEvent.Builder for thread safety

* Address review comments

* Address review comments
2024-06-21 11:36:47 +05:30
Adithya Chakilam 35709de549
CgroupCpuSetMonitor: Initialize the cgroup discoverer (#16621) 2024-06-20 10:23:59 -07:00
Abhishek Radhakrishnan b20c3dbadf
Fix malformed period throwing `ADMIN` persona error (#16626)
* Turn invalid periods into user-facing exception providing more context.

The current exception is targeting the ADMIN persona. Catch that and turn
it into a USER persona instead. Also, provide more context in the error
message.

* Review comment: pass the wrapping expression and stringify.

* Update processing/src/main/java/org/apache/druid/query/expression/ExprUtils.java

Co-authored-by: Clint Wylie <cjwylie@gmail.com>

---------

Co-authored-by: Clint Wylie <cjwylie@gmail.com>
2024-06-20 08:40:28 -07:00
Sree Charan Manamala 7ac0862287
Grouping Engine fix when a limit spec with different order by columns is applied (#16534) 2024-06-20 11:35:58 +02:00
Sam Rash a10310388f
Add Conditional Helpers to DruidException / InvalidInput (#16470)
Adds versions of

DruidException.defensive(String, Object...)
InvalidInput.exception(String, Object...)
InvalidInput.exception(Throwable, String, Object...)

that take a boolean as the first argument and only create and throw
an exception if it is false. They can be used similarly to
Preconditions.checkState/checkArgument.
2024-06-18 14:05:43 +05:30
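The conditional-helper pattern from #16470 above can be sketched as follows. This is a stand-in, not Druid's API: `ConditionalChecks`, `conditionalInvalidInput`, and `parsePositive` are hypothetical names, and the sketch throws a plain `IllegalArgumentException` formatted with `String.format` rather than a `DruidException`.

```java
// Hypothetical stand-ins; these are not Druid's DruidException or InvalidInput classes.
final class ConditionalChecks
{
  private ConditionalChecks() {}

  // Builds and throws the exception only when the condition is false,
  // in the style of Preconditions.checkArgument.
  static void conditionalInvalidInput(boolean condition, String format, Object... args)
  {
    if (!condition) {
      throw new IllegalArgumentException(String.format(format, args));
    }
  }

  // Example caller: the check reads as a single line instead of an if/throw block.
  static int parsePositive(String text)
  {
    int value = Integer.parseInt(text);
    conditionalInvalidInput(value > 0, "Expected a positive number but got [%d]", value);
    return value;
  }
}
```

The appeal of this shape is that the happy path allocates no exception object, and the call site stays compact.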
Virushade eb842d3dda
Remove redundant check on optional in BlockingQueueFrameChannel.Writable#isClosed (#16595)
* Remove redundant check on optional in BlockingQueueFrameChannel.Writable#isClosed

* Rollback mistake
2024-06-14 15:21:07 +05:30
Laksh Singla da1e293a57
Deserialize dimensions in group by queries to their respective types when reading from their serialized format (#16511)
* init

* tests, pair groupable

* framework change

* tests

* update benchmarks

* comments

* add javadoc for the jsonMapper

* remove extra deserialization

* add special serde for map based result rows

* revert unnecessary change

---------

Co-authored-by: asdf2014 <asdf2014@apache.org>
2024-06-14 16:27:47 +08:00
Zoltan Haindrich ac19b148c2
Upgrade calcite to 1.37.0 (#16504)
* contains Make a full copy of the parser and apply our modifications to it #16503
* some minor api changes pair/entry
* some unnecessary aggregation was removed from a set of queries in `CalciteSubqueryTest`
* `AliasedOperatorConversion` was detecting `CHAR_LENGTH` as not a function ; I've removed the check
  * the field it was using doesn't look maintained that much
  * the `kind` is passed for the created `SqlFunction` so I don't think this check is actually needed
* some decoupled test cases are now broken - will be fixed later
* some aggregate related changes: due to the fact that SUM() and COUNT() of no inputs are different
* upgrade avatica to 1.25.0
* `CalciteQueryTest#testExactCountDistinctWithFilter` is now executable

Close apache/druid#16503
2024-06-13 08:47:50 +02:00
Clint Wylie fee509df2e
fix NestedDataColumnIndexerV4 to not report cardinality (#16507)
* fix NestedDataColumnIndexerV4 to not report cardinality
changes:
* fix issue similar to #16489 but for NestedDataColumnIndexerV4, which can report STRING type if it only processes a single type of values. this should be less common than the auto indexer problem
* fix some issues with sql benchmarks
2024-06-11 20:58:12 -07:00
Clint Wylie 3fb6ba22e8
fix expression column capabilities to not report dictionary encoded unless input is string (#16577) 2024-06-08 13:05:19 -07:00
Akshat Jain 03a38be446
Optimize S3 storage writing for MSQ durable storage (#16481)
* Optimise S3 storage writing for MSQ durable storage

* Get rid of static ConcurrentHashMap

* Fix static checks

* Fix tests

* Remove unused constructor parameter chunkValidation + relevant cleanup

* Assert etags as String instead of Integer

* Fix flaky test

* Inject executor service

* Make threadpool size dynamic based on number of cores

* Fix S3StorageDruidModuleTest

* Fix S3StorageConnectorProviderTest

* Fix injection issues

* Add S3UploadConfig to manage maximum number of concurrent chunks dynamically based on chunk size

* Address the minor review comments

* Refactor S3UploadConfig + ExecutorService into S3UploadManager

* Address review comments

* Make updateChunkSizeIfGreater() synchronized instead of recomputeMaxConcurrentNumChunks()

* Address the minor review comments

* Fix intellij-inspections check

* Refactor code to use futures for maxNumConcurrentChunks. Also use executor service with blocking queue for backpressure semantics.

* Update javadoc

* Get rid of cyclic dependency injection between S3UploadManager and S3OutputConfig

* Fix RetryableS3OutputStreamTest

* Remove unnecessary synchronization parts from RetryableS3OutputStream

* Update javadoc

* Add S3UploadManagerTest

* Revert back to S3StorageConnectorProvider extends S3OutputConfig

* Address Karan's review comments

* Address Kashif's review comments

* Change a log message to debug

* Address review comments

* Fix intellij-inspections check

* Fix checkstyle

---------

Co-authored-by: asdf2014 <asdf2014@apache.org>
2024-06-07 11:33:16 +05:30
Gian Merlino 277006446d
Fallback vectorization for FunctionExpr and BaseMacroFunctionExpr. (#16366)
* Fallback vectorization for FunctionExpr and BaseMacroFunctionExpr.

This patch adds FallbackVectorProcessor, a processor that adapts non-vectorizable
operations into vectorizable ones. It is used in FunctionExpr and BaseMacroFunctionExpr.

In addition:

- Identifiers are updated to offer getObjectVector for ARRAY and COMPLEX in addition
  to STRING. ExprEvalObjectVector is updated to offer ARRAY and COMPLEX as well.

- In SQL tests, cannotVectorize now fails tests if an exception is not thrown. This makes
  it easier to identify tests that can now vectorize.

- Fix a null-matcher bug in StringObjectVectorValueMatcher.

* Fix tests.

* Fixes.

* Fix tests.

* Fix test.

* Fix test.
2024-06-05 20:03:02 -07:00
Gian Merlino b837ce565b
Simplify serialized form of JsonInputFormat. (#15691)
* Simplify serialized form of JsonInputFormat.

Use JsonInclude for keepNullColumns, assumeNewlineDelimited, and
useJsonNodeReader. Because the default value of keepNullColumns is
variable, we store the original configured value rather than the
derived value, and include if the original value is nonnull.

* Fix test.
2024-06-05 20:01:14 -07:00
Gian Merlino 1040a29bc5
Fix capabilities reported by UnnestStorageAdapter. (#16551)
UnnestStorageAdapter and its cursors did not return capabilities correctly
for the output column. This patch fixes two problems:

1) UnnestStorageAdapter returned the capabilities of the unnest virtual
   column prior to unnesting. It should return the post-unnest capabilities.

2) UnnestColumnValueSelectorCursor passed through isDictionaryEncoded from
   the unnest virtual column. This is incorrect, because the dimension selector
   created by this class never has a dictionary. This is the cause of #16543.
2024-06-05 15:19:42 -07:00
Akshat Jain 6d7d2ffa63
Add interface method for returning canonical lookup name (#16557)
* Add interface method for returning canonical lookup name

* Address review comment

* Add test in LookupReferencesManagerTest for coverage check

* Add test in LookupSerdeModuleTest for coverage check
2024-06-05 14:33:18 -07:00
Abhishek Radhakrishnan b9ba286423
Fix task bootstrapping & simplify segment load/drop flows (#16475)
* Fix task bootstrap locations.

* Remove dependency of SegmentCacheManager from SegmentLoadDropHandler.

- The load drop handler code talks to the local cache manager via
SegmentManager.

* Clean up unused imports and stuff.

* Test fixes.

* Intellij inspections and test bind.

* Clean up dependencies some more

* Extract test load spec and factory to its own class.

* Cleanup test util

* Pull SegmentForTesting out to TestSegmentUtils.

* Fix up.

* Minor changes to infoDir

* Replace server announcer mock and verify that.

* Add tests.

* Update javadocs.

* Address review comments.

* Separate methods for download and bootstrap load

* Clean up return types and exception handling.

* No callback for loadSegment().

* Minor cleanup

* Pull out the test helpers into its own static class so it can have better state control.

* LocalCacheManager stuff

* Fix build.

* Fix build.

* Address some CI warnings.

* Minor updates to javadocs and test code.

* Address some CodeQL test warnings and checkstyle fix.

* Pass a Consumer<DataSegment> instead of boolean & rename variables.

* Small updates

* Remove one test constructor.

* Remove the other constructor that wasn't initializing fully and update usages.

* Cleanup withInfoDir() builder and unnecessary test hooks.

* Remove mocks and elaborate on comments.

* Commentary

* Fix a few Intellij inspection warnings.

* Suppress corePoolSize intellij-inspect warning.

The intellij-inspect tool doesn't seem to correctly inspect
lambda usages. See ScheduledExecutors.

* Update docs and add more tests.

* Use hamcrest for asserting order on expectation.

* Shutdown bootstrap exec.

* Fix checkstyle
2024-06-04 10:44:46 -07:00
Adithya Chakilam a9044ac235
Add cgroup cpu/mem/disk usage metrics (#16472)
* Add cgroup cpu/mem usage metrics

* checks

* comments

* docs fix

* add disk metrics

* fapi check

* checkstyle

* issues

* spelling

* change asserts

* checks

* use proc builder instead of runtime

* specify charset

* spotbug
2024-05-29 12:44:37 -07:00
Adarsh Sanjeev 21f725f33e
Add octet streaming of sketches in MSQ (#16269)
Using Jackson serialization to send datasketches between the controller and workers in MSQ has a few issues. It caused a blowup because multiple copies of the sketch were held at once.

This PR aims to resolve this by switching to deserializing the sketch payload without Jackson.

The PR adds a new query parameter, "sketchEncoding", used in communication between controller and worker while fetching sketches.

- If the value of this parameter is OCTET, the sketch is returned in a binary encoding produced by ClusterByStatisticsSnapshotSerde.
- Otherwise, the sketch is encoded by Jackson as before.
2024-05-28 18:12:38 +05:30
Kashif Faraz 9d77ef04f4
Cleanup usages of stopwatch (#16478)
Changes:
- Remove synchronized methods from `Stopwatch`
- Access stopwatch methods in `ChangeRequestHttpSyncer` inside a lock
2024-05-27 23:08:46 +05:30
Clint Wylie 4e1de50e30
fix issue with auto column grouping (#16489)
* fix issue with auto column grouping
changes:
* fixes bug where AutoTypeColumnIndexer reports incorrect cardinality, allowing it to incorrectly use array grouper algorithm for realtime queries producing incorrect results for strings
* fixes bug where auto LONG and DOUBLE type columns incorrectly report not having null values, resulting in incorrect null handling when grouping

* fix test
2024-05-27 11:18:17 +05:30
zachjsh b0cc1ee84b
Add ability to turn off Druid Catalog specific validation done on catalog defined tables in Druid (#16465)
* add property to enable / disable catalog validation and add tests

* add integration tests for catalog validation disabled

* add integration tests

* remove debugging logs

* fix forbidden api call
2024-05-23 13:19:51 -04:00
Pranav 204a25d3e6
Moving object contains to Bound for string/object matchers (#16241) 2024-05-23 16:56:04 +02:00
Gian Merlino eb410f712d
Use typecasting comparator for numeric "any" aggregations. (#16494)
This brings them in line with the behavior of other numeric aggregations.
It is important because otherwise ClassCastExceptions can arise if comparing
different numeric types that may arise from deserialization.
2024-05-22 12:38:51 -07:00
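The failure mode that #16494 above guards against is easy to reproduce outside Druid. In the sketch below, `TypecastingComparators`, `LONG_CASTING`, and `naiveCompare` are hypothetical names; the point is only that comparing boxed numbers of different classes through `Comparable` throws, while casting both sides to a common primitive type first does not.

```java
import java.util.Comparator;

// Hypothetical illustration; not the comparator Druid actually uses.
final class TypecastingComparators
{
  private TypecastingComparators() {}

  // Compares any two Numbers as longs, whatever their boxed type.
  static final Comparator<Object> LONG_CASTING =
      (a, b) -> Long.compare(((Number) a).longValue(), ((Number) b).longValue());

  // Naive comparison through Comparable. Integer.compareTo casts its argument
  // to Integer, so passing a Long raises ClassCastException.
  @SuppressWarnings({"unchecked", "rawtypes"})
  static int naiveCompare(Object a, Object b)
  {
    return ((Comparable) a).compareTo(b);
  }
}
```

Mixed boxed types are plausible after deserialization because a JSON number may come back as `Integer` in one place and `Long` in another.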
Gian Merlino 0fb09445a5
Fix ExpressionPredicateIndexSupplier numeric replace-with-default behavior. (#16448)
* Fix ExpressionPredicateIndexSupplier numeric replace-with-default behavior.

In replace-with-default mode, null numeric values from the index should be
interpreted as zeroes by expressions. This makes the index supplier more
consistent with the behavior of the selectors created by the expression
virtual column.

* Fix test case.
2024-05-15 15:11:47 +05:30
Gian Merlino 72432c2e78
Speed up SQL IN using SCALAR_IN_ARRAY. (#16388)
* Speed up SQL IN using SCALAR_IN_ARRAY.

Main changes:

1) DruidSqlValidator now includes a rewrite of IN to SCALAR_IN_ARRAY, when the size of
   the IN is above inFunctionThreshold. The default value of inFunctionThreshold
   is 100. Users can restore the prior behavior by setting it to Integer.MAX_VALUE.

2) SearchOperatorConversion now generates SCALAR_IN_ARRAY when converting to a regular
   expression, when the size of the SEARCH is above inFunctionExprThreshold. The default
   value of inFunctionExprThreshold is 2. Users can restore the prior behavior by setting
   it to Integer.MAX_VALUE.

3) ReverseLookupRule generates SCALAR_IN_ARRAY if the set of reverse-looked-up values is
   greater than inFunctionThreshold.

* Revert test.

* Additional coverage.

* Update docs/querying/sql-query-context.md

Co-authored-by: Benedict Jin <asdf2014@apache.org>

* New test.

---------

Co-authored-by: Benedict Jin <asdf2014@apache.org>
2024-05-14 08:09:27 -07:00
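The threshold behavior described in #16388 above can be pictured with a text-level sketch. It makes several assumptions: `InRewriteSketch` and `renderMembership` are hypothetical, "above inFunctionThreshold" is read as strictly greater than, and the rewrite is shown on SQL text for readability, whereas the real validator presumably works on the parsed query.

```java
import java.util.List;

// Hypothetical illustration of the IN -> SCALAR_IN_ARRAY decision; not Druid code.
final class InRewriteSketch
{
  private InRewriteSketch() {}

  // Renders a membership test for already-formatted SQL literals.
  // Assumption: the rewrite applies when the list size is strictly
  // greater than the threshold.
  static String renderMembership(String column, List<String> literals, int inFunctionThreshold)
  {
    String joined = String.join(", ", literals);
    if (literals.size() > inFunctionThreshold) {
      return "SCALAR_IN_ARRAY(" + column + ", ARRAY[" + joined + "])";
    }
    return column + " IN (" + joined + ")";
  }
}
```

Passing `Integer.MAX_VALUE` as the threshold keeps every list in plain `IN` form, which mirrors the commit's note on restoring the prior behavior.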
Sree Charan Manamala b8dd7478d0
Custom Calcite Rule to remove redundant references (#16402)
Custom Calcite rule mimicking AggregateProjectMergeRule, extended to support expressions.
The current Calcite rule returns null in such cases.
In addition, this removes the redundant references.
2024-05-14 06:38:05 +02:00
Laksh Singla 4bfc186153
Support sorting on complex columns in MSQ (#16322)
MSQ sorts the columns in a highly specialized manner by byte comparisons. As such the values are serialized differently. This works well for the primitive types and primitive arrays, however complex types cannot be serialized specially.

This PR adds the support for sorting the complex columns by deserializing the value from the field and comparing it via the type strategy. This is a lot slower than the byte comparisons, however, it's the only way to support sorting on complex columns that can have arbitrary serialization not optimized for MSQ.

The primitives and the arrays are still compared via the byte comparison, therefore this doesn't affect the performance of the queries supported before the patch. If there's a sorting key with mixed complex and primitive/primitive array types, for example: longCol1 ASC, longCol2 ASC, complexCol1 DESC, complexCol2 DESC, stringCol1 DESC, longCol3 DESC, longCol4 ASC, the comparison will happen like:

    longCol1, longCol2 (ASC) - Compared together via byte-comparison, since both are byte comparable and need to be sorted in ascending order
    complexCol1 (DESC) - Compared via deserialization, cannot be clubbed with any other field
    complexCol2 (DESC) - Compared via deserialization, cannot be clubbed with any other field, even though the prior field was a complex column with the same order
    stringCol1, longCol3 (DESC) - Compared together via byte-comparison, since both are byte comparable and need to be sorted in descending order
    longCol4 (ASC) - Compared via byte-comparison, couldn't be coalesced with the previous fields as the direction was different

This way, we deserialize a field only where required.
2024-05-13 15:07:05 +05:30
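The grouping of sort-key fields described in #16322 above can be sketched as a single pass over the key. `SortKeyRuns`, `Field`, and `group` are hypothetical names, and the rule encoded here is only what the commit message states: adjacent byte-comparable fields with the same direction share a run, while each complex field stands alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of how sort-key fields could be split into runs.
final class SortKeyRuns
{
  private SortKeyRuns() {}

  static final class Field
  {
    final String name;
    final boolean byteComparable; // false for complex types
    final boolean ascending;

    Field(String name, boolean byteComparable, boolean ascending)
    {
      this.name = name;
      this.byteComparable = byteComparable;
      this.ascending = ascending;
    }
  }

  // A field joins the previous run only if both it and the previous field are
  // byte-comparable and sort in the same direction; otherwise it starts a new run.
  static List<List<String>> group(List<Field> key)
  {
    List<List<String>> runs = new ArrayList<>();
    Field previous = null;
    for (Field field : key) {
      boolean extend = previous != null
                       && previous.byteComparable
                       && field.byteComparable
                       && previous.ascending == field.ascending;
      if (!extend) {
        runs.add(new ArrayList<>());
      }
      runs.get(runs.size() - 1).add(field.name);
      previous = field;
    }
    return runs;
  }
}
```

Applied to the example key in the commit message, this yields five runs: `longCol1, longCol2`; `complexCol1`; `complexCol2`; `stringCol1, longCol3`; and `longCol4`.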
Igor Berman d0f3fdab37
Allow using different lock types for kill task, remove markAsUnused parameter (#16362)
Changes:
- Remove deprecated `markAsUnused` parameter from `KillUnusedSegmentsTask`
- Allow `kill` task to use `REPLACE` lock when `useConcurrentLocks` is true
- Use `EXCLUSIVE` lock by default
2024-05-10 06:37:36 +05:30
Laksh Singla dded473ac0
Fix another deadlock which can occur while acquiring merge buffers (#16372)
Fixes a deadlock while acquiring merge buffers
2024-05-08 14:33:15 +05:30
Adarsh Sanjeev f82cc34e5b
Maintain a connection while exporting results with MSQ (#16381)
* Maintain a connection while exporting results with MSQ

* Fix checkstyle

* Fix checkstyle

* Move initialization from constructor

* Add null check

* Address review comments
2024-05-08 11:34:20 +05:30
Alberic Liu 92fb0ff718
upgrade mysql:mysql-connector-java to 8.2.0 (#16024)
* upgrade mysql:mysql-connector-java to 8.2.0

* fix the check errors

* remove unused comment
2024-05-06 21:58:37 +08:00
Pranav b713a517f1
Fix the bug in Immutable RTree object strategy (#16389)
* Fix the bug in Immutable Node object strategy

* Adding comments in code
2024-05-06 14:37:29 +05:30
Gian Merlino 1b107ff695
QueryableIndex: Close columns after failed vector cursor setup. (#16365)
* QueryableIndex: Close columns after failed vector cursor setup.

If anything fails while setting up a vector cursor, the prior code in
QueryableIndex would not close its ColumnCache and would therefore leak
columns. Columns often contain references to buffers that must be closed.

* Fix style.
2024-05-03 12:58:40 -07:00
Rishabh Singh c61c3785a0
Followup changes to 15817 (Segment schema publishing and polling) (#16368)
* Fix build

* Nit changes in KillUnreferencedSegmentSchema

* Replace reference to the abbreviation SMQ with Metadata Query, rename inTransit maps in schema cache

* nitpicks

* Remove reference to smq abbreviation from integration-tests

* Remove reference to smq abbreviation from integration-tests

* minor change

* Update index.md

* Add delimiter while computing schema fingerprint hash
2024-05-03 19:13:52 +05:30
Gian Merlino 5d1950d451
MSQ controller: Support in-memory shuffles; towards JVM reuse. (#16168)
* MSQ controller: Support in-memory shuffles; towards JVM reuse.

This patch contains two controller changes that make progress towards a
lower-latency MSQ.

First, support for in-memory shuffles. The main feature of in-memory shuffles,
as far as the controller is concerned, is that they are not fully buffered. That
means that whenever a producer stage uses in-memory output, its consumer must run
concurrently. The controller determines which stages run concurrently, and when
they start and stop.

"Leapfrogging" allows any chain of sort-based stages to use in-memory shuffles
even if we can only run two stages at once. For example, in a linear chain of
stages 0 -> 1 -> 2 where all do sort-based shuffles, we can use in-memory shuffling
for each one while only running two at once. (When stage 1 is done reading input
and about to start writing its output, we can stop 0 and start 2.)

1) New OutputChannelMode enum attached to WorkOrders that tells workers
   whether stage output should be in memory (MEMORY), or use local or durable
   storage.

2) New logic in the ControllerQueryKernel to determine which stages can use
   in-memory shuffling (ControllerUtils#computeStageGroups) and to launch them
   at the appropriate time (ControllerQueryKernel#createNewKernels).

3) New "doneReadingInput" method on Controller (passed down to the stage kernels)
   which allows stages to transition to POST_READING even if they are not
   gathering statistics. This is important because it enables "leapfrogging"
   for HASH_LOCAL_SORT shuffles, and for GLOBAL_SORT shuffles with 1 partition.

4) Moved result-reading from ControllerContext#writeReports to new QueryListener
   interface, which ControllerImpl feeds results to row-by-row while the query
   is still running. Important so we can read query results from the final
   stage using an in-memory channel.

5) New class ControllerQueryKernelConfig holds configs that control kernel
   behavior (such as whether to pipeline, maximum number of concurrent stages,
   etc). Generated by the ControllerContext.

Second, a refactor towards running workers in persistent JVMs that are able to
cache data across queries. This is helpful because I believe we'll want to reuse
JVMs and cached data for latency reasons.

1) Move creation of WorkerManager and TableInputSpecSlicer to the
   ControllerContext, rather than ControllerImpl. This allows managing workers and
   work assignment differently when JVMs are reusable.

2) Lift the Controller Jersey resource out from ControllerChatHandler to a
   reusable resource.

3) Move memory introspection to a MemoryIntrospector interface, and introduce
   ControllerMemoryParameters that uses it. This makes it easier to run MSQ in
   process types other than Indexer and Peon.

Both of these areas will have follow-ups that make similar changes on the
worker side.

* Address static checks.

* Address static checks.

* Fixes.

* Report writer tests.

* Adjustments.

* Fix reports.

* Review updates.

* Adjust name.

* Small changes.
2024-04-30 21:30:27 -07:00
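The "leapfrogging" idea in #16168 above can be pictured as a window of two stages sliding along a linear chain. The sketch below is a deliberate simplification under stated assumptions: `LeapfrogSketch` and `schedule` are hypothetical, the chain is strictly linear, exactly two stages may run at once, and none of the real controller's state transitions (such as POST_READING) are modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; the real logic lives in the MSQ controller kernel.
final class LeapfrogSketch
{
  private LeapfrogSketch() {}

  // For a linear chain 0 -> 1 -> ... -> n-1, returns the successive pairs of
  // stages that run concurrently when at most two stages may run at once.
  static List<List<Integer>> schedule(int numStages)
  {
    if (numStages < 1) {
      throw new IllegalArgumentException("numStages must be at least 1");
    }
    List<List<Integer>> steps = new ArrayList<>();
    int window = Math.min(numStages, 2);
    for (int first = 0; first + window <= numStages; first++) {
      List<Integer> running = new ArrayList<>();
      for (int stage = first; stage < first + window; stage++) {
        running.add(stage);
      }
      steps.add(running);
    }
    return steps;
  }
}
```

For three stages this gives `[0, 1]` followed by `[1, 2]`, matching the commit's example of stopping stage 0 and starting stage 2 once stage 1 is done reading its input.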
Laksh Singla e695e52d3f
Improve code flow in the First/Last vector aggregators and unify the numeric aggregators with the String implementations (#16230)
This PR fixes the first and last vector aggregators and improves their readability. The following changes are introduced:

    The folding is broken in the vectorized versions. We consider time before checking the folded object.

    If the numerical aggregator gets passed any other object type for some other reason (like String), then the aggregator considers it to be folded, even though it shouldn’t be. We should convert these objects to the desired type, and aggregate them properly.

    The aggregators must properly use generics. This would minimize the ClassCastException issues that can happen with mixed segment types. We are unifying the string first/last aggregators with numeric versions as well.

    The aggregators must aggregate null values (https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/query/aggregation/first/StringFirstLastUtils.java#L55-L56 ). The aggregator should only ignore pairs with time == null, and not value == null

    Time nullity is ignored when trying to vectorize the data.

    String versions are initialized with DateTimes.MIN, which is equal to Long.MIN / 2. This can cause incorrect results if the user supplies a custom time column. NOTE: This is still present because fixing it would require a larger refactor in all of the versions.

    There is a difference in what users might expect from the results because the code flow is changed (for example, the direction of the for loops, etc), however, this will only change the results, and not the contract set by first/last aggregators, which is that if multiple values have the same timestamp, then any of them can get picked.

    If the column is non-existent, the users might expect a change in the timestamp from DateTime.MAX to Long.MAX, because the code incorrectly used DateTime.MAX to initialize the aggregator, however, in case of a custom timestamp column, this might not be the case. The SQL query might be prohibited from using any Long since it requires a cast to the timestamp function that can fail, but AFAICT native queries don't have such limitations.
2024-04-30 15:13:14 +05:30
Laksh Singla 26d63e7b65
Prevent joining on nested arrays and complex types (#16349)
#16068 modified DimensionHandlerUtils to accept complex types as dimensions. This had the unintended side effect of allowing complex types to be joined upon, which was never guarded against explicitly and does not work.
This PR modifies IndexedTable to reject building an index on complex types, which prevents joining on them. The PR adds the check back in the same place, explicitly.
2024-04-30 11:36:53 +05:30
Adarsh Sanjeev fb63520de9
Add tests for ProcessorManager (#16327)
* Add tests for ProcessorManager
2024-04-30 09:35:26 +05:30
Gian Merlino db82adcdfd
SCALAR_IN_ARRAY: Optimization and behavioral follow-ups. (#16311)
* Four changes to scalar_in_array as follow-ups to #16306:

1) Align behavior for `null` scalars to the behavior of the native `in` and `inType` filters: return `true` if the array itself contains null, else return `null`.

2) Rename the class to more closely match the function name.

3) Add a specialization for constant arrays, where we build a `HashSet`.

4) Use `castForEqualityComparison` to properly handle cross-type comparisons.
   Additional tests verify comparisons between LONG and DOUBLE are now
   handled properly.

* Fix spelling.

* Adjustments from review.
2024-04-26 16:01:17 -07:00
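Points 1 and 3 of #16311 above (null-scalar behavior and the `HashSet` specialization for constant arrays) can be sketched together. `ScalarInArraySketch` and `matches` are hypothetical names, a Java `null` return stands in for SQL NULL, and the cross-type handling from point 4 is intentionally left out, so an `Integer` will not match a `Long` here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration; not Druid's scalar_in_array implementation.
final class ScalarInArraySketch
{
  private final Set<Object> values;

  // The constant array is converted to a HashSet once, so each lookup
  // is a hash probe rather than a scan of the array.
  ScalarInArraySketch(Object... constantArray)
  {
    this.values = new HashSet<>(Arrays.asList(constantArray));
  }

  // Returns TRUE or FALSE, or null to represent SQL NULL.
  // For a null scalar: TRUE if the array itself contains null, else null.
  Boolean matches(Object scalar)
  {
    if (scalar == null) {
      return values.contains(null) ? Boolean.TRUE : null;
    }
    return values.contains(scalar);
  }
}
```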