Commit Graph

3110 Commits

Author SHA1 Message Date
Laksh Singla 114380749d
MSQ: Improve the parse exception errors and the handling of null UTF characters in Strings in Frames (#14398) 2023-06-26 18:14:29 +05:30
Laksh Singla 1647d5f4a0
Limit the subquery results by memory usage (#13952)
Users can now add a guardrail that prevents a subquery's results from exceeding a set number of bytes by setting druid.server.http.maxSubqueryBytes in the Broker's config or maxSubqueryBytes in the query context. This feature is experimental for now and falls back to row-based limiting if it cannot accurately determine the size of the results consumed by the query.
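The guardrail and its fallback can be sketched as follows. This is a minimal illustration, not Druid's implementation: the function, the exception, and the `estimate_size` callback are hypothetical names, and the rule that row limiting only takes over once a size cannot be estimated is an assumption made for this sketch.

```python
from typing import Callable, Iterable, List, Optional


class SubqueryLimitExceeded(Exception):
    """Raised when materialized subquery results exceed the guardrail."""


def materialize_subquery(
    rows: Iterable[tuple],
    max_rows: int,
    max_bytes: Optional[int],
    estimate_size: Callable[[tuple], Optional[int]],
) -> List[tuple]:
    """Collect subquery rows, enforcing a byte limit while sizes are known.

    estimate_size returns None when a row's size cannot be determined; from
    that point on, only the row-based limit is enforced (hypothetical rule).
    """
    results: List[tuple] = []
    used_bytes = 0
    byte_limiting = max_bytes is not None

    for row in rows:
        results.append(row)
        if byte_limiting:
            size = estimate_size(row)
            if size is None:
                byte_limiting = False  # fall back to counting rows
            else:
                used_bytes += size
                if used_bytes > max_bytes:
                    raise SubqueryLimitExceeded("byte limit exceeded")
        if not byte_limiting and len(results) > max_rows:
            raise SubqueryLimitExceeded("row limit exceeded")
    return results
```

With a byte limit of 25 and rows estimated at 10 bytes each, the third row trips the guardrail; if the estimator returns None, the same call is bounded by `max_rows` instead.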
2023-06-26 18:12:28 +05:30
Gian Merlino 970288067a
Fix flaky HttpEmitterConfigTest and ParametrizedUriEmitterConfigTest. (#14481)
Recently, we have seen flakiness in these two tests, apparently due to
computations based on Runtime.getRuntime().maxMemory() differing during
static initialization and in the actual tests. I can't think of a reason
why this would be happening, but anyway, this patch switches the tests to
use the statics instead of recomputing Runtime.getRuntime().maxMemory().
2023-06-23 16:27:11 -07:00
imply-cheddar 7e2cf35d7b
Fix compatibility issue with SqlTaskResource (#14466)
* Fix compatibility issue with SqlTaskResource

The DruidException changes broke the response format
for errors coming back from the SqlTaskResource, so fix those
2023-06-23 01:15:32 -07:00
Clint Wylie 31b9d5695d
Extend InitializedNullHandlingTest instead of NullHandlingTest (#14467)
NullHandlingTest is an actual test, it shouldn't be used as a base class
2023-06-22 15:01:50 +05:30
Hardik Bajaj 1ea9158a50
Added new SysMonitorOshi v0 using Oshi library (#14359)
Added a new monitor, SysMonitorOshi, to replace SysMonitor. The new monitor has wider support for different machine architectures, including ARM instances. Please switch to SysMonitorOshi, as SysMonitor is now deprecated and will be removed in future releases.
2023-06-20 20:57:58 +05:30
Kashif Faraz 50461c3bd5
Enable smartSegmentLoading on the Coordinator (#13197)
This commit does a complete revamp of the coordinator to address problem areas:
- Stability: Fix several bugs, add capabilities to prioritize and cancel load queue items
- Visibility: Add new metrics, improve logs, revamp `CoordinatorRunStats`
- Configuration: Add dynamic config `smartSegmentLoading` to automatically set
optimal values for all segment loading configs such as `maxSegmentsToMove`,
`replicationThrottleLimit` and `maxSegmentsInNodeLoadingQueue`.

Changed classes:
- Add `StrategicSegmentAssigner` to make assignment decisions for load, replicate and move
- Add `SegmentAction` to distinguish between load, replicate, drop and move operations
- Add `SegmentReplicationStatus` to capture current state of replication of all used segments
- Add `SegmentLoadingConfig` to contain recomputed dynamic config values
- Simplify classes `LoadRule`, `BroadcastRule`
- Simplify the `BalancerStrategy` and `CostBalancerStrategy`
- Add several new methods to `ServerHolder` to track loaded and queued segments
- Refactor `DruidCoordinator`

Impact:
- Enable `smartSegmentLoading` by default. With this enabled, none of the following
dynamic configs need to be set: `maxSegmentsToMove`, `replicationThrottleLimit`,
`maxSegmentsInNodeLoadingQueue`, `useRoundRobinSegmentAssignment`,
`emitBalancingStats` and `replicantLifetime`.
- Coordinator reports richer metrics and produces cleaner and more informative logs
- Coordinator uses an unlimited load queue for all servers, and makes better assignment decisions
2023-06-19 14:27:35 +05:30
imply-cheddar cfd07a95b7
Errors take 3 (#14004)
Introduce DruidException, an exception whose goal in life is to be delivered to a user.

DruidException itself has javadoc on it to describe how it should be used.  This commit both introduces the Exception and adjusts some of the places that are generating exceptions to generate DruidException objects instead, as a way to show how the Exception should be used.

This work was a 3rd iteration on top of work that was started by Paul Rogers.  I don't know if his name will survive the squash-and-merge, so I'm calling it out here and thanking him for starting on this.
2023-06-19 01:11:13 -07:00
Adarsh Sanjeev 128133fadc
Add replication_factor column to sys.segments table (#14403)
Description:
Druid allows a configuration of load rules that may cause a used segment to not be loaded
on any historical. This status is not tracked in the sys.segments table on the broker, which
makes it difficult to determine if the unavailability of a segment is expected and if we should
not wait for it to be loaded on a server after ingestion has finished.

Changes:
- Track replication factor in `SegmentReplicantLookup` during evaluation of load rules
- Update API `/druid/coordinator/v1/metadata/segments` to return replication factor
- Add column `replication_factor` to the sys.segments virtual table and populate it in
`MetadataSegmentView`
- If this column is 0, the segment is not assigned to any historical and will not be loaded.
2023-06-18 10:02:21 +05:30
George Shiqi Wu 64af9bfe5b
Add groupId to metrics (#14402)
* Add group id as a dimension

* Revert changes

* Add to forking task runner

* Add missing metrics

* Fix indenting

* revert metrics

* Fix indentation
2023-06-16 09:28:16 -07:00
Clint Wylie 359bd63cc9
allow expression "best effort" type determination to better handle mixed type arrays (#14438) 2023-06-16 00:02:43 -07:00
Clint Wylie ff5ae4db6c
fix kafka input format reader schema discovery and partial schema discovery (#14421)
* fix kafka input format reader schema discovery and partial schema discovery to actually work right, by re-using dimension filtering logic of MapInputRowParser
2023-06-15 00:11:04 -07:00
Clint Wylie ca116cf886
adjust broker parallel merge to help managed blocking be more well behaved (#14427) 2023-06-15 00:10:31 -07:00
Pranav e426d370ea
Start with solo accumulator and empty partition (#14426)
* Starting parallel merge with solo accumulator and empty partitions

* shut down pool in test
2023-06-14 16:20:48 -07:00
Clint Wylie 8454cc619a
auto columns fixes (#14422)
changes:
* auto columns no longer participate in generic 'null column' handling, this was a mistake to try to support and caused ingestion failures due to mismatched ColumnFormat, and will be replaced in the future with nested common format constant column functionality (not in this PR)
* fix bugs with auto columns which contain empty objects, empty arrays, or primitive types mixed with either of these empty constructs
* fix bug with bound filter when upper is null equivalent but is strict
2023-06-14 08:57:06 -07:00
Abhishek Radhakrishnan be5a6593a9
Reset `RuntimeInfo` to fix flaky test `ParametrizedUriEmitterConfigTest`. (#14405)
* Add injector so JVM settings are correctly set up and bound for the test.

* Add VisibleForTesting IDE annotation.

* spacing
2023-06-13 18:07:51 -07:00
Clint Wylie 61120dc49a
fix Kafka input format to throw ParseException if timestamp is missing (#14413) 2023-06-13 09:00:11 -07:00
Adarsh Sanjeev 267cbac6ff
Add logs for deleting files using storage connector (#14350)
* Add logs for deleting files using storage connector

* Address review comments

* Update log message format
2023-06-11 21:24:30 +05:30
Kashif Faraz 6e158704cb
Do not retry INSERT task into metadata if max_allowed_packet limit is violated (#14271)
Changes
- Add a `DruidException` which contains a user-facing error message and an HTTP response code
- Make `EntryExistsException` extend `DruidException`
- If metadata store max_allowed_packet limit is violated while inserting a new task, throw
`DruidException` with response code 400 (bad request) to prevent retries
- Add `SQLMetadataConnector.isRootCausePacketTooBigException` with impl for MySQL
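The idea of using a 400 response code to stop retries can be sketched as below. This is only an illustration of the pattern: `UserFacingError` and `insert_with_retries` are hypothetical stand-ins, not Druid classes, and treating `ConnectionError` as the transient failure is an assumption of the sketch.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class UserFacingError(Exception):
    """Hypothetical stand-in for an exception carrying an HTTP response code."""

    def __init__(self, message: str, response_code: int) -> None:
        super().__init__(message)
        self.response_code = response_code


def insert_with_retries(insert: Callable[[], T], max_tries: int = 3) -> T:
    """Retry transient failures, but give up immediately on a 400."""
    assert max_tries >= 1
    last_error: Optional[Exception] = None
    for _ in range(max_tries):
        try:
            return insert()
        except UserFacingError as e:
            if e.response_code == 400:
                raise  # bad request: retrying the same payload cannot succeed
            last_error = e
        except ConnectionError as e:
            last_error = e  # assumed transient, so try again
    raise last_error
```

A payload that is too large for the metadata store fails the same way on every attempt, so surfacing it once as a bad request avoids wasted retries.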
2023-06-10 12:15:44 +05:30
imply-cheddar 87149d5975
Remove AbstractIndex (#14388)
The class apparently only exists to add a toString()
method to Indexes, which basically just crashes any debugger
on any meaningfully sized index.  It's a pointless
abstract class that basically only causes pain.
2023-06-08 19:52:16 -07:00
Harini Rajendran 4ff6026d30
Adding SegmentMetadataEvent and publishing them via KafkaEmitter (#14281)
This PR enhances KafkaEmitter to emit metadata about published segments (SegmentMetadataEvent) into a Kafka topic. Downstream services can use this segment metadata to query Druid intelligently based on the segments published. The segment metadata is published to the Kafka topic as a JSON string, like other events.
2023-06-02 21:28:26 +05:30
zachjsh e75fb8e8e3
Account for data format and compression in MSQ auto taskAssignment (#14307)
### Description

This change takes the input format and compression into account when splitting input files among the available tasks in MSQ ingestion, based on the value of the `maxInputBytesPerWorker` query context parameter. This parameter lets users control the maximum number of bytes, at the granularity of an input file or object, that each ingestion task is assigned to ingest. With this change, the parameter denotes the estimated weighted size in bytes of the input to split on, accounting for input format and compression format, rather than the actual file size reported by the file system. Uncompressed newline-delimited JSON is the baseline, with a scale factor of `1`: when computing the byte weight that a file contributes to input splitting, an uncompressed JSON file counts at its actual size, 1:1. Testing found that gzip-compressed JSON and Parquet have scale factors of `4` and `8` respectively, meaning that each byte of data is weighted 4x and 8x respectively when computing input splits. This weighted byte scaling currently applies only to MSQ ingestion that uses either LocalInputSource or CloudObjectInputSource. The default value of the `maxInputBytesPerWorker` query context parameter has been changed from 10 GiB to 512 MiB.
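The weighting arithmetic can be illustrated with a short sketch. The scale factors (1, 4, 8) and the 512 MiB default come from the description above; the format labels, function names, and the greedy in-order packing are assumptions made for the example and may not match the actual slicer logic.

```python
from typing import Dict, List, Tuple

MIB = 1024 * 1024

# Scale factors quoted in the commit message; the keys are hypothetical
# labels chosen for this sketch.
SCALE_FACTORS: Dict[str, int] = {
    "json": 1,      # uncompressed newline-delimited JSON (baseline)
    "json.gz": 4,   # gzip-compressed JSON
    "parquet": 8,   # Parquet
}


def weighted_size(size_bytes: int, kind: str) -> int:
    """Estimated weighted size: file size multiplied by its scale factor."""
    return size_bytes * SCALE_FACTORS[kind]


def assign_files(
    files: List[Tuple[str, int, str]],
    max_input_bytes_per_worker: int = 512 * MIB,
) -> List[List[str]]:
    """Greedily pack files, in order, into groups within the weighted limit.

    A file whose weighted size alone exceeds the limit gets its own group,
    since the granularity is a whole file.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    current_weight = 0
    for name, size, kind in files:
        weight = weighted_size(size, kind)
        if current and current_weight + weight > max_input_bytes_per_worker:
            groups.append(current)
            current, current_weight = [], 0
        current.append(name)
        current_weight += weight
    if current:
        groups.append(current)
    return groups
```

For example, a 100 MiB gzip-compressed JSON file weighs 400 MiB and a 32 MiB Parquet file weighs 256 MiB, so the two cannot share a worker under the 512 MiB default even though their on-disk sizes total only 132 MiB.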
2023-06-01 12:53:49 -07:00
Clint Wylie 4096f51f0b
add configurable ColumnTypeMergePolicy to SegmentMetadataCache (#14319)
This PR adds a new interface that controls how SegmentMetadataCache chooses a ColumnType for computed SQL schemas when segments disagree, exposed as druid.sql.planner.metadataColumnTypeMergePolicy. It also adds a new 'least restrictive type' mode, which chooses the type that data across all segments can best be coerced into, and makes this mode the default.

This is a behavior change around when segment driven schema migrations take effect for the SQL schema. With latestInterval, the SQL schema will be updated as soon as the first job with the new schema has published segments, while using leastRestrictive, the schema will only be updated once all segments are reindexed to the new type. The benefit of leastRestrictive is that it eliminates a bunch of type coercion errors that can happen in SQL when types are varied across segments with latestInterval because the newest type is not able to correctly represent older data, such as if the segments have a mix of ARRAY and number types, or any other combinations that lead to odd query plans.
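The difference between the two policies can be shown with a toy model. The widening order below (LONG, then DOUBLE, then STRING) is an assumption for illustration only; Druid's real coercion rules cover more types, including arrays and complex types, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Toy widening order, an assumption for this sketch; not Druid's actual rules.
WIDENING_RANK: Dict[str, int] = {"LONG": 0, "DOUBLE": 1, "STRING": 2}

# Each entry is (segment interval start as an ISO date, column type).
SegmentType = Tuple[str, str]


def merge_latest_interval(segments: List[SegmentType]) -> str:
    """latestInterval-style policy: the newest segment's type wins."""
    return max(segments, key=lambda s: s[0])[1]


def merge_least_restrictive(segments: List[SegmentType]) -> str:
    """leastRestrictive-style policy: pick the widest type seen anywhere."""
    return max((t for _, t in segments), key=lambda t: WIDENING_RANK[t])
```

With an old STRING segment and a new LONG segment, the latest-interval policy reports LONG right away, while the least-restrictive policy keeps reporting STRING until every segment has been reindexed as LONG.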
2023-05-24 20:32:51 +05:30
Soumyava 22ba457d29
Expr getCacheKey now delegates to children (#14287)
* Expr getCacheKey now delegates to children

* Removed the LOOKUP_EXPR_CACHE_KEY as we do not need it

* Adding a unit test

* Update processing/src/main/java/org/apache/druid/math/expr/Expr.java

Co-authored-by: Clint Wylie <cjwylie@gmail.com>
2023-05-23 14:49:38 -07:00
Abhishek Radhakrishnan a5e04d95a4
Add `TYPE_NAME` to the complex serde classes and replace the hardcoded names. (#14317)
* Add TYPE_NAME to the serde classes and reuse them instead of hardcoded strings.

* Static check fixes.
2023-05-23 00:54:47 -05:00
Clint Wylie d92b9fbfac
more resilient segment metadata, dont parallel merge internal segment metadata queries (#14296) 2023-05-17 04:12:55 -07:00
Clint Wylie b038a11280
fix issues with handling arrays with all null elements and arrays of booleans in strict mode (#14297) 2023-05-17 01:33:44 -07:00
Soumyava 96a3c00754
Fixing an issue with filtering on a single dimension by converting In… (#14277)
* Fixing an issue with filtering on a single dimension by converting In filter to a selector filter as needed with Filters.toFilter

* Adding a test so that any future refactoring does not break this behavior

* Made comment a bit more meaningful
2023-05-15 20:10:36 -07:00
imply-cheddar f9861808bc
Be able to load segments on Peons (#14239)
* Be able to load segments on Peons

This change introduces a new config on WorkerConfig
that indicates how many bytes of each storage
location to use for storage of a task.  Said config
is divided up amongst the locations and slots
and then used to set TaskConfig.tmpStorageBytesPerTask

The Peons use their local task dir and
tmpStorageBytesPerTask as their StorageLocations for
the SegmentManager such that they can accept broadcast
segments.
2023-05-12 16:51:00 -07:00
Kashif Faraz ba11b3d462
Refactor: Add OverlordDuty to replace OverlordHelper and align with CoordinatorDuty (#14235)
Changes:
- Replace `OverlordHelper` with `OverlordDuty` to align with `CoordinatorDuty`
  - Each duty has a `run()` method and defines a `Schedule` with an initial delay and period.
  - Update existing duties `TaskLogAutoCleaner` and `DurableStorageCleaner`
- Add utility class `Configs`
- Update log, error messages and javadocs
- Other minor style improvements
2023-05-12 22:39:56 +05:30
Clint Wylie 9875090bee
fix segment metadata queries for auto ingested columns that had all null values (#14262) 2023-05-11 20:58:06 -07:00
Soumyava f128b9b666
Updates to filter processing for inner query in Joins (#14237) 2023-05-11 17:21:41 +05:30
Clint Wylie a58cebe491
add array_to_mv function to convert arrays into mvds to assist with migration from mvds to arrays (#14236) 2023-05-11 04:43:28 -07:00
Kashif Faraz 64e6283eca
Do not allow retention rules to be null (#14223)
Changes:
- Do not allow retention rules for any datasource or cluster to be null
- Allow empty rules at the datasource level but not at the cluster level
- Add validation to ensure that `druid.manager.rules.defaultRule` is always set correctly
- Minor style refactors
2023-05-11 14:33:56 +05:30
Clint Wylie aaaff74740
fix npe regression in json_value when filtering non-existent paths (#14250)
* fix npe regression in json_value when filtering non-existent paths

* more coverage
2023-05-10 22:39:22 -07:00
Clint Wylie 6db11bfc60
suppress some cves and fix javadoc build when using java 17 (#14241) 2023-05-10 15:47:10 -07:00
Clint Wylie 8805d8d7db
fix issues with filtering nulls on values coerced to numeric types (#14139)
* fix issues with filtering nulls on values coerced to numeric types
* fix issues with 'auto' type numeric columns in default value mode
* optimize variant typed columns without nested data
* more tests for 'auto' type column ingestion
2023-05-08 13:19:02 -07:00
Clint Wylie a7a4bfd331
modify QueryScheduler to lazily acquire lanes when executing queries to avoid leaks (#14184)
This PR fixes an issue that could occur if druid.query.scheduler.numThreads is configured and any exception occurs after QueryScheduler.run has been called to create a Sequence. This would result in total and/or lane-specific locks being acquired, but because the sequence was not actually being evaluated, the "baggage" which typically releases these locks was not being executed. An example of how this can happen is if a group-by having filter, which wraps and transforms this sequence, happens to explode while wrapping the sequence. The end result is that the locks are acquired but never released, eventually halting the ability to execute any queries.
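The lazy-acquisition idea can be sketched with a generator: the permit is taken only when iteration begins and is released in a `finally` block, so a sequence that is created but never evaluated holds nothing. `LaneLimiter` and `run_lazily` are hypothetical names for this illustration and do not mirror QueryScheduler's API.

```python
import threading
from typing import Callable, Iterable, Iterator


class LaneLimiter:
    """Hypothetical capacity limiter backed by a semaphore."""

    def __init__(self, capacity: int) -> None:
        self._semaphore = threading.Semaphore(capacity)
        self.in_use = 0  # bookkeeping for the example; not thread-safe

    def acquire(self) -> None:
        if not self._semaphore.acquire(blocking=False):
            raise RuntimeError("no capacity available")
        self.in_use += 1

    def release(self) -> None:
        self.in_use -= 1
        self._semaphore.release()


def run_lazily(limiter: LaneLimiter, make_results: Callable[[], Iterable]) -> Iterator:
    """Return an iterator that takes a permit only when iteration begins."""

    def generate() -> Iterator:
        limiter.acquire()
        try:
            yield from make_results()
        finally:
            limiter.release()  # runs on exhaustion, error, or close()

    return generate()
```

If code that wraps the returned iterator fails before iterating it, no permit has been taken, so nothing can leak.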
2023-05-08 11:42:05 +05:30
Clint Wylie 90ea192d9c
fix bugs with auto encoded long vector deserializers (#14186)
This PR fixes an issue when using 'auto' encoded LONG typed columns and the 'vectorized' query engine. These columns use a delta based bit-packing mechanism, and errors in the vectorized reader would cause it to incorrectly read column values for some bit sizes (1 through 32 bits). This is a regression caused by #11004, which added the optimized readers to improve performance, so impacts Druid versions 0.22.0+.

While writing the test I finally got sad enough about IndexSpec not having a "builder", so I made one, and switched all the things to use it. Apologies for the noise in this bug fix PR, the only real changes are in VSizeLongSerde, and the tests that have been modified to cover the buggy behavior, VSizeLongSerdeTest and ExpressionVectorSelectorsTest. Everything else is just cleanup of IndexSpec usage.
2023-05-01 11:49:27 +05:30
Suneet Saldanha 84c11df980
Make LoggingEmitter more useful by using Markers (#14121)
* Make LoggingEmitter more useful

* Skip code coverage for facade classes

* fix spellcheck

* code review

* fix dependency

* logging.md

* fix checkstyle

* Add back jacoco version to main pom
2023-04-27 15:06:06 -07:00
Adarsh Sanjeev 5aa119dfda
Add retry to opening retrying stream (#14126)
* Add retry to opening retrying stream
* Add retry to S3Entity for network issues

* Fix tests and clean up code
2023-04-27 16:52:22 +05:30
Gian Merlino 42c8c84eb6
TimeBoundary: Use cursor when datasource is not a regular table. (#14151)
* TimeBoundary: Use cursor when datasource is not a regular table.

Fixes a bug where TimeBoundary could return incorrect results with
INNER Join or inline data.

* Addl Javadocs.
2023-04-26 17:00:13 -07:00
Gian Merlino 752475b799
Fix two concurrency issues with segment fetching. (#14042)
* Fix two concurrency issues with segment fetching.

1) SegmentLocalCacheManager: Fix a concurrency issue where certain directory
   cleanup happened outside of directoryWriteRemoveLock. This created the
   possibility that segments would be deleted by one thread, while being
   actively downloaded by another thread.

2) TaskDataSegmentProcessor (MSQ): Fix a concurrency issue when two stages
   in the same process both use the same segment. For example: a self-join
   using distributed sort-merge. Prior to this change, the two stages could
   delete each others' segments.

3) ReferenceCountingResourceHolder: increment() returns a new ResourceHolder,
   rather than a Releaser. This allows it to be passed to callers without them
   having to hold on to both the original ResourceHolder *and* a Releaser.

4) Simplify various interfaces and implementations by using ResourceHolder
   instead of Pair and instead of split-up fields.

* Add test.

* Fix style.

* Remove Releaser.

* Updates from master.

* Add some GuardedBys.

* Use the correct GuardedBy.

* Adjustments.
2023-04-25 20:49:27 -07:00
Gian Merlino 2dfb693d4c
Improved handling for zero-length intervals. (#14136)
* Improved handling for zero-length intervals.

1) Return an empty list from VersionedIntervalTimeline.lookup when
   provided with an empty interval. (The logic doesn't quite work when
   intervals are empty, which led to #14129.)

2) Don't return zero-length intervals from JodaUtils.condenseIntervals.

3) Detect "incorrect" comparator in JodaUtils.condenseIntervals, and
   recreate the SortedSet if needed. (Not strictly related to the theme
   of this patch. Just another thing in the same file.)

4) Remove unused method JodaUtils.containOverlappingIntervals.

Fixes #14129.

* Fix TimewarpOperatorTest.
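The zero-length interval handling in items 1 and 2 of this commit can be sketched as follows. Intervals are modeled as half-open `(start, end)` integer pairs, and the functions are simplified stand-ins written for this example; they are not the JodaUtils or VersionedIntervalTimeline code, and merging abutting intervals is an assumption of the sketch.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), e.g. epoch millis


def condense_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or abutting intervals, dropping zero-length ones."""
    condensed: List[Interval] = []
    for start, end in sorted(intervals):
        if start == end:
            continue  # a zero-length interval covers no time
        if condensed and start <= condensed[-1][1]:
            prev_start, prev_end = condensed[-1]
            condensed[-1] = (prev_start, max(prev_end, end))
        else:
            condensed.append((start, end))
    return condensed


def lookup(timeline: List[Interval], query: Interval) -> List[Interval]:
    """Return timeline entries overlapping the query; an empty query matches nothing."""
    q_start, q_end = query
    if q_start == q_end:
        return []
    return [(s, e) for s, e in timeline if s < q_end and q_start < e]
```

Returning early for an empty query avoids running overlap logic on an interval that cannot overlap anything.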
2023-04-25 17:12:56 -07:00
Gian Merlino 89e7948159
MSQ: Subclass CalciteJoinQueryTest, other supporting changes. (#14105)
* MSQ: Subclass CalciteJoinQueryTest, other supporting changes.

The main change is the new tests: we now subclass CalciteJoinQueryTest
in CalciteSelectJoinQueryMSQTest twice, once for Broadcast and once for
SortMerge.

Two supporting production changes for default-value mode:

1) InputNumberDataSource is marked as concrete, to allow leftFilter to
   be pushed down to it.

2) In default-value mode, numeric frame field readers can now return nulls.
   This is necessary when stacking joins on top of joins: nulls must be
   preserved for semantics that match broadcast joins and native queries.

3) In default-value mode, StringFieldReader.isNull returns true on empty
   strings in addition to nulls. This is more consistent with the behavior
   of the selectors, which map empty strings to null as well in that mode.

As an effect of change (2), the InsertTimeNull change from #14020 (to
replace null timestamps with default timestamps) is reverted. IMO, this
is fine, as either behavior is defensible, and the change from #14020
hasn't been released yet.

* Adjust tests.

* Style fix.

* Additional tests.
2023-04-25 12:10:23 -07:00
Gian Merlino 73f050027b
MSQ: Preserve original ParseException when writing frames. (#14122) 2023-04-25 11:47:15 +05:30
Nicholas Lippis 9d4cc501f7
return task status reported by peon (#14040)
* return task status reported by peon

* Write TaskStatus to file in AbstractTask.cleanUp

* Get TaskStatus from task log

* Fix merge conflicts in AbstractTaskTest

* Add unit tests for TaskLogPusher, TaskLogStreamer, NoopTaskLogs to satisfy code coverage

* Add license headers

* Fix style

* Remove unknown exception declarations
2023-04-24 12:05:39 -07:00
TSFenwick accd5536df
Allow for Log4J to be configured for peons but still ensure console logging is enforced (#14094)
* Allow for Log4J to be configured for peons but still ensure console logging is enforced

This change will allow for log4j to be configured for peons but require console logging is still
configured for them to ensure peon logs are saved to deep storage.

Also fixed the test ConsoleLoggingEnforcementTest to use a valid appender for the non console
Config as the previous config was incorrect and would never return a logger.

* fix checkstyle

* add warning to logger when it overwrites all loggers to be console

* optimize calls for altering logging config for ConsoleLoggingEnforcementConfigurationFactory

add getName to the druid logger class

* update docs, and error message

* edit docs to be more clear

* fix checkstyle issues

* CI fixes - LoggerTest code coverage and fix spelling issue for logging docs
2023-04-24 10:41:56 -07:00
Soumyava 8d60edcfcb
Updating segment map function for QueryDataSource to ensure group by … (#14112)
* Updating segment map function for QueryDataSource to ensure group by of group by of join data source gets into proper segment map function path

* Adding unit tests for the failed case

* There you go coverage bot, be happy now
2023-04-20 13:22:29 -07:00
Gian Merlino 9436ee8a63
Nicer error message for CSV with no properties. (#14093)
* Nicer error message for CSV with no properties.

* Take two.

* Adjustments from review, and test fixes.

* Fix test.

* Fix static check.
2023-04-18 12:52:02 -07:00
Clint Wylie e7d2e8b914
fix bug filtering nested columns with expression filters (#14096) 2023-04-17 14:21:32 -07:00
Gian Merlino facd82b493
Add HLLC tests for empty strings that don't pass. (#14085)
I believe the test case illustrates the cause of the problem in #13950.
2023-04-17 15:46:42 +05:30
Gian Merlino 0884a22c41
MSQ: Support for querying lookup and inline data directly. (#14048)
* MSQ: Support for querying lookup and inline data directly.

Main changes:

1) Add LookupInputSpec and DataSourcePlan.forLookup.

2) Add InlineInputSpec, and modify DataSourcePlan.forInline to use
   this instead of an ExternalInputSpec with JSON. This allows the inline
   data to act as the right-hand side of a join, if needed.

Supporting changes:

1) Modify JoinDataSource's leftFilter validation to be a little less
   strict: it's now OK with leftFilter being attached to any concrete
   leaf (no children) datasource, rather than requiring it be a table.
   This allows MSQ to create JoinDataSource with InputNumberDataSource
   as the base.

2) Add SegmentWranglerModule to CliIndexer, CliPeon. This allows them to
   query lookups and inline data directly.

* Updates based on CI.

* Additional tests.

* Style fix.

* Remove unused import.
2023-04-14 14:04:02 -07:00
Clint Wylie 179e2e8108
adjust useSchemaDiscovery to also include the behavior of includeAllDimensions to support partial schema declaration without having to set two flags (#14076) 2023-04-12 23:12:49 -07:00
Gian Merlino 81074411a9
MSQ: Support multiple result columns with the same name. (#14025)
* MSQ: Support multiple result columns with the same name.

This is allowed in SQL, and is supported by the regular SQL endpoint.
We retain a validation that INSERT ... SELECT does not allow multiple
columns with the same name, because column names in segments must be
unique.
2023-04-13 11:09:39 +05:30
Clint Wylie 9ed8beca5e
bug fixes and add support for boolean inputs to classic long dimension indexer (#14069)
changes:
* adds support for boolean inputs to the classic long dimension indexer, which plays nice with LONG being the semi official boolean type in Druid, and even nicer when druid.expressions.useStrictBooleans is set to true, since the sampler when using the new 'auto' schema when 'useSchemaDiscovery' is specified on the dimensions spec will call the type out as LONG
* fix bugs with sampler response and new schema discovery stuff incorrectly using classic 'json' type for the logical schema instead of the new 'auto' type
2023-04-11 20:49:52 -07:00
Clint Wylie 29652bd246
fix NPE that can happen when merging all null nested v4 format columns (#14068) 2023-04-11 19:04:51 -07:00
Clint Wylie d61bd7f8f1
fix bug in nested v4 format merger from refactoring (#14053) 2023-04-10 20:38:58 -07:00
Clint Wylie 1aef72aa7e
Bump up the version in pom to 27.0.0 in preparation of release (#14051) 2023-04-10 14:56:59 +05:30
Gian Merlino d52bc333aa
Frames: Ensure nulls are read as default values when appropriate. (#14020)
* Frames: Ensure nulls are read as default values when appropriate.

Fixes a bug where LongFieldWriter didn't write a properly transformed
zero when writing out a null. This had no meaningful effect in SQL-compatible
null handling mode, because the field would get treated as a null anyway.
But it does have an effect in default-value mode: it would cause Long.MIN_VALUE
to get read out instead of zero.

Also adds NullHandling checks to the various frame-based column selectors,
allowing reading of nullable frames by servers in default-value mode.
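One way a missing transform could surface as Long.MIN_VALUE is sketched below. It assumes that the "transform" is a sign-bit flip applied so that encoded bytes sort in numeric order; the commit message does not spell this out, so treat the encoding and the function names as assumptions for illustration.

```python
import struct

SIGN_BIT = 1 << 63
MASK_64 = (1 << 64) - 1


def encode_long(value: int) -> bytes:
    """Flip the sign bit and write big-endian so byte order matches numeric order."""
    return struct.pack(">Q", (value & MASK_64) ^ SIGN_BIT)


def decode_long(data: bytes) -> int:
    """Undo the sign-bit flip and reinterpret as a signed 64-bit integer."""
    raw = struct.unpack(">Q", data)[0] ^ SIGN_BIT
    return raw - (1 << 64) if raw & SIGN_BIT else raw


# A correctly transformed zero round-trips to zero.
properly_written_null = encode_long(0)

# Eight untransformed zero bytes decode to the minimum signed 64-bit value.
untransformed_null = b"\x00" * 8
```

Under this assumption, a writer that emits raw zero bytes for a null produces a field that a default-value-mode reader decodes as the smallest long rather than 0.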
2023-04-10 05:28:46 +05:30
Clint Wylie f41468fd46
fix off by one error in FrontCodedIndexedWriter and FrontCodedIntArrayIndexedWriter getCardinality method (#14047)
* fix off by one error in FrontCodedIndexedWriter and FrontCodedIntArrayIndexedWriter getCardinality method
2023-04-07 03:11:15 -07:00
zachjsh 5c0221375c
Allow for Input source security in native task layer (#14003)
Fixes #13837.

### Description

This change allows for input source type security in the native task layer.

To enable this feature, the user must set the following property to true:

`druid.auth.enableInputSourceSecurity=true`

The default value for this property is false, which will continue the existing functionality of needing authorization to write to the respective datasource.

When this config is enabled, the users will be required to be authorized for the following resource action, in addition to write permission on the respective datasource.

`new ResourceAction(new Resource(ResourceType.EXTERNAL, {INPUT_SOURCE_TYPE}), Action.READ)`

where `{INPUT_SOURCE_TYPE}` is the type of the input source being used, such as http, inline, or s3.

Only tasks that provide a non-default implementation of the `getInputSourceResources` method can be submitted when config `druid.auth.enableInputSourceSecurity=true` is set. Otherwise, a 400 error will be thrown.
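The permission rule described above can be modeled in a few lines. This is a simplified sketch, not Druid's authorization code: the tuple layout, `BadRequest`, and `required_actions` are hypothetical, and passing `None` stands in for a task that does not declare its input source resources.

```python
from typing import List, Optional, Set, Tuple

# (resource type, resource name, action); layout chosen for this sketch
ResourceAction = Tuple[str, str, str]


class BadRequest(Exception):
    """Hypothetical error corresponding to an HTTP 400 response."""

    status_code = 400


def required_actions(
    datasource: str,
    input_source_types: Optional[Set[str]],
    enable_input_source_security: bool,
) -> List[ResourceAction]:
    """List the resource actions a user must be authorized for to submit a task."""
    actions: List[ResourceAction] = [("DATASOURCE", datasource, "WRITE")]
    if not enable_input_source_security:
        return actions  # existing behavior: datasource write permission only
    if input_source_types is None:
        # Task does not declare its input source resources: reject it.
        raise BadRequest("task does not declare its input source resources")
    for source_type in sorted(input_source_types):
        actions.append(("EXTERNAL", source_type, "READ"))
    return actions
```

With the flag off, only datasource write permission is required; with it on, a task reading from s3 additionally needs read permission on the EXTERNAL resource named s3.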
2023-04-06 13:13:09 -04:00
Abhishek Agarwal 92912a6a2b
JOIN or UNNEST queries over tombstone segment can fail (#14021)
JOIN and UNNEST queries over a tombstone segment can fail
2023-04-06 16:55:58 +05:30
Clint Wylie b11c0bc249
smarter nested column index utilization (#13977)
* smarter nested column index utilization
changes:
* adds skipValueRangeIndexScale and skipValuePredicateIndexScale to ColumnConfig (e.g. DruidProcessingConfig) available as system config via druid.processing.indexes.skipValueRangeIndexScale and druid.processing.indexes.skipValuePredicateIndexScale
* NestedColumnIndexSupplier uses skipValueRangeIndexScale and skipValuePredicateIndexScale to multiply by the total number of rows to be processed to determine the threshold at which we should no longer consider using bitmap indexes because it will be too many operations
* Default values for skipValueRangeIndexScale and skipValuePredicateIndexScale have been initially set to 0.08, but are separate to allow independent tuning
* these are not documented on purpose yet because they are kind of hard to explain, they mainly exist to help conduct larger scale experiments than the jmh benchmarks used to derive the initial set of values
* these changes provide a pretty sweet performance boost for filter processing on nested columns
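The threshold described in the changes above reduces to a single multiplication. In this sketch the function name, the parameter names, and the inclusive comparison are assumptions; only the scale-times-rows idea and the 0.08 default come from the commit message.

```python
def should_use_bitmap_index(
    num_matching_values: int,
    total_rows: int,
    skip_scale: float = 0.08,
) -> bool:
    """Use the index only while the values to combine stay under the threshold.

    The threshold is the scale multiplied by the total number of rows to be
    processed; above it, combining bitmaps is assumed to cost too many
    operations to be worthwhile.
    """
    threshold = skip_scale * total_rows
    return num_matching_values <= threshold
```

For 1,000,000 rows and the default scale, the threshold is about 80,000 values: a range matching 50,000 distinct values would still use bitmap indexes, while one matching 100,000 would skip them.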
2023-04-06 04:09:24 -07:00
Gian Merlino 319f99db05
Always use file sizes when determining batch ingest splits (#13955)
* Always use file sizes when determining batch ingest splits.

Main changes:

1) Update CloudObjectInputSource and its subclasses (S3, GCS,
   Azure, Aliyun OSS) to use SplitHintSpecs in all cases. Previously, they
   were only used for prefixes, not uris or objects.

2) Update ExternalInputSpecSlicer (MSQ) to consider file size. Previously,
   file size was ignored; all files were treated as equal weight when
   determining splits.

A side effect of these changes is that we'll make additional network
calls to find the sizes of objects when users specify URIs or objects
as opposed to prefixes. IMO, this is worth it because it's the only way
to respect the user's split hint and task assignment settings.

Secondary changes:

1) S3, Aliyun OSS: Use getObjectMetadata instead of listObjects to get
   metadata for a single object. This is a simpler call that is also
   expected to be less expensive.

2) Azure: Fix a bug where getBlobLength did not populate blob
   reference attributes, and therefore would not actually retrieve the
   blob length.

3) MSQ: Align dynamic slicing logic between ExternalInputSpecSlicer and
   TableInputSpecSlicer.

4) MSQ: Adjust WorkerInputs to ensure there is always at least one
   worker, even if it has a nil slice.

* Add msqCompatible to testGroupByWithImpossibleTimeFilter.

* Fix tests.

* Add additional tests.

* Remove unused stuff.

* Remove more unused stuff.

* Adjust thresholds.

* Remove irrelevant test.

* Fix comments.

* Fix bug.

* Updates.
2023-04-05 08:54:01 -07:00
Clint Wylie d21babc5b8
remix nested columns (#14014)
changes:
* introduce ColumnFormat to separate physical storage format from logical type. ColumnFormat is now used instead of ColumnCapabilities to get column handlers for segment creation
* introduce new 'auto' type indexer and merger which produces a new common nested format of columns, which is the next logical iteration of the nested column stuff. Essentially this is an automatic type column indexer that produces the most appropriate column for the given inputs, making either STRING, ARRAY<STRING>, LONG, ARRAY<LONG>, DOUBLE, ARRAY<DOUBLE>, or COMPLEX<json>.
* revert NestedDataColumnIndexer, NestedDataColumnMerger, NestedDataColumnSerializer to their version pre #13803 behavior (v4) for backwards compatibility
* fix a bug in RoaringBitmapSerdeFactory if anything actually ever wrote out an empty bitmap using toBytes and then later tried to read it (the nerve!)
2023-04-04 17:51:59 -07:00
Karan Kumar 217b0f6832
Eagerly fetching remote s3 files leading to out of disk (OOD) (#13981)
* Eagerly fetching remote s3 files leading to OOD.
2023-04-03 14:10:37 +05:30
Clint Wylie 518698a952
lower segment heap footprint and fix bug with expression type coercion (#14002) 2023-03-31 13:53:22 -07:00
Clint Wylie e3211e3be0
actually backwards compatible frontCoded string encoding strategy (#13996) 2023-03-31 02:24:12 -07:00
Soumyava 1eeecf5fb2
Fixing regression issues on unnest (#13976)
* select sum(c) on an unnested column now does not return 'Type mismatch' error and works properly
* Making sure an inner join query works properly
* Having on unnested column with a group by now works correctly
* count(*) on an unnested query now works correctly
2023-03-31 09:06:43 +05:30
Karan Kumar 8dce3ca4d5
OOM fix for running MSQ jobs with `intermediateSuperSorterStorageMaxLocalBytes` set (#13974)
While using intermediateSuperSorterStorageMaxLocalBytes the super sorter was retaining references of the memory allocator.

The fix clears the current outputChannel when close() is called on the ComposingWritableFrameChannel.java
2023-03-29 18:00:00 +05:30
Clint Wylie 2219e68fa3
add backwards compat mode for frontCoded stringEncodingStrategy (#13988) 2023-03-28 14:44:44 -07:00
Paul Rogers 76fe26d4ba
Fix typos, add tests for http() function (#13954) 2023-03-28 14:41:06 -07:00
Karan Kumar c2fe6a4956
Reworking s3 connector with various improvements (#13960)
* Reworking s3 connector with
1. Adding retries
2. Adding max fetch size
3. Using s3Utils for most of the api's
4. Fixing bugs in DurableStorageCleaner
5. Moving to Iterator for listDir call
2023-03-28 17:05:16 +05:30
Clint Wylie d5b1b5bc8e
nested columns + arrays = array columns! (#13803)
array columns!
changes:
* add support for storing nested arrays of string, long, and double values as specialized nested columns instead of breaking them into separate element columns
* nested column type mimic behavior means that columns ingested with only root arrays of primitive values will be ARRAY typed columns
* neat test refactor stuff
* add v4 segment test
* add array element indexes
* add tests for unnest and array columns
* fix unnest column value selector cursor handling of null and empty arrays
2023-03-27 12:42:35 -07:00
abhagraw c52d15d65d
Fixing security vulnerability check errors (#13956)
* Fixing security vulnerability check errors

* Updating javax.el to jakarta.el

* Adding cron job trigger on changes to suppressions file
2023-03-23 11:10:06 +05:30
Soumyava 2ad133c06e
Unnest changes for moving the filter on right side of correlate to inside the unnest datasource (#13934)
* Refactoring and bug fixes on top of unnest. The filter now is passed inside the unnest cursors. Added tests for scenarios such as
1. filter on unnested column which involves a left filter rewrite
2. filter on unnested virtual column which pushes the filter to the right only and involves no rewrite
3. not filters
4. SQL functions applied on top of unnested column
5. null present in first row of the column to be unnested
2023-03-22 18:24:00 -07:00
Clint Wylie f4392a3155
expression transform improvements and fixes (#13947)
changes:
* fixes inconsistent handling of byte[] values between ExprEval.bestEffortOf and ExprEval.ofType, which could cause byte[] values to end up as java toString values instead of base64 encoded strings in ingest time transforms
* improved ExpressionTransform binding to re-use ExprEval.bestEffortOf when evaluating a binding instead of throwing it away
* improved ExpressionTransform array handling, added RowFunction.evalDimension that returns List<String> to back Row.getDimension and remove the automatic coercing of array types that would typically happen to expression transforms unless using Row.getDimension
* added some tests for ExpressionTransform with array inputs
* improved ExpressionPostAggregator to use partial type information from decoration
* migrate some test uses of InputBindings.forMap to use other methods
2023-03-21 23:26:53 -07:00
Clint Wylie ed57c5c853
better FrontCodedIndexed (#13854)
* Adds new implementation of 'frontCoded' string encoding strategy, which writes out a v1 FrontCodedIndexed which stores buckets on a prefix of the previous value instead of the first value in the bucket
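
To illustrate what "a prefix of the previous value" means, here is a minimal sketch of incremental front coding. It is not Druid's on-disk format: `FrontCodingSketch` and `Entry` are hypothetical names, and it works on Java `String` characters rather than encoded bytes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the FrontCodedIndexed implementation.
final class FrontCodingSketch
{
  // One encoded value: how many leading chars it shares with the previous value, plus the rest.
  static final class Entry
  {
    final int prefixLength;
    final String suffix;

    Entry(int prefixLength, String suffix)
    {
      this.prefixLength = prefixLength;
      this.suffix = suffix;
    }
  }

  // Encode a sorted bucket, measuring each shared prefix against the previous value.
  static List<Entry> encode(List<String> sortedBucket)
  {
    List<Entry> out = new ArrayList<>();
    String previous = "";
    for (String value : sortedBucket) {
      int shared = 0;
      int max = Math.min(previous.length(), value.length());
      while (shared < max && previous.charAt(shared) == value.charAt(shared)) {
        shared++;
      }
      out.add(new Entry(shared, value.substring(shared)));
      previous = value;
    }
    return out;
  }

  // Decode by rebuilding each value from the one decoded just before it.
  static List<String> decode(List<Entry> entries)
  {
    List<String> out = new ArrayList<>();
    String previous = "";
    for (Entry e : entries) {
      String value = previous.substring(0, e.prefixLength) + e.suffix;
      out.add(value);
      previous = value;
    }
    return out;
  }
}
```

In this sketch, a bucket of "druid", "druidism", "drum" is stored as (0, "druid"), (5, "ism"), (3, "m"). The cost of the scheme is that reading a value in the middle of a bucket requires decoding the values before it.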
2023-03-14 18:14:11 -07:00
somu-imply a7ba361666
Refactoring and bug fixes on top of unnest. The allowList now is not passed … (#13922)
* Refactoring and bug fixes on top of unnest. The filter now is passed inside the unnest cursors. Added tests for scenarios such as
1. filter on unnested column which involves a left filter rewrite
2. filter on unnested virtual column which pushes the filter to the right only and involves no rewrite
3. not filters
4. SQL functions applied on top of unnested column
5. null present in first row of the column to be unnested
2023-03-14 16:05:56 -07:00
Gian Merlino 4b1ffbc452
Various changes and fixes to UNNEST. (#13892)
* Various changes and fixes to UNNEST.

Native changes:

1) UnnestDataSource: Replace "column" and "outputName" with "virtualColumn".
   This enables pushing expressions into the datasource. This in turn
   allows us to do the next thing...

2) UnnestStorageAdapter: Logically apply query-level filters and virtual
   columns after the unnest operation. (Physically, filters are pulled up,
   when possible.) This is beneficial because it allows filters and
   virtual columns to reference the unnested column, and because it is
   consistent with how the join datasource works.

3) Various documentation updates, including declaring "unnest" as an
   experimental feature for now.

SQL changes:

1) Rename DruidUnnestRel (& Rule) to DruidUnnestRel (& Rule). The rel
   is simplified: it only handles the UNNEST part of a correlated join.
   Constant UNNESTs are handled with regular inline rels.

2) Rework DruidCorrelateUnnestRule to focus on pulling Projects from
   the left side up above the Correlate. New test testUnnestTwice verifies
   that this works even when two UNNESTs are stacked on the same table.

3) Include ProjectCorrelateTransposeRule from Calcite to encourage
   pushing mappings down below the left-hand side of the Correlate.

4) Add a new CorrelateFilterLTransposeRule and CorrelateFilterRTransposeRule
   to handle pulling Filters up above the Correlate. New tests
   testUnnestWithFiltersOutside and testUnnestTwiceWithFilters verify
   this behavior.

5) Require a context feature flag for SQL UNNEST, since it's undocumented.
   As part of this, also cleaned up how we handle feature flags in SQL.
   They're now hooked into EngineFeatures, which is useful because not
   all engines support all features.
2023-03-10 16:42:08 +05:30
imply-cheddar 6b90a320cf
Add back function signature for compat (#13914)
* Add back function signature for compat

* Suppress IntelliJ Error
2023-03-09 21:06:34 -08:00
Laksh Singla 5b0b3a9b2c
Add a readOnly() method for PartitionedOutputChannel (#13755)
With SuperSorter using the PartitionedOutputChannels for sorting, it might OOM on inputs of reasonable size because the channel consists of both the writable frame channel and the frame allocator, both of which are not required once the output channel has been written to.
This change adds a readOnly to the output channel which contains only the readable channel, due to which unnecessary memory references to the writable channel and the memory allocator are lost once the output channel has been written to, preventing the OOM.
2023-03-10 06:58:00 +05:30
Gian Merlino bf39b4d313
Window planning: use collation traits, improve subquery logic. (#13902)
* Window planning: use collation traits, improve subquery logic.

SQL changes:

1) Attach RelCollation (sorting) trait to any PartialDruidQuery
   that ends in AGGREGATE or AGGREGATE_PROJECT. This allows planning to
   take advantage of the fact that Druid sorts by dimensions when
   doing aggregations.

2) Windowing: inspect RelCollation trait from input, and insert naiveSort
   if, and only if, necessary.

3) Windowing: add support for Project after Window, when the Project
   is a simple mapping. Helps eliminate subqueries.

4) DruidRules: update logic for considering subqueries to reflect that
   subqueries are not required to be GroupBys, and that we have a bunch
   of new Stages now. With all of this evolution that has happened, the
   old logic didn't quite make sense.

Native changes:

1) Use merge sort (stable) rather than quicksort when sorting
   RowsAndColumns. Makes it easier to write test cases for plans that
   involve re-sorting the data.
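
A stable sort keeps rows with equal keys in their input order, which is what makes re-sorted output predictable in tests. A minimal sketch using the JDK's `List.sort`, which is documented as stable; `StableSortSketch` is a hypothetical name and the `{key, originalPosition}` row layout is an assumption for illustration, not the RowsAndColumns code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; rows are {key, originalPosition} pairs.
final class StableSortSketch
{
  // Sort by key only. Because List.sort is stable, ties keep their input order.
  static List<int[]> sortByKey(List<int[]> rows)
  {
    List<int[]> copy = new ArrayList<>(rows);
    copy.sort(Comparator.comparingInt((int[] row) -> row[0]));
    return copy;
  }
}
```

With input rows (2,0), (1,1), (2,2), (1,3), the result is (1,1), (1,3), (2,0), (2,2): within each key, the original positions stay ascending. An unstable sort gives no such guarantee for ties.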

* Changes from review.

* Mark the bad test as failing.

* Additional update.

* Fix failingTest.

* Fix tests.

* Mark a var final.
2023-03-09 15:48:13 -08:00
Gian Merlino fe9d0c46d5
Improve memory efficiency of WrappedRoaringBitmap. (#13889)
* Improve memory efficiency of WrappedRoaringBitmap.

Two changes:

1) Use an int[] for sizes 4 or below.
2) Remove the boolean compressRunOnSerialization. Doesn't save much
   space, but it does save a little, and it isn't adding a ton of value
   to have it be configurable. It was originally configurable in case
   anything broke when enabling it, but it's been a while and nothing
   has broken.

* Slight adjustment.

* Adjust for inspection.

* Updates.

* Update snaps.

* Update test.

* Adjust test.

* Fix snaps.
2023-03-09 15:48:02 -08:00
Clint Wylie 48ac5ce50b
use native nvl expression for SQL NVL and 2 argument COALESCE (#13897)
* use custom case operator conversion instead of direct operator conversion, to produce native nvl expression for SQL NVL and 2 argument COALESCE, and add optimization for certain case filters from coalesce and nvl statements
2023-03-09 05:46:17 -08:00
Gian Merlino 82f7a56475
Sort-merge join and hash shuffles for MSQ. (#13506)
* Sort-merge join and hash shuffles for MSQ.

The main changes are in the processing, multi-stage-query, and sql modules.

processing module:

1) Rename SortColumn to KeyColumn, replace boolean descending with KeyOrder.
   This makes it nicer to model hash keys, which use KeyOrder.NONE.

2) Add nullability checkers to the FieldReader interface, and an
   "isPartiallyNullKey" method to FrameComparisonWidget. The join
   processor uses this to detect null keys.

3) Add WritableFrameChannel.isClosed and OutputChannel.isReadableChannelReady
   so callers can tell which OutputChannels are ready for reading and which
   aren't.

4) Specialize FrameProcessors.makeCursor to return FrameCursor, a random-access
   implementation. The join processor uses this to rewind when it needs to
   replay a set of rows with a particular key.

5) Add MemoryAllocatorFactory, which is embedded inside FrameWriterFactory
   instead of a particular MemoryAllocator. This allows FrameWriterFactory
   to be shared in more scenarios.

multi-stage-query module:

1) ShuffleSpec: Add hash-based shuffles. New enum ShuffleKind helps callers
   figure out what kind of shuffle is happening. The change from SortColumn
   to KeyColumn allows ClusterBy to be used for both hash-based and sort-based
   shuffling.

2) WorkerImpl: Add ability to handle hash-based shuffles. Refactor the logic
   to be more readable by moving the work-order-running code to the inner
   class RunWorkOrder, and the shuffle-pipeline-building code to the inner
   class ShufflePipelineBuilder.

3) Add SortMergeJoinFrameProcessor and factory.

4) WorkerMemoryParameters: Adjust logic to reserve space for output frames
   for hash partitioning. (We need one frame per partition.)

sql module:

1) Add sqlJoinAlgorithm context parameter; can be "broadcast" or
   "sortMerge". With native, it must always be "broadcast", or it's a
   validation error. MSQ supports both. Default is "broadcast" in
   both engines.

2) Validate that MSQs do not use broadcast join with RIGHT or FULL join,
   as results are not correct for broadcast join with those types. Allow
   this in native for two reasons: legacy (the docs caution against it,
   but it's always been allowed), and the fact that it actually *does*
   generate correct results in native when the join is processed on the
   Broker. It is much less likely that MSQ will plan in such a way that
   generates correct results.

3) Remove subquery penalty in DruidJoinQueryRel when using sort-merge
   join, because subqueries are always required, so there's no reason
   to penalize them.

4) Move previously-disabled join reordering and manipulation rules to
   FANCY_JOIN_RULES, and enable them when using sort-merge join. Helps
   get to better plans where projections and filters are pushed down.

* Work around compiler problem.

* Updates from static analysis.

* Fix @param tag.

* Fix declared exception.

* Fix spelling.

* Minor adjustments.

* wip

* Merge fixups

* fixes

* Fix CalciteSelectQueryMSQTest

* Empty keys are sortable.

* Address comments from code review. Rename mux -> mix.

* Restore inspection config.

* Restore original doc.

* Reorder imports.

* Adjustments

* Fix.

* Fix imports.

* Adjustments from review.

* Update header.

* Adjust docs.
2023-03-08 14:19:39 -08:00
Abhishek Agarwal 52bd9e6adb
Improved error message when topic name changes within same supervisor (#13815)
Improved error message when topic name changes within same supervisor

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>
2023-03-07 18:10:18 -08:00
Gian Merlino fcfb7b8ff6
Add warning comments to Granularity.getIterable. (#13888)
This function is notorious for causing memory exhaustion and excessive
CPU usage; so much so that it was valuable to work around it in the
SQL planner in #13206. Hopefully, a warning comment will encourage
developers to stay away and come up with solutions that do not involve
computing all possible buckets.
2023-03-06 22:57:10 -08:00
Anshu Makkar a10e4150d5
Add Post Aggregators for Tuple Sketches (#13819)
You can now do the following operations with TupleSketches in Post Aggregation Step

Get the Sketch Output as Base64 String
Provide a constant Tuple Sketch in post-aggregation step that can be used in Set Operations
Get the Estimated Value(Sum) of Summary/Metrics Objects associated with Tuple Sketch
2023-03-03 09:32:09 +05:30
Tejaswini Bandlamudi 7103cb4b9d
Removes FiniteFirehoseFactory and its implementations (#12852)
The FiniteFirehoseFactory and InputRowParser classes were deprecated in 0.17.0 (#8823) in favor of InputSource & InputFormat. This PR removes the FiniteFirehoseFactory and all its implementations along with classes solely used by them like Fetcher (Used by PrefetchableTextFilesFirehoseFactory). Refactors classes including tests using FiniteFirehoseFactory to use InputSource instead.
Removing InputRowParser may not be as trivial, because many classes that aren't deprecated depend on it (with no alternatives), like EventReceiverFirehoseFactory. Hence FirehoseFactory, EventReceiverFirehoseFactory, and Firehose are marked deprecated.
2023-03-02 18:07:17 +05:30
Clint Wylie 6cf754b0e0
move numeric null value coercion out of expression processing engine (#13809)
* move numeric null value coercion out of expression processing engine
* add ExprEval.valueOrDefault() to allow consumers to automatically coerce to default values
* rename Expr.buildVectorized as Expr.asVectorProcessor for more consistent naming with Function and ApplyFunction; javadocs for some stuff
2023-02-28 18:10:07 -08:00
Clint Wylie 1d8fff4096
sampler + type detection = bff (#13711)
* sampler + type detection = bff
* split logical and physical dimensions, tidy up
2023-02-28 04:14:30 -08:00
hqx871 79f04e71a1
Hadoop based batch ingestion support range partition (#13303)
This PR implements range partitioning for Hadoop-based ingestion. For details about multi-dimension range partitioning, see #11848.
2023-02-23 11:38:03 +05:30
Kashif Faraz 3a67a43c8a
Add method SegmentTimeline.addSegments (#13831) 2023-02-21 23:58:01 -08:00
Clint Wylie 614205f3bc
fix some intellij inspections in druid-processing (#13823)
fix some intellij inspections in druid-processing
2023-02-21 09:02:02 +05:30
Gian Merlino 882ae9f002
Speed up composite key joins on IndexedTable. (#13516)
* Speed up composite key joins on IndexedTable.

Prior to this patch, IndexedTable indexes are sorted IntList. This works
great when we have a single-column join key: we simply retrieve the list
and we know what rows match. However, when we have a composite key, we
need to merge the sorted lists. This is inefficient when one is very dense
and others are very sparse.

This patch switches from sorted IntList to IntSortedSet, and changes
to the following intersection algorithm:

1) Initialize the intersection set to the smallest matching set from the
   various parts of the composite key.

2) For each element in that smallest set, check other sets for that element.
   If any do *not* include it, then remove the element from the intersection
   set.

This way, complexity scales with the size of the smallest set, not the
largest one.
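
The two steps above can be sketched as follows. This is a minimal illustration, not the IndexedTable code: `SmallestFirstIntersection` is a hypothetical name, and `java.util.SortedSet<Integer>` stands in for the primitive IntSortedSet the patch refers to.

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration only; uses boxed JDK sets instead of primitive int sets.
final class SmallestFirstIntersection
{
  // Intersect the row-number sets matched by each part of a composite key.
  static SortedSet<Integer> intersect(List<SortedSet<Integer>> sets)
  {
    if (sets.isEmpty()) {
      return new TreeSet<>();
    }
    // Step 1: start from a copy of the smallest matching set.
    SortedSet<Integer> smallest = sets.get(0);
    for (SortedSet<Integer> candidate : sets) {
      if (candidate.size() < smallest.size()) {
        smallest = candidate;
      }
    }
    SortedSet<Integer> result = new TreeSet<>(smallest);
    // Step 2: remove any element that some other set does not contain.
    for (SortedSet<Integer> other : sets) {
      if (other != smallest) {
        result.removeIf(row -> !other.contains(row));
      }
    }
    return result;
  }
}
```

Only elements of the smallest set are ever tested, so a very dense set costs one membership check per surviving candidate rather than a full scan.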

* RangeIntSet stuff.
2023-02-17 22:01:01 -08:00
Clint Wylie 08b5951cc5
merge druid-core, extendedset, and druid-hll into druid-processing to simplify everything (#13698)
* merge druid-core, extendedset, and druid-hll into druid-processing to simplify everything
* fix poms and license stuff
* mockito is evil
* allow reset of JvmUtils RuntimeInfo if tests used static injection to override
2023-02-17 14:27:41 -08:00
Paul Rogers 333196d207
Code cleanup & message improvements (#13778)
* Misc cleanup edits

Correct spacing
Add type parameters
Add toString() methods to formats so tests compare correctly
IT doc revisions
Error message edits
Display UT query results when tests fail

* Edit

* Build fix

* Build fixes
2023-02-15 15:22:54 +05:30
Suneet Saldanha f67abf2e99
Better logs for query errors (#13776)
* Better logs for query errors

* checkstyle
2023-02-14 15:55:58 -08:00
Clint Wylie fa4cab405f
fix bug with sql planner when virtual column capabilities are null (#13797) 2023-02-13 18:27:23 -08:00
Clint Wylie f09f83697d
fix array_agg to work with complex types and bugs with expression aggregator complex array handling (#13781)
* fix array_agg to work with complex types and bugs with expression aggregator complex array handling
* more consistent handling of array expressions, numeric arrays more consistently honor druid.generic.useDefaultValueForNull, fix array_ordinal sql output type
2023-02-12 22:01:39 -08:00
Clint Wylie ffeda72abb
fix filtering nested field virtual column when used with non nested column input (#13779)
* fix filtering nested field virtual column when used with non nested column input
2023-02-09 03:16:38 -08:00
Suneet Saldanha 714ac07b52
Allow users to add additional metadata to ingestion metrics (#13760)
* Allow users to add additional metadata to ingestion metrics

When submitting an ingestion spec, users may pass a map of metadata
in the ingestion spec config that will be added to ingestion metrics.

This will make it possible for operators to tag metrics with other
metadata that doesn't necessarily line up with the existing tags
like taskId.

Druid clusters that ingest these metrics can take advantage of the
nested data columns feature to process this additional metadata.

* rename to tags

* docs

* tests

* fix test

* make code cov happy

* checkstyle
2023-02-08 18:07:23 -08:00
Clint Wylie 2d3bee8545
various nested column (and other) fixes (#13732)
changes:
* modified druid schema column type computation to special case COMPLEX<json> handling to choose COMPLEX<json> if any column in any segment is COMPLEX<json>
* NestedFieldVirtualColumn can now work correctly on any type of column, returning either a column selector if a root path, or nil selector if not
* fixed a bug in NilVectorSelector where, when using a vector size larger than the default with druid.generic.useDefaultValueForNull=false, the nulls vector would be set to all false instead of true
* fixed an overly aggressive check in ExprEval.ofType when handling complex types which would try to treat any string as base64 without gracefully falling back if it was not in fact base64 encoded, along with special handling for complex<json>
* added ExpressionVectorSelectors.castValueSelectorToObject and ExpressionVectorSelectors.castObjectSelectorToNumeric as convenience methods to cast vector selectors using cast expressions without the trouble of constructing an expression. the polymorphic nature of the non-vectorized engine (and significantly larger overhead of non-vectorized expression processing) made adding similar methods for non-vectorized selectors less attractive and so have not been added at this time
* fix inconsistency between nested column indexer and serializer in handling values (coerce non primitive and non arrays of primitives using asString)
* ExprEval best effort mode now handles byte[] as string
* added test for ExprEval.bestEffortOf, and add missing conversion cases that tests uncovered
* more tests more better
2023-02-06 19:48:02 -08:00
imply-cheddar 9c5b61e114
Fallback virtual column (#13739)
* Fallback virtual column

This virtual column enables falling back to another column if
the original column doesn't exist.  This is useful when doing
column migrations and you have some old data with column X,
new data with column Y and you want to use Y if it exists, X
otherwise so that you can run a consistent query against all of
the data.
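
The selection rule described above amounts to a per-segment check. A minimal sketch, assuming a hypothetical `FallbackColumnSketch.resolve` helper that is given the set of column names present in one segment (this is not the virtual column's actual API):

```java
import java.util.Set;

// Hypothetical illustration only; not the fallback virtual column implementation.
final class FallbackColumnSketch
{
  // Read `preferred` when the segment has it, otherwise read `fallback`.
  static String resolve(Set<String> segmentColumns, String preferred, String fallback)
  {
    return segmentColumns.contains(preferred) ? preferred : fallback;
  }
}
```

For the migration in the message, an old segment containing only X resolves to X, and a new segment containing Y resolves to Y, so one query reads consistently across both.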
2023-02-06 19:36:50 -08:00
Jason Koch 7a3bd89a85
Dimension dictionary reduce locking (#13710)
* perf: introduce benchmark for StringDimensionIndexer

jdk11 -- Benchmark                                                       Mode  Cnt      Score     Error  Units
StringDimensionIndexerProcessBenchmark.parallelReadWrite                 avgt   10  30471.552 ±  456.716  us/op
StringDimensionIndexerProcessBenchmark.parallelReadWrite:parallelReader  avgt   10  18069.863 ±  327.923  us/op
StringDimensionIndexerProcessBenchmark.parallelReadWrite:parallelWriter  avgt   10  67676.617 ± 2351.311  us/op
StringDimensionIndexerProcessBenchmark.soloReader                        avgt   10   1048.079 ±    1.120  us/op
StringDimensionIndexerProcessBenchmark.soloWriter                        avgt   10   4629.769 ±   29.353  us/op

* perf: switch DimensionDictionary to StampedLock

jdk11 - Benchmark                                                        Mode  Cnt      Score      Error  Units
StringDimensionIndexerProcessBenchmark.parallelReadWrite                 avgt   10  37958.372 ± 1685.206  us/op
StringDimensionIndexerProcessBenchmark.parallelReadWrite:parallelReader  avgt   10  31192.232 ± 2755.365  us/op
StringDimensionIndexerProcessBenchmark.parallelReadWrite:parallelWriter  avgt   10  58256.791 ± 1998.220  us/op
StringDimensionIndexerProcessBenchmark.soloReader                        avgt   10   1079.440 ±    1.753  us/op
StringDimensionIndexerProcessBenchmark.soloWriter                        avgt   10   4585.690 ±   13.225  us/op

* perf: use optimistic locking in DimensionDictionary

jdk11 - Benchmark                                                        Mode  Cnt      Score     Error  Units
StringDimensionIndexerProcessBenchmark.parallelReadWrite                 avgt   10   6212.366 ± 162.684  us/op
StringDimensionIndexerProcessBenchmark.parallelReadWrite:parallelReader  avgt   10   1807.235 ± 109.339  us/op
StringDimensionIndexerProcessBenchmark.parallelReadWrite:parallelWriter  avgt   10  19427.759 ± 611.692  us/op
StringDimensionIndexerProcessBenchmark.soloReader                        avgt   10    194.370 ±   1.050  us/op
StringDimensionIndexerProcessBenchmark.soloWriter                        avgt   10   2871.423 ±  14.426  us/op

* perf: refactor DimensionDictionary null handling to need less locks

jdk11 - Benchmark                                                        Mode  Cnt      Score      Error  Units
StringDimensionIndexerProcessBenchmark.parallelReadWrite                 avgt   10   6591.619 ±  470.497  us/op
StringDimensionIndexerProcessBenchmark.parallelReadWrite:parallelReader  avgt   10   1387.338 ±  144.587  us/op
StringDimensionIndexerProcessBenchmark.parallelReadWrite:parallelWriter  avgt   10  22204.462 ± 1620.806  us/op
StringDimensionIndexerProcessBenchmark.soloReader                        avgt   10    204.911 ±    0.459  us/op
StringDimensionIndexerProcessBenchmark.soloWriter                        avgt   10   2935.376 ±   12.639  us/op

* perf: refactor DimensionDictionary add handling to do a little less work

jdk11 - Benchmark                                                        Mode  Cnt      Score    Error  Units
StringDimensionIndexerProcessBenchmark.parallelReadWrite                 avgt   10   2914.859 ± 22.519  us/op
StringDimensionIndexerProcessBenchmark.parallelReadWrite:parallelReader  avgt   10    508.010 ± 14.675  us/op
StringDimensionIndexerProcessBenchmark.parallelReadWrite:parallelWriter  avgt   10  10135.408 ± 82.745  us/op
StringDimensionIndexerProcessBenchmark.soloReader                        avgt   10    205.415 ±  0.158  us/op
StringDimensionIndexerProcessBenchmark.soloWriter                        avgt   10   3098.743 ± 23.603  us/op
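
The optimistic-locking step in this series follows the usual `java.util.concurrent.locks.StampedLock` pattern: read without blocking, validate the stamp, and fall back to a real read lock only if a writer intervened. A minimal sketch, assuming a hypothetical `OptimisticDictionary` class that is far simpler than DimensionDictionary:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.StampedLock;

// Hypothetical illustration only; not the DimensionDictionary implementation.
final class OptimisticDictionary
{
  private final StampedLock lock = new StampedLock();
  private final List<String> idToValue = new ArrayList<>();

  // Writers take the exclusive lock and return the id assigned to the value.
  int add(String value)
  {
    long stamp = lock.writeLock();
    try {
      idToValue.add(value);
      return idToValue.size() - 1;
    }
    finally {
      lock.unlockWrite(stamp);
    }
  }

  // Readers first try a non-blocking optimistic read.
  String getValue(int id)
  {
    long stamp = lock.tryOptimisticRead();
    if (stamp != 0) {
      try {
        String candidate = idToValue.get(id);
        if (lock.validate(stamp)) {
          return candidate; // no writer held the lock since the stamp was issued
        }
      }
      catch (RuntimeException e) {
        // A concurrent writer may have left the list mid-update; retry under the read lock.
      }
    }
    stamp = lock.readLock();
    try {
      return idToValue.get(id);
    }
    finally {
      lock.unlockRead(stamp);
    }
  }
}
```

The value read optimistically is only trusted after `validate` succeeds, which is why the candidate is held in a local variable before being returned.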
2023-02-01 02:59:12 -08:00
Clint Wylie ec1e6ac840
fix nested column handling of null and "null" (#13714)
* fix nested column handling of null and "null"
* fix issue merging nested column value dictionaries that could incorrectly lose dictionary values
2023-01-31 20:59:19 -08:00
Tijo Thomas 1beef30bb2
Support postaggregation function as in Math.pow() (#13703) (#13704)
Support postaggregation function as in Math.pow()
2023-01-31 22:55:04 +05:30
somu-imply 17c0167248
Additional native query tests for unnest datasource (#13554)
Native tests for the unnest datasource.
2023-01-25 15:57:52 -08:00
imply-cheddar 706b8a0227
Adjust Operators to be Pausable (#13694)
* Adjust Operators to be Pausable

This enables "merge" style operations that
combine multiple streams.

This change includes a naive implementation
of one such merge operator just to provide
concrete evidence that the refactoring is
effective.
2023-01-23 20:52:06 -08:00
somu-imply 90d445536d
SQL version of unnest native druid function (#13576)
* adds the SQL component of the native unnest functionality in Druid to unnest SQL queries on a table dimension, virtual column or a constant array and convert them into native Druid queries
* unnest in SQL is implemented as a combination of Correlate (the comma join part) and Uncollect (the unnest part)
2023-01-23 12:53:31 -08:00
Rohan Garg f76acccff2
Allow using composed storage for SuperSorter intermediate data (#13368) 2023-01-24 01:02:03 +05:30
Clint Wylie fb26a1093d
discover nested columns when using nested column indexer for schemaless ingestion (#13672)
* discover nested columns when using nested column indexer for schemaless
* move useNestedColumnIndexerForSchemaDiscovery from AppendableIndexSpec to DimensionsSpec
2023-01-18 12:57:28 -08:00
imply-cheddar 566fc990e4
Semantic Implementations for ArrayListRAC (#13652)
* Semantic Implementations for ArrayListRAC

This adds implementations of semantic interfaces
to optimize (eliminate object creation) the
window processing on top of an ArrayListSegment.

Tests are also added to cover the interplay
between the semantic interfaces that are expected
for this use case
2023-01-13 19:42:34 -08:00
Gian Merlino 182c4fad29
Kinesis: More robust default fetch settings. (#13539)
* Kinesis: More robust default fetch settings.

1) Default recordsPerFetch and recordBufferSize based on available memory
   rather than using hardcoded numbers. For this, we need an estimate
   of record size. Use 10 KB for regular records and 1 MB for aggregated
   records. With 1 GB heaps, 2 processors per task, and nonaggregated
   records, recordBufferSize comes out to the same as the old
   default (10000), and recordsPerFetch comes out slightly lower (1250
   instead of 4000).

2) Default maxRecordsPerPoll based on whether records are aggregated
   or not (100 if not aggregated, 1 if aggregated). Prior default was 100.

3) Default fetchThreads based on processors divided by task count on
   Indexers, rather than overall processor count.

4) Additionally clean up the serialized JSON a bit by adding various
   JsonInclude annotations.

* Updates for tests.

* Additional important verify.
2023-01-13 11:03:54 +05:30
Clint Wylie b5b740bbbb
allow using nested column indexer for schema discovery (#13653)
* single typed "root" only nested columns now mimic "regular" columns of those types
* incremental index can now use nested column indexer instead of string indexer for discovered columns
2023-01-12 18:31:12 -08:00
Adarsh Sanjeev 0a486c3bcf
Update forbidden apis with fixed executor (#13633)
* Update forbidden apis with fixed executor
2023-01-12 15:34:36 +05:30
imply-cheddar f1821a7c18
Add Sort Operator for Window Functions (#13619)
* Addition of NaiveSortMaker and Default implementation

Add the NaiveSortMaker which makes a sorter
object and a default implementation of the
interface.

This also allows us to plan multiple different window 
definitions on the same query.
2023-01-06 00:27:18 -08:00
imply-cheddar a8ecc48ffe
Validate response headers and fix exception logging (#13609)
* Validate response headers and fix exception logging

A class of QueryException were throwing away their
causes making it really hard to determine what's
going wrong when something goes wrong in the SQL
planner specifically.  Fix that and adjust tests
 to do more validation of response headers as well.

We allow 404s and 307s to be returned even without 
authorization validated, but others get converted to 403
2023-01-05 14:15:15 -08:00
imply-cheddar 313d937236
Switch operators to a push-style API (#13600)
* Switch operators to a push-style API

This API generates nice stack-traces of processing
for Operators.
2022-12-22 22:01:55 -08:00
imply-cheddar 0efd0879a8
Unify the handling of HTTP between SQL and Native (#13564)
* Unify the handling of HTTP between SQL and Native

The SqlResource and QueryResource have been
using independent logic for things like error
handling and response context stuff.  This
became abundantly clear and painful during a
change I was making for Window Functions, so
I unified them into using the same code for
walking the response and serializing it.

Things are still not perfectly unified (it would
be the absolute best if the SqlResource just
took SQL, planned it and then delegated the
query run entirely to the QueryResource), but
this refactor doesn't take that fully on.

The new code leverages async query processing
from our jetty container, the different
interaction model with the Resource means that
a lot of tests had to be adjusted to align with
the async query model.  The semantics of the
tests remain the same with one exception: the
SqlResource used to not log requests that failed
authorization checks, now it does.
2022-12-19 00:25:33 -08:00
Clint Wylie d9e5245ff0
allow string dimension indexer to handle byte[] as base64 strings (#13573)
This PR expands `StringDimensionIndexer` to handle conversion of `byte[]` to base64 encoded strings, rather than the current behavior of calling java `toString`. 
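
The difference between the two behaviors is easy to see with plain JDK calls. A minimal sketch, assuming a hypothetical `BytesToDimensionValue.coerce` helper (not the actual `StringDimensionIndexer` code):

```java
import java.util.Base64;

// Hypothetical illustration only; not the StringDimensionIndexer implementation.
final class BytesToDimensionValue
{
  // Coerce a raw row value to the string stored in a string dimension.
  static String coerce(Object raw)
  {
    if (raw == null) {
      return null;
    }
    if (raw instanceof byte[]) {
      // Base64 keeps the content; byte[].toString() would only give an identity string like "[B@1b6d3586".
      return Base64.getEncoder().encodeToString((byte[]) raw);
    }
    return String.valueOf(raw);
  }
}
```

For the UTF-8 bytes of "hello", `coerce` returns "aGVsbG8=", whereas calling `toString()` on the same array yields a string starting with "[B@" that says nothing about the contents.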

This issue was uncovered by a regression of sorts introduced by #13519, which updated the protobuf extension to directly convert stuff to java types, resulting in `bytes` typed values being converted as `byte[]` instead of a base64 string which the previous JSON based conversion created. While outputting `byte[]` is more consistent with other input formats, and preferable when the bytes can be consumed directly (such as complex types serde), when fed to a `StringDimensionIndexer`, it resulted in an ugly java `toString` because `processRowValsToUnsortedEncodedKeyComponent` is fed the output of `row.getRaw(..)`. Converting `byte[]` to a base64 string within `StringDimensionIndexer` is consistent with the behavior of calling `row.getDimension(..)` which does do this coercion (and why many tests on binary types appeared to be doing the expected thing).

I added some protobuf `bytes` tests, but they don't really hit the new `StringDimensionIndexer` behavior because they operate on the `InputRow` directly, and call `getDimension` to validate stuff. The parser based version still uses the old conversion mechanisms, so when not using a flattener incorrectly calls `toString` on the `ByteString`. I have encoded this behavior in the test for now, if we either update the parser to use the new flattener or just .. remove parsers we can remove this test stuff.
2022-12-16 14:50:17 +05:30
Clint Wylie 9ae7a36ccd
improve nested column storage format for broader compatibility (#13568)
* bump nested column format version
changes:
* nested field files are now named by their position in field paths list, rather than directly by the path itself. this fixes issues with valid json properties with commas and newlines breaking the csv file meta.smoosh
* update StructuredDataProcessor to deal in NestedPathPart to be consistent with other abstract path handling rather than building JQ syntax strings directly
* add v3 format segment and test
2022-12-15 15:39:26 -08:00
Clint Wylie 49cbfdff83
Fix cool nested column bug caused by not properly validating that global id is present in global dictionary before looking up local id (#13561)
This commit fixes a bug with nested column "value set" indexes caused by not properly
validating that the globalId looked up for value is present in the global dictionary prior to
looking it up in the local dictionary, which when "adjusting" the global ids for value type
can cause incorrect selection of value indexes.

As an example, take a variant typed nested column with 3 values `["1", null, -2]`.
The string dictionary is `[null, "1"]`, the long dictionary is `[-2]` and our local dictionary is `[0, 1, 2]`.

The code for variant typed indexes checks if the value is present in all global dictionaries
and returns indexes for all matches. So in this case, we first lookup "1" in the string dictionary,
find it at global id 1, all is good. Now, we check the long dictionary for `1`, which due to 
`-(insertionPoint + 1)` gives us `-(1 + 1) = -2`. Since the global id space is actually stacked
dictionaries, global ids for long and double values must be "adjusted" by the size of string
dictionary, and size of string + size of long for doubles.

Prior to this patch we were not checking that the globalId is 0 or larger, we then immediately
looked up the `localDictionary.indexOf(-2 + adjustLong) = localDictionary.indexOf(-2 + 2) = localDictionary.indexOf(0)` ... which is an actual value contained in the dictionary! The fix is
to skip the longs completely since there were no global matches.

On to doubles, `-(insertionPoint + 1)` gives us `-(0 + 1) = -1`. The double adjust value is '3'
since 2 strings and 1 long, so `localDictionary.indexOf(-1 + 3)` = `localDictionary.indexOf(2)` 
which is also a real value in our local dictionary that is definitely not '1'.

So in this one case, looking for '1' incorrectly ended up matching every row.
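
The failure mode can be reproduced with plain `java.util.Arrays.binarySearch`, which also returns `-(insertionPoint + 1)` for a missing key. This is a rough sketch under that assumption; the class, constants, and method names are hypothetical and are not the actual nested column index code.

```java
import java.util.Arrays;

class GlobalIdLookupSketch {
    // Dictionaries from the example; the string dictionary [null, "1"] has size 2.
    static final long[] LONG_DICTIONARY = {-2L};
    static final double[] DOUBLE_DICTIONARY = {};
    static final int[] LOCAL_DICTIONARY = {0, 1, 2};
    static final int ADJUST_LONG = 2;   // size of the string dictionary
    static final int ADJUST_DOUBLE = 3; // size of the string + long dictionaries

    // Pre-patch behavior: adjust the search result without checking its sign.
    static int buggyLocalId(int searchResult, int adjust) {
        return Arrays.binarySearch(LOCAL_DICTIONARY, searchResult + adjust);
    }

    // Patched behavior: a negative search result means "not in the global dictionary".
    static int fixedLocalId(int searchResult, int adjust) {
        if (searchResult < 0) {
            return -1;
        }
        return Arrays.binarySearch(LOCAL_DICTIONARY, searchResult + adjust);
    }

    public static void main(String[] args) {
        int longResult = Arrays.binarySearch(LONG_DICTIONARY, 1L);      // -(1 + 1) = -2
        int doubleResult = Arrays.binarySearch(DOUBLE_DICTIONARY, 1.0); // -(0 + 1) = -1
        System.out.println(buggyLocalId(longResult, ADJUST_LONG));     // 0: false match
        System.out.println(buggyLocalId(doubleResult, ADJUST_DOUBLE)); // 2: false match
        System.out.println(fixedLocalId(longResult, ADJUST_LONG));     // -1: no match
        System.out.println(fixedLocalId(doubleResult, ADJUST_DOUBLE)); // -1: no match
    }
}
```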
2022-12-15 17:00:46 +05:30
imply-cheddar 089d8da561
Support Framing for Window Aggregations (#13514)
* Support Framing for Window Aggregations

This adds support for framing over ROWS
for window aggregations.

Not yet implemented:
1. RANGE frames
2. Multiple different frames in the same query
3. Frames on last/first functions
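
For readers unfamiliar with ROWS framing, the sketch below shows what a frame such as `ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING` means for a sum aggregation over one already-sorted partition. The class and method are hypothetical and use a naive nested loop; this is not how the Druid operators are implemented.

```java
class RowsFrameSketch {
    // For each row i, sum the values of rows [i - preceding, i + following],
    // clamped to the bounds of the partition.
    static long[] sumOverRows(long[] values, int preceding, int following) {
        long[] out = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            int start = Math.max(0, i - preceding);
            int end = Math.min(values.length - 1, i + following);
            long sum = 0;
            for (int j = start; j <= end; j++) {
                sum += values[j];
            }
            out[i] = sum;
        }
        return out;
    }

    public static void main(String[] args) {
        long[] result = sumOverRows(new long[]{1, 2, 3, 4, 5}, 1, 1);
        System.out.println(java.util.Arrays.toString(result)); // [3, 6, 9, 12, 9]
    }
}
```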
2022-12-14 18:04:39 -08:00
Kashif Faraz 58a3acc2c4
Add InputStats to track bytes processed by a task (#13520)
This commit adds a new class `InputStats` to track the total bytes processed by a task.
The field `processedBytes` is published in task reports along with other row stats.

Major changes:
- Add class `InputStats` to track processed bytes
- Add method `InputSourceReader.read(InputStats)` to read input rows while counting bytes.
> Since we need to count the bytes, we could not just have a wrapper around `InputSourceReader` or `InputEntityReader` (the way `CountableInputSourceReader` does) because the `InputSourceReader` only deals with `InputRow`s and the byte information is already lost.
- Classic batch: Use the new `InputSourceReader.read(inputStats)` in `AbstractBatchIndexTask`
- Streaming: Increment `processedBytes` in `StreamChunkParser`. This does not use the new `InputSourceReader.read(inputStats)` method.
- Extend `InputStats` with `RowIngestionMeters` so that bytes can be exposed in task reports

Other changes:
- Update tests to verify the value of `processedBytes`
- Rename `MutableRowIngestionMeters` to `SimpleRowIngestionMeters` and remove duplicate class
- Replace `CacheTestSegmentCacheManager` with `NoopSegmentCacheManager`
- Refactor `KafkaIndexTaskTest` and `KinesisIndexTaskTest`
2022-12-13 18:54:42 +05:30
somu-imply 7682b0b6b1
Analysis refactor (#13501)
Refactor DataSource to have a getAnalysis() method

This removes various parts of the code where while loops and instanceof
checks were being used to walk through the structure of DataSource objects
in order to build a DataSourceAnalysis.  Instead we just ask the DataSource
for its analysis and allow the stack to rebuild whatever structure existed.
2022-12-12 17:35:44 -08:00
Clint Wylie 37d8833125
fix bug with broker parallel merge metrics emitting, add wall time, fast/slow partition time metrics (#13420) 2022-12-06 17:50:59 -08:00
imply-cheddar 83261f9641
Starting on Window Functions (#13458)
* Processors for Window Processing

This is an initial take on how to use Processors
for Window Processing.  A Processor is an interface
that transforms RowsAndColumns objects.
RowsAndColumns objects are essentially combinations
of rows and columns.

The intention is that these Processors are the start
of a set of operators that more closely resemble what
DB engineers would be accustomed to seeing.

* Wire up windowed processors with a query type that
can run them end-to-end.  This code can be used to
actually run a query, so yay!

* Wire up windowed processors with a query type that
can run them end-to-end.  This code can be used to
actually run a query, so yay!

* Some SQL tests for window functions. Added wikipedia 
data to the indexes available to the
SQL queries and tests validating the windowing
functionality as it exists now.

Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
2022-12-06 15:54:05 -08:00
somu-imply 9177419628
Unnest functionality for Druid (#13268)
* Moving all unnest cursor code atop refactored code for unnest

* Updating unnest cursor

* Removing dedup and fixing up some null checks

* AllowList changes

* Fixing some NPEs

* Using bitset for allowlist

* Updating the initialization only when cursor is in non-done state

* Updating code to skip rows not in allow list

* Adding a flag for cases when first element is not in allowed list

* Updating for a null in allowList

* Splitting unnest cursor into 2 subclasses

* Intercepting some apis with columnName for new unnested column

* Adding test cases and renaming some stuff

* checkstyle fixes

* Moving to an interface for Unnest

* handling null rows in a dimension

* Updating cursors after comments part-1

* Addressing comments and adding some more tests

* Reverting a change to ScanQueryRunner and improving a comment

* removing an unused function

* Updating cursors after comments part 2

* One last fix for review comments

* Making some functions private, deleting some comments, adding a test for unnest of unnest with allowList

* Adding an exception for a case

* Closure for unnest data source

* Adding some javadocs

* One minor change in makeDimSelector of columnarCursor

* Updating an error message

* Update processing/src/main/java/org/apache/druid/segment/DimensionUnnestCursor.java

Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>

* Unnesting on virtual columns was missing an object array, adding that to support virtual columns unnesting

* Updating exceptions to use UOE

* Renamed files, added column capability test on adapter, return statement and made unnest datasource not cacheable for the time being

* Handling for null values in dim selector

* Fixing a NPE for null row

* Updating capabilities

* Updating capabilities

Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
2022-12-02 18:48:25 -08:00
Paul Rogers b76ff16d00
SQL test framework extensions (#13426)
SQL test framework extensions

* Capture planner artifacts: logical plan, etc.
* Planner test builder validates the logical plan
* Validation for the SQL result schema (we already have
  validation for the Druid row signature)
* Better Guice integration: properties, reuse Guice modules
* Avoid need for hand-coded expr, macro tables
* Retire some of the test-specific query component creation
* Fix query log hook race condition
2022-12-02 09:11:59 -08:00
Laksh Singla 4ed6255bdf
Convert errors based on implicit type conversion in multi value arrays to parse exception in MSQ (#13366)
* initial commit

* fix test

* push the json changes

* reduce the area of the try..catch

* Trigger Build

* review
2022-11-29 17:19:57 +05:30
Karan Kumar edd076ca69
Remove duplicate FrameRowTooLargeException.java (#13441)
* Removing duplicate FrameRowTooLargeException.java

* Fixing intellij inspection
2022-11-29 08:46:59 +05:30
Kashif Faraz 656b6cdf62
Add MetricsVerifier to simplify verification of metric values in tests (#13442) 2022-11-28 19:32:37 +05:30
Tejaswini Bandlamudi b091b32f21
Fixes reindexing bug with filter on long column (#13386)
* fixes BlockLayoutColumnarLongs close method to nullify internal buffer.

* fixes other BlockLayoutColumnar supplier close methods to nullify internal buffers.

* fix spotbugs
2022-11-25 19:22:48 +05:30
Clint Wylie f524c68f08
Add mechanism for 'safe' memory reads for complex types (#13361)
* we can read where we want to
we can leave your bounds behind
'cause if the memory is not there
we really don't care
and we'll crash this process of mine
2022-11-23 00:25:22 -08:00
Clint Wylie be4914dcd9
fix off by one error in nested column range index (#13405) 2022-11-22 12:46:06 -08:00
Kashif Faraz 7cf761cee4
Prepare master branch for next release, 26.0.0 (#13401)
* Prepare master branch for next release, 26.0.0

* Use docker image for druid 24.0.1

* Fix version in druid-it-cases pom.xml
2022-11-22 15:31:01 +05:30
Adarsh Sanjeev 280a0f7158
Add sequential sketch merging to MSQ (#13205)
* Add sketch fetching framework

* Refactor code to support sequential merge

* Update worker sketch fetcher

* Refactor sketch fetcher

* Refactor sketch fetcher

* Add context parameter and threshold to trigger sequential merge

* Fix test

* Add integration test for non sequential merge

* Address review comments

* Address review comments

* Address review comments

* Resolve maxRetainedBytes

* Add new classes

* Renamed key statistics information class

* Rename fetchStatisticsSnapshotForTimeChunk function

* Address review comments

* Address review comments

* Update documentation and add comments

* Resolve build issues

* Resolve build issues

* Change worker APIs to async

* Address review comments

* Resolve build issues

* Add null time check

* Update integration tests

* Address review comments

* Add log messages and comments

* Resolve build issues

* Add unit tests

* Add unit tests

* Fix timing issue in tests
2022-11-22 09:56:32 +05:30
Rohan Garg 5b625cea96
Improve performance for ReadableInputStreamFrameChannel (#13373)
* Improve performance for ReadableInputStreamFrameChannel

* Fix race condition leading to unnecessary sleep
2022-11-18 18:26:08 +05:30
Clint Wylie 7f4e386509
add missing vector object selector for multi-value string columns, refactor some stuff (#13379)
* add vector object selector for multi-value string columns, refactor some stuff

* use for nested columns too

* add test

* inspections
2022-11-17 21:08:54 -08:00
imply-cheddar 6b9344cd39
Persist legacy LatestPairs for now (#13378)
We added compression to the latest/first pair storage, but
the code change was forcing new things to be persisted
with the new format, meaning that any segment created with
the new code cannot be read by the old code.  Instead, we
need to default to creating the old format and then remove that default in a future version.
2022-11-17 21:37:02 +05:30
Gian Merlino 78d0b0abce
Add string comparison methods to StringUtils, fix dictionary comparisons. (#13364)
* Add string comparison methods to StringUtils, fix dictionary comparisons.

There are various places in Druid code where we assume that String.compareTo
is consistent with Unicode code-point ordering. Sadly this is not the case.

To help deal with this, this patch introduces the following helpers:

1) compareUnicode: Compares two Strings in Unicode code-point order.
2) compareUtf8: Compares two UTF-8 byte arrays in Unicode code-point order.
   Equivalent to comparison as unsigned bytes.
3) compareUtf8UsingJavaStringOrdering: Compares two UTF-8 byte arrays, or
   ByteBuffers, in a manner consistent with String.compareTo.

There is no helper for comparing two Strings in a manner consistent
with String.compareTo, because for that we can use compareTo directly.

The patch also fixes an inconsistency between the String and UTF-8
dictionary GenericIndexed flavors of string-typed columns: they were
formerly using incompatible comparators.

* Adjust test.

* FrontCodedIndexed updates.

* Add test.

* Fix comments.
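
The mismatch this commit describes shows up as soon as a supplementary character (encoded as a surrogate pair in UTF-16) is compared against a high BMP character. The sketch below is a minimal illustration with hypothetical method names; it is not the `StringUtils` implementation from the patch.

```java
import java.nio.charset.StandardCharsets;

class UnicodeCompareSketch {
    // Compare two Strings by Unicode code point rather than by UTF-16 code unit.
    static int compareCodePoints(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int cpA = a.codePointAt(i);
            int cpB = b.codePointAt(j);
            if (cpA != cpB) {
                return Integer.compare(cpA, cpB);
            }
            i += Character.charCount(cpA);
            j += Character.charCount(cpB);
        }
        // Whichever string still has characters left sorts last.
        return Integer.compare(a.length() - i, b.length() - j);
    }

    // Compare two UTF-8 byte arrays as unsigned bytes (code-point order for valid UTF-8).
    static int compareUnsignedBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int k = 0; k < n; k++) {
            int cmp = Integer.compare(a[k] & 0xFF, b[k] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        String bmp = "\uFB01";         // U+FB01, a single UTF-16 code unit
        String supp = "\uD83D\uDE00";  // U+1F600, a surrogate pair
        System.out.println(bmp.compareTo(supp) > 0);           // true: UTF-16 order
        System.out.println(compareCodePoints(bmp, supp) < 0);  // true: code-point order
        System.out.println(compareUnsignedBytes(
            bmp.getBytes(StandardCharsets.UTF_8),
            supp.getBytes(StandardCharsets.UTF_8)) < 0);       // true: agrees with code points
    }
}
```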
2022-11-16 07:15:00 -08:00
Clint Wylie 1231ce3b75
dump-segment tool support for examining nested columns (#13356)
* add nested mode to dump segment tool to dump nested columns

* docs

* more test

* fix it
2022-11-14 16:08:47 -08:00
Adarsh Sanjeev a3edda3b63
Modify quantile sketches to add byte[] directly (#13351)
* Modify quantile sketches to add byte[] directly

* Rename class and add test
2022-11-14 00:24:06 +05:30
Clint Wylie 27215d1ff1
fix complex_decode_base64 function, add SQL bindings (#13332)
* fix complex_decode_base64 function, add SQL bindings

* more permissive
2022-11-09 23:40:25 -08:00
Clint Wylie 3e2bb4cf10
fix front-coded bucket size handling, better validation (#13335)
* fix front-coded bucket size handling, better validation

* Update FrontCodedIndexedTest.java
2022-11-09 13:33:01 -08:00
AmatyaAvadhanula a2013e6566
Enhance streaming ingestion metrics (#13331)
Changes:
- Add a metric for partition-wise kafka/kinesis lag for streaming ingestion.
- Emit lag metrics for streaming ingestion when supervisor is not suspended and state is in {RUNNING, IDLE, UNHEALTHY_TASKS, UNHEALTHY_SUPERVISOR}
- Document metrics
2022-11-09 23:44:15 +05:30
Paul Rogers 7e600d2c63
Enhancements to the Calcite test framework (#13283)
* Enhancements to the Calcite test framework
* Standardize "Unauthorized" messages
* Additional test framework extension points
* Resolved joinable factory dependency issue
2022-11-08 14:28:49 -08:00
Adarsh Sanjeev a28b8c2674
Improve rowkey object size estimate (#13319)
* Improve rowkey object size estimate

* Address review comments

* Update comment

* Fix test
2022-11-08 10:12:07 +05:30
Rohan Garg a9b39fc29d
Try converting all inner joins to filters (#13201) 2022-11-07 23:19:18 +05:30
Gian Merlino 227b57dd8e
Compaction: Fetch segments one at a time on main task; skip when possible. (#13280)
* Compaction: Fetch segments one at a time on main task; skip when possible.

Compact tasks include the ability to fetch existing segments and determine
reasonable defaults for granularitySpec, dimensionsSpec, and metricsSpec.
This is a useful feature that makes compact tasks work well even when the
user running the compaction does not have a clear idea of what they want
the compacted segments to be like.

However, this comes at a cost: it takes time, and disk space, to do all
of these fetches. This patch improves the situation in two ways:

1) When segments do need to be fetched, download them one at a time and
   delete them when we're done. This still takes time, but minimizes the
   required disk space.

2) Don't fetch segments on the main compact task when they aren't needed.
   If the user provides a full granularitySpec, dimensionsSpec, and
   metricsSpec, we can skip it.

* Adjustments.

* Changes from code review.

* Fix logic for determining rollup.
2022-11-07 14:50:14 +05:30
Clint Wylie d8329195f7
fix bug when front-coded index has only the null value (#13309) 2022-11-04 05:26:33 -07:00
Gian Merlino d1877e41ec
Use lookup memory footprint in MSQ memory computations. (#13271)
* Use lookup memory footprint in MSQ memory computations.

Two main changes:

1) Add estimateHeapFootprint to LookupExtractor.

2) Use this in MSQ's IndexerWorkerContext when determining the total
   amount of available memory. It's taken off the top.

This prevents MSQ tasks from running out of memory when there are lookups
defined in the cluster.

* Updates from code review.
2022-11-03 07:36:54 -07:00
Clint Wylie 018f984781
fix nested column range index range computation (#13297)
* fix nested column range index range computation

* simplify, add missing bounds check for FixedIndexed
2022-11-02 21:37:41 -07:00
Gian Merlino d851985cf5
MSQ: Add support for indexSpec. (#13275) 2022-10-28 14:27:50 -07:00
Clint Wylie acb9cb0227
fix thread safety issue with nested column global dictionaries (#13265)
* fix thread safety issue with nested column global dictionaries

* missing float

* clarify javadocs thread safety
2022-10-27 17:58:24 -07:00
somu-imply affc522b9f
Refactoring the data source before unnest (#13085)
* First set of changes for framework

* Second set of changes to move segment map function to data source

* Minor change to server manager

* Removing the createSegmentMapFunction from JoinableFactoryWrapper and moving to JoinDataSource

* Checkstyle fixes

* Patching Eric's fix for injection

* Checkstyle and fixing some CI issues

* Fixing code inspections and some failed tests and one injector for test in avatica

* Another set of changes for CI...almost there

* Equals and hashcode part update

* Fixing injector from Eric + refactoring for broadcastJoinHelper

* Updating second injector. Might revert later if better way found

* Fixing guice issue in JoinableFactory

* Addressing review comments part 1

* Temp changes refactoring

* Revert "Temp changes refactoring"

This reverts commit 9da42a9ef0.

* temp

* Temp discussions

* Refactoring temp

* Refactoring the query rewrite to refer to a datasource

* Refactoring getCacheKey by moving it inside data source

* Nullable annotation check in injector

* Addressing some comments, removing 2 analysis.isJoin() checks and correcting the benchmark files

* Minor changes for refactoring

* Addressing reviews part 1

* Refactoring part 2 with new test cases for broadcast join

* Set for nullables

* removing instance of checks

* Storing nullables in guice to avoid checking on reruns

* Fixing a test case and removing an irrelevant line

* Addressing the atomic reference review comments
2022-10-26 15:58:58 -07:00
Clint Wylie 77e4246598
add support for 'front coded' string dictionaries for smaller string columns (#12277)
* add FrontCodedIndexed for delta string encoding

* now for actual segments

* fix indexOf

* fixes and thread safety

* add bucket size 4, which seems generally better

* fixes

* fixes maybe

* update indexes to latest interfaces

* utf8 support

* adjust

* oops

* oops

* refactor, better, faster

* more test

* fixes

* revert

* adjustments

* fix prefixing

* more chill

* sql nested benchmark too

* refactor

* more comments and javadocs

* better get

* remove base class

* fix

* hot rod

* adjust comments

* faster still

* minor adjustments

* spatial index support

* spotbugs

* add isSorted to Indexed to strengthen indexOf contract if set, improve javadocs, add docs

* fix docs

* push into constructor

* use base buffer instead of copy

* oops
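
As background, front coding stores sorted strings in small buckets and replaces shared prefixes with a length. The sketch below assumes each value in a bucket is stored as a prefix length relative to the bucket's first value plus the remaining suffix; it works on Java chars rather than UTF-8 bytes, and all names are hypothetical. The commit message does not describe the actual `FrontCodedIndexed` layout, so treat this purely as an illustration of the general technique.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FrontCodingSketch {
    static final class Entry {
        final int prefixLength; // chars shared with the first value of the bucket
        final String suffix;    // the rest of the value

        Entry(int prefixLength, String suffix) {
            this.prefixLength = prefixLength;
            this.suffix = suffix;
        }
    }

    static int commonPrefixLength(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return i;
    }

    // Encode a sorted list; the first value of every bucket is stored whole.
    static List<Entry> encode(List<String> sorted, int bucketSize) {
        List<Entry> out = new ArrayList<>();
        String first = null;
        for (int i = 0; i < sorted.size(); i++) {
            String value = sorted.get(i);
            if (i % bucketSize == 0) {
                first = value;
                out.add(new Entry(0, value));
            } else {
                int prefix = commonPrefixLength(first, value);
                out.add(new Entry(prefix, value.substring(prefix)));
            }
        }
        return out;
    }

    // Rebuild the value at an index from its bucket's first value.
    static String get(List<Entry> encoded, int index, int bucketSize) {
        Entry entry = encoded.get(index);
        if (index % bucketSize == 0) {
            return entry.suffix;
        }
        String first = encoded.get(index - index % bucketSize).suffix;
        return first.substring(0, entry.prefixLength) + entry.suffix;
    }

    public static void main(String[] args) {
        List<String> sorted = Arrays.asList("druid", "druidry", "drum", "drummer", "dry");
        List<Entry> encoded = encode(sorted, 4);
        System.out.println(encoded.get(1).prefixLength + " + " + encoded.get(1).suffix); // 5 + ry
        System.out.println(get(encoded, 3, 4)); // drummer
    }
}
```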
2022-10-25 18:05:38 -07:00
Gian Merlino 6aca61763e
SQL: Use timestamp_floor when granularity is not safe. (#13206)
* SQL: Use timestamp_floor when granularity is not safe.

PR #12944 added a check at the execution layer to avoid materializing
excessive amounts of time-granular buckets. This patch modifies the SQL
planner to avoid generating queries that would throw such errors, by
switching certain plans to use the timestamp_floor function instead of
granularities. This applies both to the Timeseries query type, and the
GroupBy timestampResultFieldGranularity feature.

The patch also goes one step further: we switch to timestamp_floor
not just in the ETERNITY + non-ALL case, but also if the estimated
number of time-granular buckets exceeds 100,000.

Finally, the patch modifies the timestampResultFieldGranularity
field to consistently be a String rather than a Granularity. This
ensures that it can be round-trip serialized and deserialized, which is
useful when trying to execute the results of "EXPLAIN PLAN FOR" with
GroupBy queries that use the timestampResultFieldGranularity feature.

* Fix test, address PR comments.

* Fix ControllerImpl.

* Fix test.

* Fix unused import.
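
The bucket-count threshold mentioned in this commit can be pictured with a small estimate. This is a sketch only: it assumes bounded intervals and fixed-length granularities, and the class, constant, and method names are hypothetical rather than taken from the SQL planner.

```java
import java.time.Duration;
import java.time.Instant;

class BucketEstimateSketch {
    // Threshold quoted in the commit message.
    static final long MAX_TIME_GRAINS = 100_000L;

    // Estimate how many granular buckets an interval would materialize (rounded up).
    static long estimateBuckets(Instant start, Instant end, Duration bucket) {
        long spanMillis = Duration.between(start, end).toMillis();
        long bucketMillis = bucket.toMillis();
        return (spanMillis + bucketMillis - 1) / bucketMillis;
    }

    // Decide whether a plan should group by a timestamp_floor expression
    // instead of relying on query granularity.
    static boolean shouldUseTimestampFloor(Instant start, Instant end, Duration bucket) {
        return estimateBuckets(start, end, bucket) > MAX_TIME_GRAINS;
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2022-01-01T00:00:00Z");
        Instant end = Instant.parse("2023-01-01T00:00:00Z");
        System.out.println(estimateBuckets(start, end, Duration.ofHours(1)));           // 8760
        System.out.println(shouldUseTimestampFloor(start, end, Duration.ofHours(1)));   // false
        System.out.println(shouldUseTimestampFloor(start, end, Duration.ofSeconds(1))); // true
    }
}
```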
2022-10-17 08:22:45 -07:00
Paul Rogers f4dcc52dac
Redesign QueryContext class (#13071)
We introduce two new configuration keys that refine the query context security model controlled by druid.auth.authorizeQueryContextParams. When that value is set to true, two other configuration options become available:

* druid.auth.unsecuredContextKeys: The set of query context keys that do not require a security check. Use this for the "white-list" of keys to allow. All other keys go through the existing context key security checks.
* druid.auth.securedContextKeys: The set of query context keys that do require a security check. Use this when you want to allow all but a specific set of keys: only these keys go through the existing context key security checks.

Both are set using JSON list format:

druid.auth.securedContextKeys=["secretKey1", "secretKey2"]

You generally set one value or the other. If both are set, unsecuredContextKeys acts as exceptions to securedContextKeys.

In addition, Druid defines two query context keys which always bypass checks because Druid uses them internally:

* sqlQueryId
* sqlStringifyArrays
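
One way to read the rules in this message is as a small decision function. The sketch below is an interpretation of the description, not the Druid authorization code: the class and method are hypothetical, and `null` is used here to mean "this option is not configured".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ContextKeyPolicySketch {
    // Keys Druid uses internally; per the description they always bypass checks.
    static final Set<String> ALWAYS_ALLOWED =
        new HashSet<>(Arrays.asList("sqlQueryId", "sqlStringifyArrays"));

    // Returns true if the context key should go through the security check.
    static boolean requiresCheck(String key, Set<String> unsecuredKeys, Set<String> securedKeys) {
        if (ALWAYS_ALLOWED.contains(key)) {
            return false;
        }
        // Unsecured keys act as exceptions, even when a secured list is also set.
        if (unsecuredKeys != null && unsecuredKeys.contains(key)) {
            return false;
        }
        // With a secured list, only the listed keys are checked.
        if (securedKeys != null) {
            return securedKeys.contains(key);
        }
        // Otherwise every remaining key is checked.
        return true;
    }

    public static void main(String[] args) {
        Set<String> secured = new HashSet<>(Arrays.asList("secretKey1", "secretKey2"));
        System.out.println(requiresCheck("secretKey1", null, secured)); // true
        System.out.println(requiresCheck("timeout", null, secured));    // false
        System.out.println(requiresCheck("sqlQueryId", null, null));    // false
    }
}
```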
2022-10-15 11:02:11 +05:30
Rohan Garg 45dfd679e9
Composite approach for checking in-filter values set in column dictionary (#13133) 2022-10-13 12:32:48 +05:30
Kashif Faraz 346fbf133f
Make DimensionDictionary abstract (#13215)
This is in preparation for eventually retiring the flag `useMaxMemoryEstimates`, 
after which the footprint of a value in the dimension dictionary will always be 
estimated using the `estimateSizeOfValue()` method.
2022-10-13 07:18:46 +05:30
Abhishek Agarwal 548d0d0bb2
Add more information to exceptions that occur while writing temporary data (#13217)
* Add more information to exceptions when writing tmp data to disk

* Better error message
2022-10-13 08:23:51 +08:00
Clint Wylie 6eff6c9ae4
fix json_value sql planning with decimal type, fix vectorized expression math null value handling in default mode (#13214)
* fix json_value sql planning with decimal type, fix vectorized expression math null value handling in default mode
changes:
* json_value 'returning' decimal will now plan to native double typed query instead of ending up with default string typing, allowing decimal vector math expressions to work with this type
* vector math expressions now zero out 'null' values even in 'default' mode (druid.generic.useDefaultValueForNull=true) to prevent downstream things that do not check the null vector from producing incorrect results

* more better

* test and why not vectorize

* more test, more fix
2022-10-12 16:28:41 -07:00
Clint Wylie 59e2afc566
use object[] instead of string[] for vector expressions to be consistent with vector object selectors (#13209)
* use object[] instead of string[] for vector expressions to be consistent with vector object selectors

* simplify
2022-10-12 02:53:43 -07:00
Clint Wylie 9688674ea8
fix issue with nested column null value index incorrectly matching non-null values (#13211) 2022-10-11 15:54:36 -07:00
Adarsh Sanjeev 92d2633ae6
Update ClusterByStatisticsCollectorImpl to use bytes instead of keys (#12998)
* Update clusterByStatistics to use bytes instead of keys

* Address review comments

* Resolve checkstyle

* Increase test coverage

* Update test

* Update thresholds

* Update retained keys function

* Update docs

* Fix spelling
2022-10-03 12:08:23 +05:30
Clint Wylie a0e0fbe1b3
nested column serializer performance improvement for sparse columns (#13101) 2022-09-19 14:07:48 +05:30
Clint Wylie 5ece870634
split up NestedDataColumnSerializer into separate files (#13096)
* split up NestedDataColumnSerializer into separate files

* fix it
2022-09-16 01:28:47 -07:00
Frank Chen fd6c05eee8
Avoid ClassCastException when getting values from `QueryContext` (#13022)
* Use safe conversion methods

* Rename method

* Add getContextAsBoolean

* Update test case

* Remove generic from getContextValue

* Update catch-handler

* Add test

* Resolve comments

* Replace 'getContextXXX' with 'getQueryContext().getAsXXXX'
2022-09-13 18:00:09 +08:00
imply-cheddar 5ba0075c0c
Expose HTTP Response headers from SqlResource (#13052)
* Expose HTTP Response headers from SqlResource

This change makes the SqlResource expose HTTP response
headers in the same way that the QueryResource exposes them.

Fundamentally, the change is to pipe the QueryResponse
object all the way through to the Resource so that it can
populate response headers.  There is also some code
cleanup around DI, as there was a superfluous FactoryFactory
class muddying things up.
2022-09-12 01:40:06 -07:00
Gian Merlino e29e7a8434
Add ARRAY_QUANTILE function. (#13061)
* Add ARRAY_QUANTILE function.

Expected usage is like: ARRAY_QUANTILE(ARRAY_AGG(x), 0.9).

* Fix test.
2022-09-09 11:29:20 -07:00
Clint Wylie 6438f4198d
improve nested column serializer (#13051)
changes:
* long and double value columns are now written directly, at the same time as writing out the 'intermediary' dictionaryid column with unsorted ids
* remove reverse value lookup from GlobalDictionaryIdLookup since it is no longer needed
2022-09-08 18:32:53 -07:00
Rohan Garg 2f156b3610
Disallow timeseries queries with ETERNITY interval and non-ALL granularity (#12944) 2022-09-07 16:45:08 +05:30
Rohan Garg 7aa8d7f987
Add query/time metric for SQL queries from router (#12867)
* Add query/time metric for SQL queries from router

* Fix query cancel bug when user has overridden native query-id in a SQL query
2022-09-07 13:54:46 +05:30
Clint Wylie a3a377e570
more consistent expression error messages (#12995)
* more consistent expression error messages

* review stuff

* add NamedFunction for Function, ApplyFunction, and ExprMacro to share common stuff

* fixes

* add expression transform name to transformer failure, better parse_json error messaging
2022-09-06 23:21:38 -07:00
sr ed26e2d634
Improve String Last/First Storage Efficiency (#12879)
- Add classes for writing cell values in LZ4 block compressed format.
  Payloads are indexed by element number for efficient random lookup
- Update SerializablePairLongStringComplexMetricSerde to use block
  compression
- SerializablePairLongStringComplexMetricSerde also uses delta encoding
  of the Long by doing 2-pass encoding: buffers first to find min/max
  numbers and delta-encodes as integers if possible

Entry points for doing block-compressed storage of byte[] payloads
are the CellWriter and CellReader classes. See
SerializablePairLongStringComplexMetricSerde for how these are used
along with how to do full column-based storage (delta encoding here)
which includes 2-pass encoding to compute a column header
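
The 2-pass delta encoding idea can be sketched in a few lines: a first pass finds the minimum and maximum, and a second pass stores each long as an int offset from the minimum when the range allows it. The class and method names are hypothetical, and this is not the serde's actual column format.

```java
class DeltaEncodeSketch {
    // Returns int deltas relative to the minimum value, or null when the
    // range does not fit in an int and the longs must be stored as-is.
    static int[] deltaEncode(long[] values) {
        if (values.length == 0) {
            return new int[0];
        }
        // Pass 1: find min and max.
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        long range = max - min; // a negative result means the subtraction overflowed
        if (range < 0 || range > Integer.MAX_VALUE) {
            return null;
        }
        // Pass 2: store each value as an offset from the minimum.
        int[] deltas = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            deltas[i] = (int) (values[i] - min);
        }
        return deltas;
    }

    public static void main(String[] args) {
        long[] timestamps = {1662500000000L, 1662500000500L, 1662500003000L};
        System.out.println(java.util.Arrays.toString(deltaEncode(timestamps))); // [0, 500, 3000]
    }
}
```

In a real column the minimum would be written to a header so a reader can reconstruct each value as `min + delta`.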
2022-09-06 20:00:54 -07:00
Gian Merlino 2450b96ac8
FrameFile: Java 17 compatibility. (#12987)
* FrameFile: Java 17 compatibility.

DataSketches Memory.map is not Java 17 compatible, and from discussions
with the team, is challenging to make compatible with 17 while also
retaining compatibility with 8 and 11. So, in this patch, we switch away
from Memory.map and instead use the builtin JDK mmap functionality. Since
it only supports maps up to Integer.MAX_VALUE, we also implement windowing
in FrameFile, such that we can still handle large files.

Other changes:

1) Add two new "map" functions to FileUtils, which we use in this patch.
2) Add a footer checksum to the FrameFile format. Individual frames
   already have checksums, but the footer was missing one.

* Changes for static analysis.

* wip

* Fixes.
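
The windowing idea can be sketched with the JDK's own `FileChannel.map`, which maps at most `Integer.MAX_VALUE` bytes per call. In the sketch below a tiny window size stands in for that limit, a fresh mapping is created per read for simplicity, and every name is hypothetical; the actual `FrameFile` and `FileUtils` code is not shown in this commit message.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class WindowedMapSketch {
    // Reads one byte at an absolute file position by mapping only the window that contains it.
    static byte readByte(Path file, long position, int windowSize) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            if (position < 0 || position >= channel.size()) {
                throw new IndexOutOfBoundsException("position " + position);
            }
            long windowStart = (position / windowSize) * windowSize;
            long windowLength = Math.min(windowSize, channel.size() - windowStart);
            MappedByteBuffer window =
                channel.map(FileChannel.MapMode.READ_ONLY, windowStart, windowLength);
            return window.get((int) (position - windowStart));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper for the demo: writes bytes to a temporary file.
    static Path writeTemp(byte[] bytes) {
        try {
            Path file = Files.createTempFile("windowed-map", ".bin");
            Files.write(file, bytes);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[10];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i * 3);
        }
        Path file = writeTemp(data);
        // Position 9 falls in the third 4-byte window, which starts at offset 8.
        System.out.println(readByte(file, 9, 4)); // 27
    }
}
```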
2022-08-30 11:13:47 -07:00
Gian Merlino 414176fb97
Fix accounting of bytesAdded in ReadableByteChunksFrameChannel. (#12988)
* Fix accounting of bytesAdded in ReadableByteChunksFrameChannel.

Could cause WorkerInputChannelFactory to get into an infinite loop when
reading the footer of a frame file.

* Additional tests.
2022-08-29 18:25:28 -07:00
Abhishek Agarwal 618757352b
Bump up the version to 25.0.0 (#12975)
* Bump up the version to 25.0.0

* Fix the version in console
2022-08-29 11:27:38 +05:30
Kashif Faraz 9843355ddd
Throw parse exception for multi-valued numeric dims (#12953)
During ingestion, if a row containing multiple values for a numeric dimension is encountered,
the whole ingestion task fails. Ideally, this should just be registered as a parse exception.

Changes:
- Remove `instanceof List` check from `LongDimensionIndexer`, `FloatDimensionIndexer` and `DoubleDimensionIndexer`.

Any invalid type, including list, throws a parse exception in `DimensionHandlerUtils.convertObjectToXXX`
methods. `ParseException` is already handled in `OnHeapIncrementalIndex` and does not fail the entire task.
2022-08-29 10:33:48 +05:30
Clint Wylie 16f5ac5bd5
json_value adjustments (#12968)
* json_value adjustments
changes:
* native json_value expression now has optional 3rd argument to specify type, which will cast all values to the specified type
* rework how JSON_VALUE is wired up in SQL. Now we are using a custom convertlet to translate JSON_VALUE(... RETURNING type) into dedicated JSON_VALUE_BIGINT, JSON_VALUE_DOUBLE, JSON_VALUE_VARCHAR, JSON_VALUE_ANY instead of using the calcite StandardConvertletTable that wraps JSON_VALUE_ANY in a CAST, so that we preserve the typing of JSON_VALUE to pass down to the native expression as the 3rd argument

* fix json_value_any to be usable by humans too, coverage

* fix bug

* checkstyle

* checkstyle

* review stuff

* validate that options to json_value are the supported options rather than ignore them

* remove more legacy undocumented functions
2022-08-27 07:15:47 -07:00
Alexander Saydakov 7e2371bbde
KLL sketch (#12498)
* KLL sketch

* added documentation

* direct static refs

* direct static refs

* fixed test

* addressed review points

* added KLL sketch related terms

* return a copy from get

* Copy unions when returning them from "get".

* Remove redundant "final".

Co-authored-by: AlexanderSaydakov <AlexanderSaydakov@users.noreply.github.com>
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
2022-08-26 21:19:24 -07:00
Clint Wylie 72aba00e09
add json function support for paths with negative array indexes (#12972) 2022-08-25 17:11:28 -07:00
Clint Wylie 82ad927087
tighten up array handling, fix bug with array_slice output type inference (#12914) 2022-08-25 00:48:49 -07:00
Adarsh Sanjeev 3b58a01c7c
Correct spelling in messages and variable names. (#12932) 2022-08-24 11:06:31 +05:30
Clint Wylie 289e43281e
stricter behavior for parse_json, add try_parse_json, remove to_json (#12920) 2022-08-22 18:41:07 -07:00
Clint Wylie 7fb1153bba
add virtual columns to search query cache key (#12907)
* add virtual columns to search query cache key
2022-08-17 20:26:01 -07:00
Gian Merlino d3015d0f8e
DruidQuery: Return a copy from withScanSignatureIfNeeded, as promised. (#12906)
The method wasn't following its contract, leading to pollution of the
overall planner context, when really we just want to create a new
context for a specific query.
2022-08-16 13:23:14 -07:00
Clint Wylie e42e025296
inject @Json ObjectMapper for to_json_string and parse_json expressions (#12900)
* inject @Json ObjectMapper for to_json_string and parse_json expressions

* fix npe

* better
2022-08-15 08:44:24 -07:00
Gian Merlino 846345669d
Error handling improvements for frame channels. (#12895)
* Error handling improvements for frame channels.

Two changes:

1) Send errors down in-memory channels (BlockingQueueFrameChannel) on
   failure. This ensures that in situations where a chain of processors
   has been set up on a single machine, all processors see the root
   cause error. In particular, this means the final processor in the
   chain reports the root cause error, which ensures that someone with
   a handle to the final processor will get the proper error.

2) Update FrameFileHttpResponseHandler to expect that the final fetch is
   not merely empty, but empty with a special header.
   This ensures that the handler is able to tell the difference between
   an empty fetch due to being at EOF, and an empty fetch due to a
   truncated HTTP response (after the 200 OK and headers are sent down,
   but before any content appears).

* Fix tests, imports.

* Checkstyle!
2022-08-15 11:31:55 +05:30
Rohan Garg b26ab678b9
Do not create filters on right side table columns while join to filter conversion (#12899) 2022-08-14 08:35:23 -07:00
Paul Rogers 41712b7a3a
Refactor SqlLifecycle into statement classes (#12845)
* Refactor SqlLifecycle into statement classes

Create direct & prepared statements
Remove redundant exceptions from tests
Tidy up Calcite query tests
Make PlannerConfig more testable

* Build fixes

* Added builder to SqlQueryPlus

* Moved Calcites system properties to saffron.properties

* Build fix

* Resolve merge conflict

* Fix IntelliJ inspection issue

* Revisions from reviews

Backed out a revision to Calcite tests that didn't work out as planned

* Build fix

* Fixed spelling errors

* Fixed failed test

Prepare now enforces security; before it did not.

* Rebase and fix IntelliJ inspections issue

* Clean up exception handling

* Fix handling of JDBC auth errors

* Build fix

* More tweaks to security messages
2022-08-14 00:44:08 -07:00
Clint Wylie f4e0909e92
fix bug with json_object expression not fully unwrapping inputs (#12893) 2022-08-13 21:15:19 -07:00
Rohan Garg 5394838030
Enable conversion of join to filter by default (#12868) 2022-08-13 20:37:43 +05:30
Rohan Garg af700bba0c
Fix hasBuiltInFilters for joins (#12894) 2022-08-13 16:26:24 +05:30
Lucas Capistrant 3a3271eddc
Introduce defaultOnDiskStorage config for Group By (#12833)
* Introduce defaultOnDiskStorage config for groupBy

* add debug log to groupby query config

* Apply config change suggestion from review

* Remove accidental new lines

* update default value of new default disk storage config

* update debug log to have more descriptive text

* Make maxOnDiskStorage and defaultOnDiskStorage HumanReadableBytes

* improve test coverage

* Provide default implementation to new default method on advice of reviewer
2022-08-12 09:40:21 -07:00
Karan Kumar 2f2d8ded5a
Introducing Storage connector Interface (#12874)
In the current Druid code base, we have the interface DataSegmentPusher, which allows us to push segments to the appropriate deep storage without the extension worrying about the semantics of how to push to deep storage.

While working on #12262, part of whose code will live in an extension, I realized that we do not have an interface that allows basic "write, get, delete, deleteAll" operations on the appropriate deep storage without, say, pulling the s3-storage-extension dependency into the custom extension.

Hence, the idea of StorageConnector was born, where the storage connector sits inside Druid core so that all extensions have access to it.

Each deep storage implementation, e.g. S3 or GCS, will implement this interface.
With some Jackson magic, we bind the correct deep storage implementation at runtime using a type variable.
2022-08-12 16:11:49 +05:30
Suneet Saldanha 267b32c2e2
Set druid.processing.fifo to true by default (#12571) 2022-08-08 10:18:24 -07:00
Gian Merlino 01d555e47b
Adjust "in" filter null behavior to match "selector". (#12863)
* Adjust "in" filter null behavior to match "selector".

Now, both of them match numeric nulls if constructed with a "null" value.

This is consistent as far as native execution goes, but doesn't match
the behavior of SQL = and IN. So, to address that, this patch also
updates the docs to clarify that the native filters do match nulls.

This patch also updates the SQL docs to describe how Boolean logic is
handled in addition to how NULL values are handled.

Fixes #12856.

* Fix test.
2022-08-08 09:08:36 -07:00
Karan Kumar 607b0b9310
Adding withName implementation to AggregatorFactory (#12862)
* Adding agg factory with name impl

* Adding test cases

* Fixing test case

* Fixing test case

* Updated java docs.
2022-08-08 18:31:56 +05:30
Jonathan Wei 2045a1345c
Fix NPE when applying a transform that outputs to __time (#12870) 2022-08-07 19:21:47 +05:30
Gian Merlino ca4e64aea3
Frame processing and channels. (#12848)
* Frame processing and channels.

Follow-up to #12745. This patch adds three new concepts:

1) Frame channels are interfaces for doing nonblocking reads and writes
   of frames.

2) Frame processors are interfaces for doing nonblocking processing of
   frames received from input channels and sent to output channels.

3) Cluster-by keys, which can be used for sorting or partitioning.

The patch also adds SuperSorter, a user of these concepts, both to
illustrate how they are used, and also because it is going to be useful
in future work.

Central classes:

- ReadableFrameChannel. Implementations include
  BlockingQueueFrameChannel (in-memory channel that implements both interfaces),
  ReadableFileFrameChannel (file-based channel),
  ReadableByteChunksFrameChannel (byte-stream-based channel), and others.

- WritableFrameChannel. Implementations include BlockingQueueFrameChannel
  and WritableStreamFrameChannel (byte-stream-based channel).

- ClusterBy, a sorting or partitioning key.

- FrameProcessor, nonblocking processor of frames. Implementations include
  FrameChannelBatcher, FrameChannelMerger, and FrameChannelMuxer.

- FrameProcessorExecutor, an executor service that runs FrameProcessors.

- SuperSorter, a class that uses frame channels and processors to
  do parallel external merge sort of any amount of data (as long as there
  is enough disk space).

* Additional tests, fixes.

* Changes from review.

* Better implementation for ReadableInputStreamFrameChannel.

* Rename getFrameFileReference -> newFrameFileReference.

* Add InterruptedException to runIncrementally; add more tests.

* Cancellation adjustments.

* Review adjustments.

* Refactor BlockingQueueFrameChannel, rename doneReading and doneWriting to close.

* Additional changes from review.

* Additional changes.

* Fix test.

* Adjustments.

* Adjustments.
2022-08-04 21:29:04 -07:00
Clint Wylie 73cfc4e5d0
fix expression plan type inference to correctly handle complex types (#12857) 2022-08-04 02:56:05 -07:00
Paul Rogers a618458bf0
Tidy up construction of the Guice Injectors (#12816)
* Refactor Guice initialization

Builders for various module collections
Revise the extensions loader
Injector builders for server startup
Move Hadoop init to indexer
Clean up server node role filtering
Calcite test injector builder

* Revisions from review comments

* Build fixes

* Revisions from review comments
2022-08-04 00:05:07 -07:00
Gian Merlino ef6811ef88
Improved Java 17 support and Java runtime docs. (#12839)
* Improved Java 17 support and Java runtime docs.

1) Add a "Java runtime" doc page with information about supported
   Java versions, garbage collection, and strong encapsulation.

2) Update asm and equalsverifier to versions that support Java 17.

3) Add additional "--add-opens" lines to surefire configuration, so
   tests can pass successfully under Java 17.

4) Switch openjdk15 tests to openjdk17.

5) Update FrameFile to specifically mention Java runtime incompatibility
   as the cause of not being able to use Memory.map.

6) Update SegmentLoadDropHandler to log an error for Errors too, not
   just Exceptions. This is important because an IllegalAccessError is
   encountered when the correct "--add-opens" line is not provided,
   which would otherwise be silently ignored.

7) Update example configs to use druid.indexer.runner.javaOptsArray
   instead of druid.indexer.runner.javaOpts. (The latter is deprecated.)

* Adjustments.

* Use run-java in more places.

* Add run-java.

* Update .gitignore.

* Exclude hadoop-client-api.

Brought in when building on Java 17.

* Swap one more usage of java.

* Fix the run-java script.

* Fix flag.

* Include link to Temurin.

* Spelling.

* Update examples/bin/run-java

Co-authored-by: Xavier Léauté <xl+github@xvrl.net>

2022-08-03 23:16:05 -07:00
Clint Wylie 6981b1cc12
fix bugs with nested column jsonpath parser (#12831) 2022-08-02 11:38:25 -07:00
Clint Wylie 6046a392b6
add DictionaryEncodedStringValueIndex implementation to NestedFieldLiteralColumnIndexSupplier (#12837) 2022-08-01 21:40:35 -07:00
Rohan Garg 7ae6cc6e60
Fix string first/last aggregator comparator (#12773) 2022-08-01 20:54:15 +05:30
Clint Wylie d96a9c1e6f
add missing selectors for explicit null columns (#12834) 2022-07-29 19:08:58 -07:00
Clint Wylie 189e8b9d18
add NumericRangeIndex interface and BoundFilter support (#12830)
add NumericRangeIndex interface and BoundFilter support
changes:
* NumericRangeIndex interface, like LexicographicalRangeIndex but for numbers
* BoundFilter now uses NumericRangeIndex if comparator is numeric and there is no extractionFn
* NestedFieldLiteralColumnIndexSupplier.java now supports supplying NumericRangeIndex for single typed numeric nested literal columns

* better faster stronger and (ever so slightly) more understandable

* more tests, fix bug

* fix style
2022-07-29 18:58:49 -07:00
Maytas Monsereenusorn 24c345cdf0
Allow dictionary encoded column to use a more generic index interface (#12826) 2022-07-27 15:23:00 -07:00
Maytas Monsereenusorn 5417aa2055
Fix: ParseException swallows cause Exception (#12810)
* add impl

* add impl

* fix checkstyle
2022-07-22 13:46:28 -07:00
Clint Wylie 1e0542626b
add nested column query benchmarks (#12786) 2022-07-14 18:16:30 -07:00
Clint Wylie 05b2e967ed
druid nested data column type (#12753)
* add new druid nested data column type

* fixes and such

* fixes

* adjustments, more tests

* self review

* oops

* fix and test

* more better

* style
2022-07-14 12:07:23 -07:00
Rohan Garg bb953be09b
Refactor usage of JoinableFactoryWrapper + more test coverage (#12767)
Refactor usage of JoinableFactoryWrapper to add e2e test for createSegmentMapFn with joinToFilter feature enabled
2022-07-12 06:25:36 -07:00
Gian Merlino 97207cdcc7
Automatic sizing for GroupBy dictionaries. (#12763)
* Automatic sizing for GroupBy dictionary sizes.

Merging and selector dictionary sizes currently both default to 100MB.
This is not optimal, because it can lead to OOM on small servers and
insufficient resource utilization on larger servers. It also invites
end users to try to tune it when queries run out of dictionary space,
which can make things worse if the end user sets it too high.

So, this patch:

- Adds automatic tuning for selector and merge dictionaries. Selectors
  use up to 15% of the heap and merge buffers use up to 30% of the heap
  (aggregate across all queries).

- Updates out-of-memory error messages to emphasize enabling disk
  spilling vs. increasing memory parameters. With the memory parameters
  automatically sized, it is more likely that an end user will get
  benefit from enabling disk spilling.

- Removes the query context parameters that allow lowering of configured
  dictionary sizes. These complicate the calculation, and I don't see a
  reasonable use case for them.

* Adjust tests.

* Review adjustments.

* Additional comment.

* Remove unused import.
2022-07-11 08:20:50 -07:00
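The heap percentages quoted in #12763 can be illustrated with a little arithmetic. This is a minimal sketch, not Druid's sizing code: the class and method names are hypothetical, and only the 15% and 30% figures come from the commit message.

```java
// Hypothetical sketch of the aggregate caps described in #12763.
// Only the 15% / 30% figures are taken from the commit message.
final class DictionarySizingSketch
{
  static final int SELECTOR_PERCENT = 15; // selector dictionaries: up to 15% of heap
  static final int MERGE_PERCENT = 30;    // merge dictionaries: up to 30% of heap

  // Returns the given percentage of the heap; divides first to avoid overflow.
  static long cap(long heapBytes, int percent)
  {
    return heapBytes / 100 * percent;
  }

  public static void main(String[] args)
  {
    final long heap = Runtime.getRuntime().maxMemory();
    System.out.println("selector cap bytes: " + cap(heap, SELECTOR_PERCENT));
    System.out.println("merge cap bytes: " + cap(heap, MERGE_PERCENT));
  }
}
```

How the aggregate cap is divided among concurrent queries is not stated in the message, so the sketch stops at the aggregate figures.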
Gian Merlino 864b77e91a
SpillingGrouper: Make DISK_FULL sticky. (#12764)
When we return DISK_FULL to a processing thread, it skips the rest of
the segment and the query is canceled. However, it's possible that the
next segment starts processing before cancellation can kick in. We want
that one, if it occurs, to see DISK_FULL too.
2022-07-09 06:45:38 -07:00
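The "sticky" behavior described in #12764 can be sketched roughly as follows. This assumes a simple byte budget and a single flag; the class, enum, and method names are hypothetical and do not mirror SpillingGrouper's actual fields.

```java
// Hypothetical sketch: once DISK_FULL has been returned, every later call
// also returns DISK_FULL, even if the requested write would fit.
final class StickyDiskFullSketch
{
  enum Result { OK, DISK_FULL }

  private final long maxBytes;
  private long usedBytes;
  private boolean diskFull; // sticky flag: never reset once set

  StickyDiskFullSketch(long maxBytes)
  {
    this.maxBytes = maxBytes;
  }

  synchronized Result spill(long bytes)
  {
    if (diskFull) {
      return Result.DISK_FULL;
    }
    if (usedBytes + bytes > maxBytes) {
      diskFull = true;
      return Result.DISK_FULL;
    }
    usedBytes += bytes;
    return Result.OK;
  }
}
```

With the flag in place, a segment that starts processing before cancellation takes effect sees the same DISK_FULL result as the one that triggered it.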
Gian Merlino edfbcc8455
Preserve column order in DruidSchema, SegmentMetadataQuery. (#12754)
* Preserve column order in DruidSchema, SegmentMetadataQuery.

Instead of putting columns in alphabetical order. This is helpful
because it makes query order better match ingestion order. It also
allows tools, like the reindexing flow in the web console, to more
easily do follow-on ingestions using a column order that matches the
pre-existing column order.

We prefer the order from the latest segments. The logic takes all
columns from the latest segments in the order they appear, then adds
on columns from older segments after those.

* Additional test adjustments.

* Adjust imports.
2022-07-08 22:04:11 -07:00
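The ordering rule stated in #12754 (columns from the latest segments first, in the order they appear, then columns from older segments) might be sketched with an insertion-ordered set. The class and method names below are hypothetical, and the input is assumed to be sorted newest segment first.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the column-order merge described in #12754.
final class ColumnOrderSketch
{
  // segmentsNewestFirst: one column list per segment, newest segment first.
  static List<String> mergeColumns(List<List<String>> segmentsNewestFirst)
  {
    // LinkedHashSet keeps first-seen order and ignores repeats, so columns
    // from newer segments win their position and older-only columns trail.
    final Set<String> merged = new LinkedHashSet<>();
    for (List<String> columns : segmentsNewestFirst) {
      merged.addAll(columns);
    }
    return new ArrayList<>(merged);
  }
}
```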
Gian Merlino 9c925b4f09
Frame format for data transfer and short-term storage. (#12745)
* Frame format for data transfer and short-term storage.

As we move towards query execution plans that involve more transfer
of data between servers, it's important to have a data format that
provides for doing this more efficiently than the options available to
us today.

This patch adds:

- Columnar frames, which support fast querying.
- Row-based frames, which support fast sorting via memory comparison
  and fast whole-row copies via memory copying.
- Frame files, a container format that can be stored on disk or
  transferred between servers.

The idea is we should use row-based frames when data is expected to
be sorted, and columnar frames when data is expected to be queried.

The code in this patch is not used in production yet. Therefore, the
patch involves minimal changes outside of the org.apache.druid.frame
package.  The main ones are adjustments to SqlBenchmark to add benchmarks
for queries on frames, and the addition of a "forEach" method to Sequence.

* Fixes based on tests, static analysis.

* Additional fixes.

* Skip DS mapping tests on JDK 14+

* Better JDK checking in tests.

* Fix imports.

* Additional comment.

* Adjustments from code review.

* Update test case.
2022-07-08 20:42:06 -07:00
Rohan Garg bcff35f798
Pushdown join filter with right side referencing columns (#12749) 2022-07-08 19:59:41 +05:30
Jianhuan Liu 4574dea5e9
Use MXBeans to get GC metrics #12476 (#12481)
* jvm gc to mxbeans

* add zgc and shenandoah #12476

* remove tryCreateGcCounter

* separate the space collector

* blend GcGenerationCollector into GcCollector

* add jdk surefire argLine
2022-07-08 14:32:06 +08:00
Gian Merlino 49feffff1b
Add comment about double-close in ColumnSelectorColumnIndexSelector. (#12735) 2022-07-06 00:50:35 -07:00
Clint Wylie 36e38b319b
add virtual column support to search query (#12720) 2022-07-04 21:58:10 -07:00
imply-cheddar e3128e3fa3
Poison stupid pool (#12646)
* Poison StupidPool and fix resource leaks

There are various resource leaks from test setup as well as some
corners in query processing.  We poison the StupidPool to start failing
tests when the leaks come and fix any issues uncovered from that so
that we can start from a clean baseline.

Unfortunately, because of how poisoning works,
we can only fail future checkouts from the same pool,
which means that there is a natural race between a
leak happening -> GC occurs -> leak detected -> pool poisoned.

This race means that, depending on interleaving of tests,
if the very last time that an object is checked out
from the pool leaks, then it won't get caught.
At some point in the future, however, something will catch it,
and from that point on it will be deterministic.

* Remove various things left over from iterations

* Clean up FilterAnalysis and add javadoc on StupidPool

* Revert changes to .idea/misc.xml that accidentally got pushed

* Style and test branches

* Stylistic woes
2022-07-03 14:36:22 -07:00
Clint Wylie 48731710fb
precursor changes for nested columns to minimize files changed (#12714)
* precursor changes for nested columns to minimize files changed

* inspection fix

* visibility

* adjustment

* unnecessary change
2022-07-01 02:27:19 -07:00
Abhishek Agarwal dbd45daf33
Flakiness and exceptions during tests (#12705) 2022-06-28 10:36:23 +05:30
Tejaswini Bandlamudi 1fc2f6e4b0
Throw BadQueryContextException if context params cannot be parsed (#12680) 2022-06-24 09:21:25 +05:30
Gian Merlino 818974f6e4
ScanQuery: Fix JsonIgnore for isLegacy. (#12674)
True, false, and null have different meanings: true/false mean "legacy"
and "not legacy"; null means use the default set by ScanQueryConfig.
So, we need to respect this in the JsonIgnore setup.
2022-06-18 15:55:54 -07:00
Gian Merlino e76a5077ef
Fix self-referential shape inspection in BaseExpressionColumnValueSelector. (#12669)
* Fix self-referential shape inspection in BaseExpressionColumnValueSelector.

The new test would throw StackOverflowError on the old code.

* Restore prior test.
2022-06-17 16:15:50 -07:00
Clint Wylie 18937ffee2
split out null value index (#12627)
* split out null value index

* gg spotbugs

* fix stuff
2022-06-17 15:29:23 -07:00
Paul Rogers 893759de91
Remove null and empty fields from native queries (#12634)
* Remove null and empty fields from native queries

* Test fixes

* Attempted IT fix.

* Revisions from review comments

* Build fixes resulting from changes suggested by reviews

* IT fix for changed segment size
2022-06-16 14:07:25 -07:00
Paul Rogers 45e3111549
Clean up query contexts (#12633)
* Clean up query contexts

Uses constants in place of literal strings for context keys.
Moves some QueryContext methods to QueryContexts for reuse.

* Revisions from review comments
2022-06-15 11:31:22 -07:00
Rohan Garg 28f2c8e112
Support LoadScope for Peons + Access Modifier Updates (#12640)
* Support LoadScope for Peons

* Update access modifiers for GroupByEngineV2
2022-06-14 21:52:50 -07:00
Rohan Garg afaea251f2
Push join build table values as filter in case of duplicates (#12225)
* Push join build table values as filter

* Add tests for JoinableFactoryWrapper

* fixup! Push join build table values as filter

* fixup! Add tests for JoinableFactoryWrapper

* fixup! Push join build table values as filter
2022-06-13 17:18:27 -07:00
Abhishek Agarwal 59a0c10c47
Add remedial information in error message when type is unknown (#12612)
Users often submit queries and ingestion specs that work only if the relevant extension is loaded. However, the resulting error is too technical and doesn't suggest that they check for missing extensions. This PR modifies the error message so users can at least check their settings before assuming that the error is caused by a bug.
2022-06-07 20:22:45 +05:30
Gian Merlino abf0e0a159
CompressionStrategyTest: Fix thread-unsafe Closer usage. (#12605)
Closer is not thread-safe, so we need one per thread in the
concurrency tests.
2022-06-04 10:57:13 -07:00
Clint Wylie 98f6bca2cd
fix regression with ipv4_match and prefixes (#12542)
* fix issue with ipv4_match and prefixes
2022-06-01 14:03:08 -07:00
Clint Wylie 31f988ec76
fix backwards compatibility for explicit null columns (#12585) 2022-06-01 12:39:48 -07:00
Clint Wylie 0640c9c9ac
fix compression-strategy-test (#12575)
fixes an issue caused by a test modification in #12408 that was closing buffers allocated by the compression strategy instead of allowing the closer to do it
2022-05-31 11:48:32 -07:00
Gian Merlino 02ae3e74ff
RowBasedColumnSelectorFactory: Add "useStringValueOfNullInLists" parameter. (#12578)
RowBasedColumnSelectorFactory inherited strange behavior from
Rows.objectToStrings for nulls that appear in lists: instead of being
left as a null, it is replaced with the string "null". Some callers may
need compatibility with this strange behavior, but it should be opt-in.

Query-time call sites are changed to opt-out of this behavior, since it
is not consistent with query-time expectations. The IncrementalIndex
ingestion-time call site retains the old behavior, as this is traditionally
when Rows.objectToStrings would be used.
2022-05-31 11:38:56 -07:00
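The opt-in behavior described in #12578 can be illustrated with a small conversion helper. This is a sketch under assumptions: the class and method are hypothetical and only borrow the parameter name from the commit title; they are not the RowBasedColumnSelectorFactory or Rows.objectToStrings code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the two behaviors for nulls inside lists.
final class NullInListSketch
{
  static List<String> toStrings(List<?> values, boolean useStringValueOfNullInLists)
  {
    final List<String> out = new ArrayList<>(values.size());
    for (Object value : values) {
      if (value == null) {
        // Legacy behavior yields the string "null"; otherwise keep a real null.
        out.add(useStringValueOfNullInLists ? "null" : null);
      } else {
        out.add(String.valueOf(value));
      }
    }
    return out;
  }
}
```

In Java, `String.valueOf((Object) null)` returns the string "null", which is one plausible way such behavior arises; the message itself does not say where it came from.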
Gian Merlino 6d2ff796a3
Add RowIdSupplier to ColumnSelectorFactory. (#12577)
* Add RowIdSupplier to ColumnSelectorFactory.

This enables virtual columns to cache their outputs in case they are
called multiple times on the same underlying row. This is common for
numeric selectors, where the common pattern is to call isNull() and
then follow with getLong(), getFloat(), or getDouble(). Here, output
caching reduces the number of expression evals by half.

* Fix tests.
2022-05-31 11:38:03 -07:00
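The output caching described in #12577 might look roughly like the sketch below. It assumes a row id that changes whenever the underlying row changes and is never negative; `LongSupplier` stands in for RowIdSupplier, and all names here are hypothetical rather than Druid's API.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical sketch: cache an expression result per row id so that
// isNull() followed by getLong() evaluates the expression only once.
final class RowIdCachingSketch
{
  private static final long NO_ROW = -1L; // assumes real row ids are non-negative

  private final LongSupplier rowId;          // stand-in for RowIdSupplier
  private final Supplier<Long> expression;   // stand-in for an expression eval
  private long cachedRowId = NO_ROW;
  private Long cachedValue;
  private int evalCount;

  RowIdCachingSketch(LongSupplier rowId, Supplier<Long> expression)
  {
    this.rowId = rowId;
    this.expression = expression;
  }

  private Long eval()
  {
    final long current = rowId.getAsLong();
    if (current != cachedRowId) {
      cachedValue = expression.get();
      cachedRowId = current;
      evalCount++;
    }
    return cachedValue;
  }

  boolean isNull()
  {
    return eval() == null;
  }

  long getLong()
  {
    final Long value = eval();
    return value == null ? 0L : value;
  }

  int getEvalCount()
  {
    return evalCount;
  }
}
```

Calling `isNull()` and then `getLong()` on the same row triggers one evaluation instead of two, which matches the halving mentioned in the message.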
Clint Wylie b746bf9129
fix virtual column cycle bug, sql virtual column optimize bug (#12576)
* fix virtual column cycle bug, sql virtual column optimize bug

* more test
2022-05-30 23:51:21 -07:00
Dr. Sizzles 7291c92f4f
Adding zstandard compression library (#12408)
* Adding zstandard compression library

* 1. Took @clintropolis's advice to have ZStandard decompressor use the byte array when the buffers are not direct.
2. Cleaned up checkstyle issues.

* Fixing zstandard version to latest stable version in pom's and updating license files

* Removing zstd from benchmarks and adding to processing (poms)

* fix the intellij inspection issue

* Removing the prefix v for the version in the license check for zstd

* Fixing license checks

Co-authored-by: Rahul Gidwani <r_gidwani@apple.com>
2022-05-28 17:01:44 -07:00
Clint Wylie d0c9c37e35
make query context changes backwards compatible (#12564)
Adds a default implementation of getQueryContext, which was added to the Query interface in #12396. Query is marked with @ExtensionPoint, and lately we have been trying to be less volatile on these interfaces by providing default implementations to be more chill for extension writers.

The way this default implementation is done in this PR is a bit strange due to the way that getQueryContext is used (mutated with system default and system generated keys); the default implementation has a specific object that it returns, and I added another temporary default method isLegacyContext that checks if the getQueryContext returns that object or not. If not, callers fall back to using getContext and withOverriddenContext to set these default and system values.

I am open to other ideas as well, but this way should work at least without exploding, and added some tests to ensure that it is wired up correctly for QueryLifecycle, including the context authorization stuff.

The added test shows the strange behavior if query context authorization is enabled, mainly that the system default and system generated query context keys also need to be granted as permissions for things to function correctly. This is not great, so I mentioned it in the javadocs as well. Not sure if it needs to be called out anywhere else.
2022-05-25 15:24:41 +05:30
Karan Kumar 9f9faeec81
object[] handling for DimensionHandlers for arrays (#12552)
Description
Fixes a bug when running queries like

 SELECT cntarray,
       Count(*)
FROM   (SELECT dim1,
               dim2,
               Array_agg(cnt) AS cntarray
        FROM   (SELECT dim1,
                       dim2,
                       dim3,
                       Count(*) AS cnt
                FROM   foo
                GROUP  BY 1,
                          2,
                          3)
        GROUP  BY 1,
                  2)
GROUP  BY 1  
This generates an error:

org.apache.druid.java.util.common.ISE: Unable to convert type [Ljava.lang.Object; to org.apache.druid.segment.data.ComparableList
        at org.apache.druid.segment.DimensionHandlerUtils.convertToList(DimensionHandlerUtils.java:405) ~[druid-xx]
Because it's an array of numbers it looks like it does the convertToList call, which looks like:

  @Nullable
  public static ComparableList convertToList(Object obj)
  {
    if (obj == null) {
      return null;
    }
    if (obj instanceof List) {
      return new ComparableList((List) obj);
    }
    if (obj instanceof ComparableList) {
      return (ComparableList) obj;
    }
    throw new ISE("Unable to convert type %s to %s", obj.getClass().getName(), ComparableList.class.getName());
  }
I.e. it doesn't know about arrays. Added the array handling as part of this PR.
2022-05-25 15:24:18 +05:30
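The array handling mentioned in #12552 is not shown in the message. A minimal sketch of what an added `Object[]` branch could look like follows; `ComparableListSketch` is a hypothetical stand-in for ComparableList, and the branch itself is an assumption rather than the merged code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for org.apache.druid.segment.data.ComparableList,
// reduced to what this sketch needs.
final class ComparableListSketch
{
  private final List<Object> delegate;

  ComparableListSketch(List<?> delegate)
  {
    this.delegate = new ArrayList<>(delegate);
  }

  List<Object> getDelegate()
  {
    return delegate;
  }

  // Same shape as the convertToList quoted in the message, plus an array branch.
  static ComparableListSketch convertToList(Object obj)
  {
    if (obj == null) {
      return null;
    }
    if (obj instanceof List) {
      return new ComparableListSketch((List<?>) obj);
    }
    if (obj instanceof ComparableListSketch) {
      return (ComparableListSketch) obj;
    }
    if (obj instanceof Object[]) {
      // Assumed handling: view the array as a list and wrap it.
      return new ComparableListSketch(Arrays.asList((Object[]) obj));
    }
    throw new IllegalStateException(
        "Unable to convert type " + obj.getClass().getName()
        + " to " + ComparableListSketch.class.getName()
    );
  }
}
```

With such a branch, an `Object[]` value like the one produced by the Array_agg query above is wrapped rather than rejected.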
Agustin Gonzalez 2f3d7a4c07
Emit state of replace and append for native batch tasks (#12488)
* Emit state of replace and append for native batch tasks

* Emit count of one depending on batch ingestion mode (APPEND, OVERWRITE, REPLACE)

* Add metric to compaction job

* Avoid null ptr exc when null emitter

* Coverage

* Emit tombstone & segment counts

* Tasks need a type

* Spelling

* Integrate BatchIngestionMode in batch ingestion tasks functionality

* Typos

* Remove batch ingestion type from metric since it is already in a dimension. Move IngestionMode to AbstractTask to facilitate having mode as a dimension. Add metrics to streaming. Add missing coverage.

* Avoid inner class referenced by sub-class inspection. Refactor computation of IngestionMode to make it more robust to null IOConfig and fix test.

* Spelling

* Avoid polluting the Task interface

* Rename computeCompaction methods to avoid ambiguous java compiler error if they are passed null. Other minor cleanup.
2022-05-23 12:32:47 -07:00
Gian Merlino 37853f8de4
ConcurrentGrouper: Add mergeThreadLocal option, fix bug around the switch to spilling. (#12513)
* ConcurrentGrouper: Add option to always slice up merge buffers thread-locally.

Normally, the ConcurrentGrouper shares merge buffers across processing
threads until spilling starts, and then switches to a thread-local model.
This minimizes memory use and reduces likelihood of spilling, which is
good, but it creates thread contention. The new mergeThreadLocal option
causes a query to start in thread-local mode immediately, and allows us
to experiment with the relative performance of the two modes.

* Fix grammar in docs.

* Fix race in ConcurrentGrouper.

* Fix issue with timeouts.

* Remove unused import.

* Add "tradeoff" to dictionary.
2022-05-21 10:28:54 -07:00
Gian Merlino 69aac6c8dd
Direct UTF-8 access for "in" filters. (#12517)
* Direct UTF-8 access for "in" filters.

Directly related:

1) InDimFilter: Store sorted Strings (in ValuesSet) plus sorted UTF-8
   ByteBuffers (in valuesUtf8). Use valuesUtf8 whenever possible. If
   necessary, the input set is copied into a ValuesSet. Much logic is
   simplified, because we always know what type the values set will be.
   I think that there won't even be an efficiency loss in most cases.
   InDimFilter is most frequently created by deserialization, and this
   patch updates the JsonCreator constructor to deserialize
   directly into a ValuesSet.

2) Add Utf8ValueSetIndex, which InDimFilter uses to avoid UTF-8 decodes
   during index lookups.

3) Add unsigned comparator to ByteBufferUtils and use it in
   GenericIndexed.BYTE_BUFFER_STRATEGY. This is important because UTF-8
   bytes can be compared as bytes if, and only if, the comparison
   is unsigned.

4) Add specialization to GenericIndexed.singleThreaded().indexOf that
   avoids needless ByteBuffer allocations.

5) Clarify that objects returned by ColumnIndexSupplier.as are not
   thread-safe. DictionaryEncodedStringIndexSupplier now calls
   singleThreaded() on all relevant GenericIndexed objects, saving
   a ByteBuffer allocation per access.

Also:

1) Fix performance regression in LikeFilter: since #12315, it applied
   the suffix matcher to all values in range even for type MATCH_ALL.

2) Add ObjectStrategy.canCompare() method. This fixes LikeFilterBenchmark,
   which was broken due to calls to strategy.compare in
   GenericIndexed.fromIterable.

* Add like-filter implementation tests.

* Add in-filter implementation tests.

* Add tests, fix issues.

* Fix style.

* Adjustments from review.
2022-05-20 01:51:28 -07:00
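Point 3 of #12517 (UTF-8 bytes compare correctly only when treated as unsigned) can be seen with a small experiment. This is a standalone sketch, not the ByteBufferUtils comparator; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch contrasting unsigned and signed byte-wise comparison.
final class Utf8CompareSketch
{
  static ByteBuffer utf8(String s)
  {
    return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
  }

  // Lexicographic comparison treating each byte as a value in 0..255.
  static int compareUnsigned(ByteBuffer a, ByteBuffer b)
  {
    final int common = Math.min(a.remaining(), b.remaining());
    for (int i = 0; i < common; i++) {
      final int x = a.get(a.position() + i) & 0xFF;
      final int y = b.get(b.position() + i) & 0xFF;
      if (x != y) {
        return Integer.compare(x, y);
      }
    }
    return Integer.compare(a.remaining(), b.remaining());
  }

  // Same loop with signed bytes (-128..127), shown only for contrast.
  static int compareSigned(ByteBuffer a, ByteBuffer b)
  {
    final int common = Math.min(a.remaining(), b.remaining());
    for (int i = 0; i < common; i++) {
      final byte x = a.get(a.position() + i);
      final byte y = b.get(b.position() + i);
      if (x != y) {
        return Byte.compare(x, y);
      }
    }
    return Integer.compare(a.remaining(), b.remaining());
  }
}
```

For "a" and "\u00e9", the unsigned comparison orders "a" first, as code point order does. The signed comparison reverses them, because the lead byte 0xC3 of "\u00e9" is negative as a Java `byte`.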
machine424 90531fd53f
Do not alter query timeout in ScanQueryEngine (#12271)
Add test to detect timeout mutability
2022-05-19 09:24:42 -07:00
Gian Merlino 4631cff2a9
Free ByteBuffers in tests and fix some bugs. (#12521)
* Ensure ByteBuffers allocated in tests get freed.

Many tests had problems where a direct ByteBuffer would be allocated
and then not freed. This is bad because it causes flaky tests.

To fix this:

1) Add ByteBufferUtils.allocateDirect(size), which returns a ResourceHolder.
   This makes it easy to free the direct buffer. Currently, it's only used
   in tests, because production code seems OK.

2) Update all usages of ByteBuffer.allocateDirect (off-heap) in tests either
   to ByteBuffer.allocate (on-heap, which is garbage collected), or to
   ByteBufferUtils.allocateDirect (wherever it seemed like there was a good
   reason for the buffer to be off-heap). Make sure to close all direct
   holders when done.

* Changes based on CI results.

* A different approach.

* Roll back BitmapOperationTest stuff.

* Try additional surefire memory.

* Revert "Roll back BitmapOperationTest stuff."

This reverts commit 49f846d9e3.

* Add TestBufferPool.

* Revert Xmx change in tests.

* Better behaved NestedQueryPushDownTest. Exit tests on OOME.

* Fix TestBufferPool.

* Remove T1C from ARM tests.

* Somewhat safer.

* Fix tests.

* Fix style stuff.

* Additional debugging.

* Reset null / expr configs better.

* ExpressionLambdaAggregatorFactory thread-safety.

* Alter forkNode to try to get better info when a JVM crashes.

* Fix buffer retention in ExpressionLambdaAggregatorFactory.

* Remove unused import.
2022-05-19 07:42:29 -07:00
Gian Merlino 5b6727f319
Enable vectorized virtual column processing by default. (#12520)
In the majority of cases, this improves performance.

There's only one case I'm aware of where this may be a net negative: for time_floor(__time, <period>) where there are many repeated __time values. In nonvectorized processing, SingleLongInputCachingExpressionColumnValueSelector implements an optimization to avoid computing the time_floor function on every row. There is no such optimization in vectorized processing.

IMO, we shouldn't mention this in the docs. Rationale: It's too fiddly a thing: it's not guaranteed that nonvectorized processing will be faster due to the optimization, because it would have to overcome the inherent speed advantage of vectorization. So it'd always require testing to determine the best setting for a specific dataset. It would be bad if users disabled vectorization thinking it would speed up their queries, and it actually slowed them down. And even if users do their own testing, at some point in the future we'll implement the optimization for vectorized processing too, and it's likely that users that explicitly disabled vectorization will continue to have it disabled. I'd like to avoid this outcome by encouraging all users to enable vectorization at all times. Really advanced users would be following development activity anyway, and can read this issue.
2022-05-16 15:43:53 +05:30
Gian Merlino ff253fd8a3
Add setProcessingThreadNames context parameter. (#12514)
Setting thread names takes a measurable amount of time in the case where segment scans are very quick. In high-QPS testing we found a slight performance boost from turning off processing thread renaming. This option makes that possible.
2022-05-16 13:42:00 +05:30
Abhishek Radhakrishnan 9177515be2
Add IPAddress java library as dependency and migrate IPv4 functions to use the new library. (#11634)
* Add ipaddress library as dependency.

* IPv4 functions to use the inet.ipaddr package.

* Remove unused imports.

* Add new function.

* Minor rename.

* Add more unit tests.

* IPv4 address expr utils unit tests and address options.

* Adjust the IPv4Util functions.

* Move the UTs a bit around.

* Javadoc comments.

* Add license info for IPAddress.

* Fix groupId, artifact and version in license.yaml.

* Remove redundant subnet in messages - fixes UT.

* Remove unused commons-net dependency for /processing project.

* Make class and methods public so it can be accessed.

* Add initial version of benchmark

* Add subnetutils package for benchmarks.

* Auto generate ip addresses.

* Add more v4 address representations in setup to avoid bias.

* Use ThreadLocalRandom to avoid forbidden API usage.

* Adjust IPv4AddressBenchmark to adhere to codestyle rules.

* Update ipaddress library to latest 5.3.4

* Add ipaddress package dependency to benchmarks project.
2022-05-11 22:06:20 -07:00
Clint Wylie 9e5a940cf1
remake column indexes and query processing of filters (#12388)
Following up on #12315, which pushed most of the logic of building ImmutableBitmap into BitmapIndex in order to hide the details of how column indexes are implemented from the Filter implementations, this PR totally refashions how Filters consume indexes. The end result, while a rather dramatic reshuffling of the existing code, should be extraordinarily flexible, eventually allowing us to model any type of index we can imagine, and providing the machinery to build the filters that use them, while also allowing other column implementations to implement the built-in index types and so provide adapters that make use of indexing in the current set of filters that Druid provides.
2022-05-11 11:57:08 +05:30
Rohan Garg 75836a5a06
Add feature flag for sql planning of TimeBoundary queries (#12491)
* Add feature flag for sql planning of TimeBoundary queries

* fixup! Add feature flag for sql planning of TimeBoundary queries

* Add documentation for enableTimeBoundaryPlanning

* fixup! Add documentation for enableTimeBoundaryPlanning
2022-05-10 15:23:42 +05:30
somu-imply c68388ebcd
Vectorized version of string last aggregator (#12493)
* Vectorized version of string last aggregator

* Updating string last and adding testcases

* Updating code and adding testcases for serializable pairs

* Addressing review comments
2022-05-09 17:02:38 -07:00
Rohan Garg 2dd073c2cd
Pass metrics object for Scan, Timeseries and GroupBy queries during cursor creation (#12484)
* Pass metrics object for Scan, Timeseries and GroupBy queries during cursor creation

* fixup! Pass metrics object for Scan, Timeseries and GroupBy queries during cursor creation

* Document vectorized dimension
2022-05-09 10:40:17 -07:00
Gian Merlino 529b983ad0
GroupBy: Reduce allocations by reusing entry and key holders. (#12474)
* GroupBy: Reduce allocations by reusing entry and key holders.

Two main changes:

1) Reuse Entry objects returned by various implementations of
   Grouper.iterator.

2) Reuse key objects contained within those Entry objects.

This is allowed by the contract, which states that entries must be
processed and immediately discarded. However, not all call sites
respected this, so this patch also updates those call sites.

One particularly sneaky way that the old code retained entries too long
is due to Guava's MergingIterator and CombiningIterator. Internally,
these both advance to the next value prior to returning the current
value. So, this patch addresses that in two ways:

1) For merging, we have our own implementation MergeIterator already,
   although it had the same problem. So, this patch updates our
   implementation to return the current item prior to advancing to the
   next item. It also adds a forbidden-api entry to ensure that this
   safer implementation is used instead of Guava's.

2) For combining, we address the problem in a different way: by copying
   the key when creating the new, combined entry.

* Attempt to fix test.

* Remove unused import.
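To illustrate the contract described above, here is a small sketch with hypothetical names (it is not the Grouper API). The iterator hands back the same Entry object on every call, so a caller that reads each entry immediately sees correct data, while a caller that retains entries ends up with copies of the last one.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical names throughout; this is not Druid's Grouper API.
final class ReusedEntrySketch {
    // Mutable holder that the iterator overwrites on every call to next().
    static final class Entry {
        String key;
        long value;
    }

    // Returns the same Entry instance each time, avoiding one allocation per row.
    static Iterator<Entry> reusingIterator(final String[] keys, final long[] values) {
        return new Iterator<Entry>() {
            private final Entry reused = new Entry();
            private int position = 0;

            @Override
            public boolean hasNext() {
                return position < keys.length;
            }

            @Override
            public Entry next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                reused.key = keys[position];
                reused.value = values[position];
                position++;
                return reused;
            }
        };
    }

    // Respects the contract: reads what it needs before advancing.
    static List<String> keysReadImmediately(Iterator<Entry> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next().key);
        }
        return out;
    }

    // Violates the contract: retains entries and reads them after advancing.
    static List<String> keysReadLate(Iterator<Entry> it) {
        List<Entry> retained = new ArrayList<>();
        while (it.hasNext()) {
            retained.add(it.next());
        }
        List<String> out = new ArrayList<>();
        for (Entry e : retained) {
            out.add(e.key);
        }
        return out;
    }
}
```

An iterator wrapper that advances to the next value before returning the current one behaves like keysReadLate with a retention window of one entry, which is why such wrappers are unsafe once entries are reused.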
2022-04-28 23:21:13 -07:00
Gian Merlino a2bad0b3a2
Reduce allocations due to Jackson serialization. (#12468)
* Reduce allocations due to Jackson serialization.

This patch attacks two sources of allocations during Jackson
serialization:

1) ObjectMapper.writeValue and JsonGenerator.writeObject create a new
   DefaultSerializerProvider instance for each call. It has lots of
   fields and creates pressure on the garbage collector. So, this patch
   adds helper functions in JacksonUtils that enable reuse of
   SerializerProvider objects and updates various call sites to make
   use of this.

2) GroupByQueryToolChest copies the ObjectMapper for every query to
   install a special module that supports backwards compatibility with
   map-based rows. This isn't needed if resultAsArray is set and
   all servers are running Druid 0.16.0 or later. This release was a
   while ago. So, this patch disables backwards compatibility by default,
   which eliminates the need to copy the heavyweight ObjectMapper. The
   patch also introduces a configuration option that allows admins to
   explicitly enable backwards compatibility.

* Add test.

* Update additional call sites and add to forbidden APIs.
2022-04-27 14:17:26 -07:00
Gian Merlino 72d15ab321
JvmMonitor: Handle more generation and collector scenarios. (#12469)
* JvmMonitor: Handle more generation and collector scenarios.

ZGC on Java 11 only has a generation 1 (there is no 0). This causes
a NullPointerException when trying to extract the spacesCount for
generation 0. In addition, ZGC on Java 15 has a collector number 2
but no spaces in generation 2, which breaks the assumption that
collectors always have same-numbered spaces.

This patch adjusts things to be more robust, enabling the JvmMonitor
to work properly for ZGC on both Java 11 and 15.

* Test adjustments.

* Improve surefire arglines.

* Need a placeholder
2022-04-27 11:18:40 -07:00
Abhishek Agarwal 2fe053c5cb
Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
somu-imply 027935dcff
Vectorize numeric latest aggregators (#12439)
* Vectorizing Latest aggregator Part 1

* Updating benchmark tests

* Changing appropriate logic for vectors for null handling

* Introducing an abstract class and moving the commonalities there

* Adding vectorization for StringLast aggregator (initial version)

* Updated bufferized version of numeric aggregators

* Adding some javadocs

* Making sure this PR vectorizes numeric latest agg only

* Adding another benchmarking test

* Fixing intellij inspections

* Adding tests for double

* Adding test cases for long and float

* Updating testcases

* Checkstyle oops..

* One tiny change in test case

* Fixing spotbug and rhs not being used
2022-04-26 11:33:08 -07:00
Will Xu 4868ef9529
Enable Arm builds (#12451)
This PR enables ARM builds on Travis. I've ported over the changes from @martin-g on reducing heap requirements for some of the tests to ensure they run well on Travis arm instances.
2022-04-26 20:14:40 +05:30
Rohan Garg 95694b5afa
Convert simple min/max SQL queries on __time to timeBoundary queries (#12472)
* Support array based results in timeBoundary query

* Fix bug with query interval in timeBoundary

* Convert min(__time) and max(__time) SQL queries to timeBoundary

* Add tests for timeBoundary backed SQL queries

* Fix query plans for existing tests

* fixup! Convert min(__time) and max(__time) SQL queries to timeBoundary

* fixup! Add tests for timeBoundary backed SQL queries

* fixup! Fix bug with query interval in timeBoundary
2022-04-25 08:18:58 -07:00
Jihoon Son 73ce5df22d
Add support for authorizing query context params (#12396)
The query context is a way that the user gives a hint to the Druid query engine, so that they enforce a certain behavior or at least let the query engine prefer a certain plan during query planning. Today, there are 3 types of query context params as below.

Default context params. They are set via druid.query.default.context in runtime properties. Any user context params can be default params.
User context params. They are set in the user query request. See https://druid.apache.org/docs/latest/querying/query-context.html for parameters.
System context params. They are set by the Druid query engine during query processing. These params override other context params.
Today, users can set any context param. This can cause
1) a bad UX if the context param is not mature yet, or
2) query failure or even a system fault in the worst case if a sensitive param is abused, e.g. maxSubqueryRows.

This PR adds an ability to limit context params per user role. That means, a query will fail if you have a context param set in the query that is not allowed to you. To do that, this PR adds a new built-in resource type, QUERY_CONTEXT. The resource to authorize has a name of the context param (such as maxSubqueryRows) and the type of QUERY_CONTEXT. To allow a certain context param for a user, the user should be granted WRITE permission on the context param resource. Here is an example of the permission.

{
  "resourceAction" : {
    "resource" : {
      "name" : "maxSubqueryRows",
      "type" : "QUERY_CONTEXT"
    },
    "action" : "WRITE"
  },
  "resourceNamePattern" : "maxSubqueryRows"
}
Each role can have multiple permissions for context params. Each permission should be set for different context params.

When a query is issued with a query context X, the query will fail if the user who issued the query does not have WRITE permission on the query context X. In this case,

HTTP endpoints will return 403 response code.
JDBC will throw ForbiddenException.
Note: there is a context param called brokerService that is used only by the router. This param is used to pin your query to run it in a specific broker. Because the authorization is done not in the router, but in the broker, if you have brokerService set in your query without a proper permission, your query will fail in the broker after routing is done. Technically, this is not right because the authorization is checked after the context param takes effect. However, this should not cause any user-facing issue and thus should be OK. The query will still fail if the user doesn’t have permission for brokerService.

The context param authorization can be enabled using druid.auth.authorizeQueryContextParams. This is disabled by default to avoid any hassle when someone upgrades their cluster blindly without reading release notes.
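The check described above can be sketched roughly as follows. This is an illustration under assumptions, not the authorizer code from this PR: the class and method names are hypothetical, and the set of granted names stands in for the context params on which the user holds WRITE permission.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; names do not correspond to Druid classes.
final class QueryContextAuthSketch {
    // Returns the context param names the user may not set, in sorted order.
    static List<String> deniedParams(Map<String, Object> queryContext, Set<String> grantedParams) {
        Set<String> denied = new TreeSet<>();
        for (String param : queryContext.keySet()) {
            if (!grantedParams.contains(param)) {
                denied.add(param);
            }
        }
        return new ArrayList<>(denied);
    }

    // The query is allowed only if every context param it sets has been granted.
    static boolean isAuthorized(Map<String, Object> queryContext, Set<String> grantedParams) {
        return deniedParams(queryContext, grantedParams).isEmpty();
    }
}
```

In this sketch a single ungranted param is enough to reject the whole query, which mirrors the fail-the-query behavior described above.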
2022-04-21 14:21:16 +05:30
Rohan Garg 4c6ba73823
Emit vectorized metric dimension by default (#12464) 2022-04-20 21:14:55 -07:00
Frank Chen 2677d279e2
Remove h2 database from dependency (#12447) 2022-04-19 10:25:17 +08:00
Maytas Monsereenusorn c25a556827
Fix bug in auto compaction preserveExistingMetrics feature (#12438)
* fix bug

* fix test

* fix IT
2022-04-15 15:47:47 -07:00
Clint Wylie 5824ab9608
fix issue with boolean expression input (#12429) 2022-04-13 16:34:01 -07:00
Jihoon Son 5e5625f3ae
Fix indexMerger to respect the includeAllDimensions flag (#12428)
* Fix indexMerger to respect the includeAllDimensions flag; jsonInputFormat should set keepNullColumns if useFieldDiscovery is set

* address comments
2022-04-13 12:43:11 -07:00
Maytas Monsereenusorn 8edea5a82d
Add a new flag for ingestion to preserve existing metrics (#12185)
* add impl

* add impl

* fix checkstyle

* add impl

* add unit test

* fix stuff

* fix stuff

* fix stuff

* add unit test

* add more unit tests

* add more unit tests

* add IT

* add IT

* add IT

* add IT

* add ITs

* address comments

* fix test

* fix test

* fix test

* address comments

* address comments

* address comments

* fix conflict

* fix checkstyle

* address comments

* fix test

* fix checkstyle

* fix test

* fix test

* fix IT
2022-04-08 11:02:02 -07:00
Paul Rogers 2cc2088720
Method to specify eternity in the scan query builder (#12223)
* Method to specify eternity in the scan query builder

* Fix checkstyle issue

* Renamed eterity() to eternityInterval()

* Minor fixes
2022-04-04 15:11:32 -07:00
somu-imply a1ea658115
Introducing a new config to ignore nulls while computing String Cardinality (#12345)
* Counting nulls in String cardinality with a config

* Adding tests for the new config

* Wrapping the vectorize part to allow backward compatibility

* Adding different tests, cleaning the code and putting the check at the proper position, handling hasRow() and hasValue() changes

* Updating testcase and code

* Adding null handling test to improve coverage

* Checkstyle fix

* Adding 1 more change in docs

* Making docs clearer
2022-03-29 14:31:36 -07:00
Jihoon Son b6eeef31e5
Store null columns in the segments (#12279)
* Store null columns in the segments

* fix test

* remove NullNumericColumn and unused dependency

* fix compile failure

* use guava instead of apache commons

* split new tests

* unused imports

* address comments
2022-03-23 16:54:04 -07:00
Adarsh Sanjeev ef45a1551e
Convert inQueryThreshold into query context parameter. (#12357)
Added Calcite's InQueryThreshold as a query context parameter. Setting this parameter appropriately reduces the time taken for queries with a large number of values in their IN conditions.
2022-03-22 18:33:57 +05:30
Xavier Léauté 1f0447e613
fix use of deprecated initMocks method (#12351)
follow-up to #12341
- fix use of deprecated initMocks methods and properly close mocks on teardown
2022-03-19 10:19:02 -07:00
somu-imply b5195c5095
Graceful null handling and correctness in DoubleMean Aggregator (#12320)
* Adding null handling for double mean aggregator

* Updating code to handle nulls in DoubleMean aggregator

* oops last one should have checkstyle issues. fixed

* Updating some code and test cases

* Checking on object is null in case of numeric aggregator

* Adding one more test to improve coverage

* Changing one test as asked in the review

* Changing one test as asked in the review for nulls
2022-03-14 16:52:47 -07:00
mchades 3de1272926
bug fix: merge results of group by limit push down (#11969) 2022-03-11 09:04:34 -08:00
Gian Merlino cb2b2b696d
Fix error message for groupByEnableMultiValueUnnesting. (#12325)
* Fix error message for groupByEnableMultiValueUnnesting.

It referred to the incorrect context parameter.

Also, create a dedicated exception class, to allow easier detection of this
specific error.

* Fix other test.

* More better error messages.

* Test getDimensionName method.
2022-03-10 11:37:24 -08:00
Clint Wylie 9cfb23935f
push value range and set index get operations into BitmapIndex (#12315)
* push value range and set index get operations into BitmapIndex

* fix bug

* oops, fix better

* better like, fix test, javadocs

* fix checkstyle

* simplify and fixes

* cache

* fix tests

* move indexOf into GenericIndexed

* oops

* fix tests
2022-03-09 13:30:58 -08:00
Rohan Garg 9f6a930462
Fix join query in case of filter explosion during CNF conversion (#12324) 2022-03-09 12:43:09 -08:00
Clint Wylie dc0372a28e
improve FileWriteOutBytes.readFully (#12323)
* improve FileWriteOutBytes.readFully

* no need to flush if out of bounds
2022-03-09 11:45:45 -08:00
Rohan Garg 56fbd2af6f
Guard against exponential increase of filters during CNF conversion (#12314)
Currently, the CNF conversion of a filter is unbounded, which means that it can create as many filters as possible thereby also leading to OOMs in historical heap. We should throw an error or disable CNF conversion if the filter count starts getting out of hand. There are ways to do CNF conversion with linear increase in filters as well but that has been left out of the scope of this change since those algorithms add new variables in the predicate - which can be contentious.
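A rough sketch of the guard, under stated assumptions: the filter tree, the clause representation, and the limit parameter below are all hypothetical, negations are assumed to be already pushed down to the leaves, and this version throws rather than falling back to the unconverted filter. It converts by distributing OR over AND and checks the clause count as it goes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not Druid's filter classes.
final class BoundedCnfSketch {
    // Minimal filter tree: leaves carry a name, AND/OR nodes carry children.
    static final class Node {
        final String op; // "LEAF", "AND", or "OR"
        final String name; // set only for leaves
        final List<Node> children;

        private Node(String op, String name, List<Node> children) {
            this.op = op;
            this.name = name;
            this.children = children;
        }

        static Node leaf(String name) {
            return new Node("LEAF", name, new ArrayList<Node>());
        }

        static Node and(Node... children) {
            return new Node("AND", null, Arrays.asList(children));
        }

        static Node or(Node... children) {
            return new Node("OR", null, Arrays.asList(children));
        }
    }

    // Returns the CNF as a list of clauses; each clause is a set of OR'ed leaf names.
    static List<Set<String>> toCnf(Node node, int maxClauses) {
        List<Set<String>> result = new ArrayList<>();
        if (node.op.equals("LEAF")) {
            Set<String> clause = new LinkedHashSet<>();
            clause.add(node.name);
            result.add(clause);
        } else if (node.op.equals("AND")) {
            // AND simply concatenates the clauses of its children.
            for (Node child : node.children) {
                result.addAll(toCnf(child, maxClauses));
                check(result.size(), maxClauses);
            }
        } else {
            // OR takes the cross product of its children's clauses.
            result.add(new LinkedHashSet<String>());
            for (Node child : node.children) {
                List<Set<String>> childCnf = toCnf(child, maxClauses);
                List<Set<String>> next = new ArrayList<>();
                for (Set<String> left : result) {
                    for (Set<String> right : childCnf) {
                        Set<String> merged = new LinkedHashSet<>(left);
                        merged.addAll(right);
                        next.add(merged);
                        check(next.size(), maxClauses);
                    }
                }
                result = next;
            }
        }
        return result;
    }

    private static void check(int size, int maxClauses) {
        if (size > maxClauses) {
            throw new IllegalStateException("CNF conversion exceeded " + maxClauses + " clauses");
        }
    }
}
```

An OR of n two-way ANDs produces 2^n clauses under this scheme, which is the exponential growth the guard is meant to stop before it exhausts the heap.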
2022-03-09 13:19:52 +05:30
Agustin Gonzalez abe76ccb90
Batch ingestion replace (#12137)
* Tombstone support for replace functionality

* A used segment interval is the interval of a current used segment that overlaps any of the input intervals for the spec

* Update compaction test to match replace behavior

* Adapt ITAutoCompactionTest to work with tombstones rather than dropping segments. Add support for tombstones in the broker.

* Style plus simple queriableindex test

* Add segment cache loader tombstone test

* Add more tests

* Add a method to the LogicalSegment to test whether it has any data

* Test filter with some empty logical segments

* Refactor more compaction/dropexisting tests

* Code coverage

* Support for all empty segments

* Skip tombstones when looking-up broker's timeline. Discard changes made to tool chest to avoid empty segments since they will no longer have empty segments after lookup because we are skipping over them.

* Fix null ptr when segment does not have a queryable index

* Add support for empty replace interval (all input data has been filtered out)

* Fixed coverage & style

* Find tombstone versions from lock versions

* Test failures & style

* Interner was making this fail since the two segments were considered equal due to their IDs being equal

* Cleanup tombstone version code

* Force timeChunkLock whenever replace (i.e. dropExisting=true) is being used

* Reject replace spec when input intervals are empty

* Documentation

* Style and unit test

* Restore test code deleted by mistake

* Allocate forces TIME_CHUNK locking and uses lock versions. TombstoneShardSpec added.

* Unused imports. Dead code. Test coverage.

* Coverage.

* Prevent killer from throwing an exception for tombstones. This is the killer used in the peon for killing segments.

* Fix OmniKiller + more test coverage.

* Tombstones are now marked using a shard spec

* Drop a segment factory.json in the segment cache for tombstones

* Style

* Style + coverage

* style

* Add TombstoneLoadSpec.class to mapper in test

* Update core/src/main/java/org/apache/druid/segment/loading/TombstoneLoadSpec.java

Typo

Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>

* Update docs/configuration/index.md

Missing

Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>

* Typo

* Integrated replace with an existing test since the replace part was redundant and more importantly, the test file was very close or exceeding the 10 min default "no output" CI Travis threshold.

* Range does not work with multi-dim

Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>
2022-03-08 20:07:02 -07:00
Clint Wylie dae53ae36a
adjust topn heap operation when string is dictionary encoded, but not uniquely (#12291)
* add topn heap optimization when string is dictionary encoded, but not uniquely

* use array instead

* is same

* fix javadoc

* fix

* Update StringTopNColumnAggregatesProcessor.java
2022-03-08 14:32:40 -08:00
Gian Merlino 875e0696e0
GroupBy: Cap dictionary-building selector memory usage. (#12309)
* GroupBy: Cap dictionary-building selector memory usage.

New context parameter "maxSelectorDictionarySize" controls when the
per-segment processing code should return early and trigger a trip
to the merge buffer.

Includes:

- Vectorized and nonvectorized implementations.
- Adjustments to GroupByQueryRunnerTest to exercise this code in
  the v2SmallDictionary suite. (Both the selector dictionary and
  the merging dictionary will be small in that suite.)
- Tests for the new config parameter.

* Fix issues from tests.

* Add "pre-existing" to dictionary.

* Simplify GroupByColumnSelectorStrategy interface by removing one of the writeToKeyBuffer methods.

* Adjustments from review comments.
2022-03-08 13:13:11 -08:00
Gian Merlino 3b373114dc
Officially support Java 11. (#12232)
There aren't any changes in this patch that improve Java 11
compatibility; these changes have already been done separately. This
patch merely updates documentation and explicit Java version checks.

The log message adjustments in DruidProcessingConfig are there to make
things a little nicer when running in Java 11, where we can't measure
direct memory _directly_, and so we may auto-size processing buffers
incorrectly.
2022-03-04 14:15:45 -08:00
Clint Wylie 1c004ea47e
use virtual columns for sql simple aggregators instead of inline expressions (#12251)
* use virtual columns for sql simple aggregators instead of inline expressions

* fixes

* always use virtual columns

* add more tests
2022-03-03 15:05:28 -08:00
Tejaswini Bandlamudi 1af4c9c933
Display row stats for multiphase parallel indexing tasks (#12280)
Row stats are reported for single phase tasks in the `/liveReports` and `/rowStats` APIs
and are also a part of the overall task report. This commit adds changes to report
row stats for multiphase tasks too.

Changes:
- Add `TaskReport` in `GeneratedPartitionsReport` generated during hash and range partitioning
- Collect the reports for `index_generate` phase in `ParallelIndexSupervisorTask`
2022-03-02 10:10:31 +05:30
Xavier Léauté 1434197ee1
update airline dependency to 2.x (#12270)
* upgrade Airline to Airline 2
  https://github.com/airlift/airline is no longer maintained, updating to
  https://github.com/rvesse/airline (Airline 2) to use an actively
  maintained version, while minimizing breaking changes.

  Note, this is a backwards incompatible change, and extensions relying on
  the CliCommandCreator extension point will also need to be updated.

* fix dependency checks where jakarta.inject is now resolved first instead
  of javax.inject, due to Airline 2 using jakarta
2022-02-27 15:19:28 -08:00
Jihoon Son e5ad862665
A new includeAllDimensions flag for dimensionsSpec (#12276)
* includeAllDimensions in dimensionsSpec

* doc

* address comments

* unused import and doc spelling
2022-02-25 18:27:48 -08:00
Jason Koch eb1b53b7f8
perf: indexing: Introduce a bulk getValuesInto function to read values (#12105)
* perf: indexing: Introduce a bulk getValuesInto function to read values in bulk

If large number of values are required from DimensionDictionary
during indexing, fetch them all in a single lock/unlock instead of
lock/unlock each individual item.

* refactor: rename key to keys in function args

* fix: check explicitly that argument length on arrays match

* refactor: getValuesInto renamed to getValues, now creates and returns a new T[] rather than filling
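The locking change can be sketched as below. This is an illustration with hypothetical names, not the DimensionDictionary code: a per-item lookup takes the read lock once per id, while the bulk lookup takes it once for the whole array and returns a new array.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of an id-to-value dictionary guarded by a read/write lock.
final class BulkDictionarySketch {
    private final List<String> idToValue = new ArrayList<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Adds a value and returns its id.
    int add(String value) {
        lock.writeLock().lock();
        try {
            idToValue.add(value);
            return idToValue.size() - 1;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Per-item lookup: one lock/unlock per call.
    String getValue(int id) {
        lock.readLock().lock();
        try {
            return idToValue.get(id);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Bulk lookup: one lock/unlock for all ids; creates and returns a new array.
    String[] getValues(int[] ids) {
        String[] values = new String[ids.length];
        lock.readLock().lock();
        try {
            for (int i = 0; i < ids.length; i++) {
                values[i] = idToValue.get(ids[i]);
            }
        } finally {
            lock.readLock().unlock();
        }
        return values;
    }
}
```

For example, after adding "a", "b", and "c", a call to getValues(new int[]{2, 0}) yields the values for ids 2 and 0 with a single lock acquisition instead of two.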
2022-02-25 12:19:04 -08:00
Karan Kumar 5794331eb1
Adding new config for disabling group by on multiValue column (#12253)
As part of #12078, one of the follow-ups was to have a specific config which does not allow accidental unnesting of multi-value columns if such columns become part of the grouping key.
Added a config groupByEnableMultiValueUnnesting which can be set in the query context.

The default value of groupByEnableMultiValueUnnesting is true, therefore it does not change the current engine behavior.
If groupByEnableMultiValueUnnesting is set to false, the query will fail if it encounters a multi-value column in the grouping key.
2022-02-16 20:53:26 +05:30
somu-imply eae163a797
Moving in filter check to broker (#12195)
* Moving in filter check to broker

* Adding more unit tests, making error message meaningful

* Spelling and doc changes

* Updating default to -1 and making this feature hidden by default. The number of IN filters can grow up to a max limit of 100

* Removing upper limit of 100, updated docs

* Making documentation more meaningful

* Moving check outside to PlannerConfig, updating test cases and adding back max limit

* Updated with some additional code comments

* Missed removing one line during the checkin

* Addressing doc changes and one forbidden API correction

* Final doc change

* Adding a spelling exception, correcting a testcase

* Reading entire filter tree to address combinations of ANDs and ORs

* Specifying in docs that, this case works only for ORs

* Revert "Reading entire filter tree to address combinations of ANDs and ORs"

This reverts commit 81ca8f8496.

* Covering a class cast exception and updating docs

* Counting changed

Co-authored-by: Jihoon Son <jihoonson@apache.org>
2022-02-15 20:45:07 -08:00
Jason Koch 26bc4b7345
perf: cache row if it is a transformed row (#12113)
* perf: cache row if it is a transformed row

* perf: cache row if it is a transformed row (also cache DateTime object)
2022-02-15 10:08:41 -08:00
somu-imply 033989eb1d
Adding vectorized time_shift (#12254)
* Adding vectorized time_shift

* Vectorize time shift, addressing review comments

* Remove an unused import
2022-02-11 14:44:52 -08:00
Clint Wylie 3ee66bb492
allow optimizing sql expressions and virtual columns (#12241)
* rework sql planner expression and virtual column handling

* simplify a bit

* add back and deprecate old methods, more tests, fix multi-value string coercion bug and associated tests

* spotbugs

* fix bugs with multi-value string array expression handling

* javadocs and adjust test

* better

* fix tests
2022-02-09 14:55:50 -08:00
Clint Wylie ae71e05fc5
array_concat_agg and array_agg support for array inputs (#12226)
* array_concat_agg and array_agg support for array inputs
changes:
* added array_concat_agg to aggregate arrays into a single array
* added array_agg support for array inputs to make nested array
* added 'shouldAggregateNullInputs' and 'shouldCombineAggregateNullInputs' to fix a correctness issue with STRING_AGG and ARRAY_AGG when merging results, with dual purpose of being an optimization for aggregating

* fix test

* tie capabilities type to legacy mode flag about coercing arrays to strings

* oops

* better javadoc
2022-02-07 19:59:30 -08:00
Gian Merlino de82c611de
Harmonize implementations of "visit" for Exprs from ExprMacros. (#12230)
* Harmonize implementations of "visit" for Exprs from ExprMacros.

Many of them had bugs where they would not visit all of the original
arguments. I don't think this has user-visible consequences right now,
but it's possible it would in a future world where "visit" is used
for more stuff than it is today.

So, this patch updates all implementations to a more consistent
style that emphasizes reapplying the macro to the shuttled args.

* Test fixes, test coverage, PR review comments.
2022-02-04 08:08:54 -08:00
Clint Wylie a3affe1471
make EncodedKeyComponent constructor public, remove nullable from DimensionIndexer.processRowValsToUnsortedEncodedKeyComponent (#12229) 2022-02-03 15:02:32 -08:00
Kashif Faraz e648b01afb
Improve memory estimates in Aggregator and DimensionIndexer (#12073)
Fixes #12022  

### Description
The current implementations of memory estimation in `OnHeapIncrementalIndex` and `StringDimensionIndexer` tend to over-estimate which leads to more persistence cycles than necessary.

This PR replaces the max estimation mechanism with getting the incremental memory used by the aggregator or indexer at each invocation of `aggregate` or `encode` respectively.

### Changes
- Add new flag `useMaxMemoryEstimates` in the task context. This overrides the same flag in DefaultTaskConfig i.e. `druid.indexer.task.default.context` map
- Add method `AggregatorFactory.factorizeWithSize()` that returns an `AggregatorAndSize` which contains
  the aggregator instance and the estimated initial size of the aggregator
- Add method `Aggregator.aggregateWithSize()` which returns the incremental memory used by this aggregation step
- Update the method `DimensionIndexer.processRowValsToKeyComponent()` to return the encoded key component as well as its effective size in bytes
- Update `OnHeapIncrementalIndex` to use the new estimations only if `useMaxMemoryEstimates = false`
2022-02-03 10:34:02 +05:30
Clint Wylie f9b406c8f2
add backwards compatibility mode for multi-value string array null value coercion (#12210) 2022-01-31 22:38:15 -08:00
Jihoon Son eeed156dc0
Fix compile error in VirtualizedColumnSelectorFactoryTest (#12208) 2022-01-27 17:35:50 -08:00
Gian Merlino 99a5c2f3d3
Harmonize behavior when virtual columns reference each other. (#11955)
* VirtualizedColumnSelectorFactory: Allow virtual columns to reference each other.

This matches the behavior of QueryableIndex and IncrementalIndex based cursors.

* Fixes to getColumnCapabilities.
2022-01-27 14:31:48 -08:00
Karan Kumar 96b3498a40
Grouping on arrays as arrays (#12078)
* init multiValue column group by

* Changing sorting to Lexicographic as default

* Adding initial tests

* 1.Fixing test cases adding
2.Optimized inmem structs

* Linking SQL layer to native layer

* Adding multiDimension support to group by column strategy

* 1. Removing array coercion in Calcite layer
2. Removing ResultRowDeserializer

* 1. Supporting all primitive array types
2. Removing dimension spec as part of columnSelector

* 1. Supporting all primitive array types
2. Removing dimension spec as part of columnSelector

* 1. Checkstyle things
2. Removing flag

* Minor naming things

* CheckStyle Things

* Fixing test case

* Fixing hashing

* 1. Adding the MV function
2. Added few test cases

* 1. Adding MV function test cases

* Adding Selector strategy function test cases

* Fixing ClientQuerySegmentWalkerTest

* Adding GroupByQueryRunnerTest test cases

* Fixing test cases

* Adding few more test cases

* Fixing Exception assert statement and intellij inspection

* Adding null compatibility tests

* Review comments

* Fixing few failing tests

* Fixing few failing tests

* Do not convert to topN query in case of group by on array

* Fixing checkstyle

* Fixing differences between jdk's class cast exception message

* 1. Fixing ordering if the grouping key is an array

* Fixing DefaultLimitSpec

* Fixing CalciteArraysQueryTest

* Dummy commit for LGTM

* changes:
* only coerce multi-value string null values when `ExpressionPlan.Trait.NEEDS_APPLIED` is set
* correct return type inference for ARRAY_APPEND,ARRAY_PREPEND,ARRAY_SLICE,ARRAY_CONCAT
* fix bug with ExprEval.ofType when actual type of object from binding doesn't match its claimed type

* Review comments

* Fixing test cases

* Fixing spot bugs

* Fixing strict compile

Co-authored-by: Clint Wylie <cwylie@apache.org>
2022-01-25 20:30:56 -08:00
Clint Wylie fce62b2643
fix StringAnyAggregatorFactory to use single value selector for non-existent columns (#12194) 2022-01-25 12:52:30 -08:00
somu-imply cc8b9c0b6e
Handling OOM error in ExpressionVector setup by reducing number of rows (#12186)
* Handling OOM error in ExpressionVector setup by reducing number of rows

* Removing row size to 10K in sanity tests
2022-01-24 08:37:13 -08:00
Clint Wylie e0c4c568cb
fix incorrect ColumnInspector in IncrementalIndex.makeColumnSelectorFactory (#12155) 2022-01-13 18:09:06 -08:00
Clint Wylie f2ce76966c
add EARLIEST_BY/LATEST_BY to make EARLIEST/LATEST function signatures less ambiguous (#12145)
* add EARLIEST_BY/LATEST_BY to make EARLIEST/LATEST function signatures unambiguous

* switcheroo

* EARLIEST_BY/LATEST_BY use timestamp instead of numeric types, update docs

* revert unintended change

* fix docs

* fix docs better
2022-01-12 03:48:53 -08:00
Rohan Garg 81f0aba6cb
Use ListFilteredVirtualColumn for left/fact table expression in join condition (#12127)
* Pass VirtualColumnRegistry in PlannerContext for join expression planning

* Allow for including VCs from join fact table expression

* Optimize MV_FILTER functions to use a VC when in join fact table expression

* fixup! Allow for including VCs from join fact table expression

* Address review comments
2022-01-11 14:47:13 -08:00
imply-cheddar eb0bae49ec
Update PostAggregator to be backwards compat (#12138)
This change mimics what was done in PR #11917 to
fix the incompatibilities produced by #11713. #11917
fixed it with AggregatorFactory by creating default
methods to allow for extensions built against old
jars to still work.  This does the same for PostAggregator
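As a generic illustration of the default-method technique mentioned above (the interface and method names here are hypothetical and are not the real PostAggregator signatures): a method added to an interface with a default body does not break implementations that were written before it existed.

```java
// Hypothetical interface; not Druid's PostAggregator.
interface PostAggregatorSketch {
    String getName();

    // Added in a later version. The default body lets implementations compiled
    // against the older interface keep working without declaring this method.
    default String getOutputType() {
        return "UNKNOWN";
    }
}

// Stands in for an extension written against the old interface.
final class LegacyPostAggregator implements PostAggregatorSketch {
    @Override
    public String getName() {
        return "legacy";
    }
}

// Stands in for an implementation updated to the new interface.
final class UpdatedPostAggregator implements PostAggregatorSketch {
    @Override
    public String getName() {
        return "updated";
    }

    @Override
    public String getOutputType() {
        return "DOUBLE";
    }
}
```

Callers can invoke the new method on either implementation; the legacy one falls back to the default.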
2022-01-11 02:18:14 -08:00
Clint Wylie 7cf9192765
fix delegated smoosh writer and some new facilities for segment writeout medium (#12132)
* fix delegated smoosh writer and some new facilities for segment writeout medium
changes:
* fixed issue with delegated `SmooshedWriter` when writing files that look like paths, causing `NoSuchFileException` exceptions when attempting to open a channel to the file
* `FileSmoosher.addWithSmooshedWriter` when _not_ delegating now checks that it is still open when closing, making it a no-op if already closed (allowing column serializers to add additional files and avoid delegated mode if they are finished writing out their own content and need to add additional files)
* add `makeChildWriteOutMedium` to `SegmentWriteOutMedium` interface, which allows users of a shared medium to clean up `WriteOutBytes` if they fully control the lifecycle. there are no callers of this yet, adding for future functionality
* `OnHeapByteBufferWriteOutBytes` can now be marked as not open, so `OnHeapMemorySegmentWriteOutMedium` can behave identically to other medium implementations

* fix to address nit - use AtomicLong
2022-01-10 22:25:19 -08:00
Clint Wylie e583033231
add 'TypeStrategy' to types (#11888)
* add TypeStrategy - value comparators and binary serialization for any TypeSignature
2022-01-10 17:12:14 -08:00
somu-imply c267b65f97
Removing unused processing threadpool on broker (#12070)
* Thread pool for broker

* Updating two tests to improve coverage for new method added

* Updating druidProcessingConfigTest to cover coverage

* Adding missed spelling errors caused in doc

* Adding test to cover lines of new function added
2021-12-21 13:07:53 -08:00
Abhishek Agarwal 5d043cefbc
Fix test in ResponseContextTest (#12077) 2021-12-16 22:51:51 -08:00
Clint Wylie 244c2559e9
fix IncrementalIndex performance regression (#12048)
changes:
* IncrementalIndex is now a ColumnInspector
* fixes performance regression from using map of ColumnCapabilities from IncrementalIndex as a RowSignature
2021-12-09 22:04:32 -08:00
Jonathan Wei 229f82a6f0
Add parse error list API for stream supervisors, use structured object for parse exceptions, simplify parse exception message (#11961)
* Add parse error list API for stream supervisors, simplify parse exception message

* Add input string to parse exception

* Use structured ParseExceptionReport

* Fix tests

* Add test

* PR comments, add ParseExceptionReport equals verifier

* Fix test
2021-12-09 15:42:55 -06:00
Laksh Singla ca260dfef6
Intern RowSignature in DruidSchema to reduce its memory footprint (#12001)
DruidSchema consists of a concurrent HashMap of DataSource -> Segment -> AvailableSegmentMetadata. AvailableSegmentMetadata contains the RowSignature of the segment, and a new object is created for each segment. RowSignature is an immutable class, so it can be interned. This can save a large amount of memory in the Broker, since many segments of a table are likely to share the same RowSignature.
2021-12-08 15:11:13 +05:30
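The interning idea in #12001 can be sketched as follows. This is only an illustration under assumptions: `Signature` and `Interner` are hypothetical classes written for this example, and the actual change may well use a library interner instead (Guava, for instance, provides `Interners.newWeakInterner()`).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class SignatureInterning {
  /** Hypothetical immutable signature with value-based equality (not Druid's RowSignature). */
  static final class Signature {
    private final List<String> columns;

    Signature(List<String> columns) {
      this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof Signature && columns.equals(((Signature) o).columns);
    }

    @Override
    public int hashCode() {
      return columns.hashCode();
    }
  }

  /** Minimal strong interner: hands back one canonical instance per distinct value. */
  static final class Interner<T> {
    private final ConcurrentMap<T, T> pool = new ConcurrentHashMap<>();

    T intern(T value) {
      T existing = pool.putIfAbsent(value, value);
      return existing != null ? existing : value;
    }

    int size() {
      return pool.size();
    }
  }

  public static void main(String[] args) {
    Interner<Signature> interner = new Interner<>();
    // Two segments of the same table report equal but distinct signature objects.
    Signature a = interner.intern(new Signature(List.of("__time", "page", "added")));
    Signature b = interner.intern(new Signature(List.of("__time", "page", "added")));
    System.out.println(a == b);          // prints "true": both segments share one instance
    System.out.println(interner.size()); // prints "1"
  }
}
```

Interning is only safe here because the value is immutable. A strong pool like this one never releases entries, which is why a weak interner is often preferred for long-lived processes.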
Clint Wylie 45be2be368
fix issues with multi-value string constant expressions (#12025)
* add specialized constant selector for multi-valued string constants
2021-12-08 00:10:26 -08:00
Clint Wylie a8815f671e
Fix druid client timeout zero (#12023)
* fix bug where queries fail immediately when timeout is 0 instead of using default timeout

* fix to use serverside max

* more better

* less flaky test

* oops
2021-12-07 12:41:01 -08:00
Paul Rogers 34a3d45737
Refactor ResponseContext (#11828)
* Refactor ResponseContext

Fixes a number of issues in preparation for request trailers
and the query profile.

* Converts keys from an enum to classes for smaller code
* Wraps stored values in functions for easier capture for other uses
* Reworks the "header squeezer" to handle types other than arrays.
* Uses metadata for visibility, and ability to compress,
  to replace ad-hoc code.
* Cleans up JSON serialization for the response context.
* Other miscellaneous cleanup.

* Handle unknown keys in deserialization

Also, make "Visibility" into a boolean.

* Revised comment

* Renamed variable
2021-12-06 17:03:12 -08:00
Clint Wylie 84b4bf56d8
vectorize logical operators and boolean functions (#11184)
changes:
* adds new config, druid.expressions.useStrictBooleans, which makes longs the official boolean type of all expressions
* vectorize logical operators and boolean functions, some only if useStrictBooleans is true
2021-12-02 16:40:23 -08:00
Paul Rogers a66f10eea1
Code cleanup from query profile project (#11822)
* Code cleanup from query profile project

* Fix spelling errors
* Fix Javadoc formatting
* Abstract out repeated test code
* Reuse constants in place of some string literals
* Fix up some parameterized types
* Reduce warnings reported by Eclipse

* Reverted change due to lack of tests
2021-11-30 11:35:38 -08:00
Gian Merlino f6e6ca2893
Use intermediate-persist IndexSpec during multiphase merge. (#11940)
* Use intermediate-persist IndexSpec during multiphase merge.

The main change is the addition of an intermediate-persist IndexSpec
to the main "merge" method in IndexMerger. There are also a few minor
adjustments to the IndexMerger interface to encourage more harmonious
usage of its methods in the future.

* Additional changes inspired by the test coverage checker.

- Remove unused-in-production IndexMerger methods "append" and "convert".
- Add additional unit tests to UnifiedIndexerAppenderatorsManager.

* Additional adjustments.

* Even more additional adjustments.

* Test fixes.
2021-11-29 15:08:49 -08:00
Gian Merlino 93aeaf4801
Improve on-heap aggregator footprint estimates. (#11950)
Add a "guessAggregatorHeapFootprint" method to AggregatorFactory that
mitigates #6743 by enabling heap footprint estimates based on a specific
number of rows. The idea is that at ingestion time, the number of rows
that go into an aggregator will be 1 (if rollup is off) or will likely
be a small number (if rollup is on).

It's a heuristic, because of course nothing guarantees that the rollup
ratio is a small number. But it's a common case, and I expect this logic
to go wrong much less often than the current logic. Also, when it does
go wrong, users can fix it by lowering maxRowsInMemory or
maxBytesInMemory. The current situation is unintuitive: when the
estimation goes wrong, users get an OOME, but actually they need to
*raise* these limits to fix it.
2021-11-28 13:21:24 +05:30
Rohan Garg 2c08055962
Specify time column for first/last aggregators (#11949)
Add the ability to pass time column in first/last aggregator (and latest/earliest SQL functions). It is to support cases where the time to query upon is stored as a part of a column different than __time. Also, some other logical time column can be specified.
2021-11-25 09:44:14 +05:30
Gian Merlino 12e2228510
RowBasedGrouperHelper: Set hasMultipleValues = false in capabilities. (#11954)
Useful because it enables anything that consumes groupBy results to
potentially operate more efficiently.
2021-11-24 13:14:58 -08:00
Gian Merlino 5e168b861a
StorageAdapter: Add getRowSignature method. (#11953)
Simplifies logic for callers that only want to get a list of all the
column names, or column names and types. Updated callers SegmentAnalyzer,
HashJoinSegmentStorageAdapter, and DruidSegmentReader.
2021-11-24 13:14:25 -08:00
Gian Merlino 0354407655
SQL INSERT planner support. (#11959)
* SQL INSERT planner support.

The main changes are:

1) DruidPlanner is able to validate and authorize INSERT queries. They
   require WRITE permission on the target datasource.

2) QueryMaker is now an interface, and there is a QueryMakerFactory that
   creates instances of it. There is only one production implementation
   of each (NativeQueryMaker and NativeQueryMakerFactory), which
   together behave the same way as the former QueryMaker class. But this
   opens the door to executing queries in ways other than the Druid
   query stack, and is used by unit tests (CalciteInsertDmlTest) to
   test the INSERT planning functionality.

3) Adds an EXTERN table macro that allows referencing external data using
   InputSource and InputFormat from Druid's batch ingestion API. This is
   not exposed in production yet, but is used by unit tests.

4) Adds a QueryFeature concept that enables the planner to change its
   behavior slightly depending on the capabilities of the execution
   system.

5) Adds an "AuthorizableOperator" concept that enables SqlOperators
   to require additional permissions. This is used by the EXTERN table
   macro.

Related odds and ends:

- Add equals, hashCode, toString methods to InlineInputSource. Aids in
  the "from external" tests in CalciteInsertDmlTest.
- Add JSON-serializability to RowSignature.
- Move the SQL string inside PlannerContext so it is "baked into" the
  planner when the planner is created. Cleans up the code a bit, since
  in practice, the same query is passed in every time to the
  same planner anyway.

* Fix up calls to CalciteTests.createMockQueryLifecycleFactory.

* Fix checkstyle issues.

* Adjustments for CI.

* Adjust DruidAvaticaHandlerTest for stricter test authorizations.
2021-11-24 12:14:04 -08:00
Gian Merlino 35b610ada7
QueryableIndexColumnSelectorFactory: Double-check cached column class. (#11957)
Important because an earlier call to getCachedColumn may have been
done with a different class, leading to a ClassCastException on the
second call. In the prior code, this could happen if a complex column
had makeDimensionSelector called on it after makeColumnValueSelector had
already been called.
2021-11-22 11:31:24 -08:00
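The failure mode described in #11957 (a cache populated under one requested class and later read under another) can be sketched roughly as below. `TypedColumnCache` and its backing map are hypothetical; this is not the actual `QueryableIndexColumnSelectorFactory` code, only an illustration of checking the cached object's class before casting.

```java
import java.util.HashMap;
import java.util.Map;

public class TypedColumnCache {
  private final Map<String, Object> source;               // hypothetical column storage
  private final Map<String, Object> cache = new HashMap<>();

  TypedColumnCache(Map<String, Object> source) {
    this.source = source;
  }

  /**
   * Returns the cached column if it is an instance of the requested class,
   * or null otherwise. The isInstance check guards against an earlier call
   * having cached the same column while asking for a different class.
   */
  <T> T getCachedColumn(String name, Class<T> clazz) {
    Object column = cache.computeIfAbsent(name, source::get);
    return clazz.isInstance(column) ? clazz.cast(column) : null;
  }

  public static void main(String[] args) {
    Map<String, Object> columns = new HashMap<>();
    columns.put("metric", Long.valueOf(42)); // a Long stands in for a complex column

    TypedColumnCache cache = new TypedColumnCache(columns);
    System.out.println(cache.getCachedColumn("metric", Long.class));   // prints "42"
    // Second call asks for a different class: null instead of a ClassCastException.
    System.out.println(cache.getCachedColumn("metric", String.class)); // prints "null"
  }
}
```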
Gian Merlino d6507c9428
PrioritizedExecutorService: Properly wrap on direct calls to "execute". (#11956)
Usually, "execute" is called by methods defined in the superclass
AbstractExecutorService, and the passed-in Runnable has been wrapped
by newTaskFor inside a PrioritizedListenableFutureTask. But this method
can also be called directly, and if so, the same wrapping is necessary
for the delegate to get a Runnable that can be entered into a priority
queue with the others.
2021-11-22 10:30:12 -08:00
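The wrapping issue fixed in #11956 can be reproduced with a small sketch built on the JDK's `ThreadPoolExecutor` and `PriorityBlockingQueue`. The class names, the single-thread pool, and the default priority of 0 are assumptions made for this example, not Druid's `PrioritizedExecutorService`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.FutureTask;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PriorityWrappingExecutor extends ThreadPoolExecutor {
  /** Hypothetical comparable task; higher priority values run first. */
  static final class PrioritizedTask extends FutureTask<Void> implements Comparable<PrioritizedTask> {
    final int priority;

    PrioritizedTask(Runnable delegate, int priority) {
      super(delegate, null);
      this.priority = priority;
    }

    @Override
    public int compareTo(PrioritizedTask other) {
      return Integer.compare(other.priority, priority);
    }
  }

  PriorityWrappingExecutor() {
    super(1, 1, 0L, TimeUnit.MILLISECONDS, new PriorityBlockingQueue<>());
  }

  void execute(Runnable command, int priority) {
    super.execute(new PrioritizedTask(command, priority));
  }

  @Override
  public void execute(Runnable command) {
    // Direct callers may hand in a plain Runnable. A PriorityBlockingQueue rejects
    // non-comparable elements with ClassCastException, so wrap before delegating.
    super.execute(command instanceof PrioritizedTask ? command : new PrioritizedTask(command, 0));
  }

  private static void awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /** Runs four tasks on one worker and returns the order in which they executed. */
  static List<String> demo() {
    List<String> order = Collections.synchronizedList(new ArrayList<>());
    CountDownLatch gate = new CountDownLatch(1);
    PriorityWrappingExecutor exec = new PriorityWrappingExecutor();
    try {
      // The first task occupies the only worker until the others are queued.
      exec.execute(() -> { awaitQuietly(gate); order.add("gate"); });
      exec.execute(() -> order.add("low"), 1);
      exec.execute(() -> order.add("plain")); // direct call with a plain Runnable
      exec.execute(() -> order.add("high"), 5);
      gate.countDown();
      exec.shutdown();
      exec.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return order;
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints "[gate, high, low, plain]"
  }
}
```

Tasks submitted through `submit` also end up in `execute`, so the same wrapping covers them in this sketch.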
Clint Wylie f260bbed23
restore and deprecate AggregatorFactory methods (#11917)
* add back and deprecate aggregator factory methods so i can say i told you so when i delete these later

* rename to make less ambiguous, fix fill method

* adjust
2021-11-19 15:59:35 -08:00
Gian Merlino 36ee0367ff
Scan: Add "orderBy" parameter. (#11930)
* Scan: Add "orderBy" parameter.

This patch adds an API for requesting non-time orderings, although it
does not actually add the ability to execute such queries.

The changes are done in such a way that no matter how Scan query objects
are constructed, they will have a correct "getOrderBy". This will enable
us to switch the execution to exclusively use "getOrderBy" later on when
it's implemented.

Scan queries are serialized such that they only include "order" (time
order) if the ordering is time-based, and they only include "orderBy" if
the ordering is non-time-based. This maximizes compatibility with
the existing API while also providing a clean look for formatted queries.

Because this patch does not include execution logic, if someone actually
tries to run a query with non-time ordering, then they will get an error
like "Cannot execute query with orderBy [quality ASC]".

* SQL module fixes.

* Add spotbugs-exclude.

* Remove unused method.
2021-11-19 08:19:12 -08:00
Clint Wylie 7f0bede878
autocompaction support for complex dimensions (#11924)
* autocompaction support for complex dimensions

* more test
2021-11-16 15:57:44 -08:00
Clint Wylie 00c976a3fe
only get bitmap index for string dictionary encoded columns (#11925) 2021-11-16 15:50:02 -08:00
Kashif Faraz 223c5692a8
Add dimension partitioningType to metrics to track usage of different partitioning schemes (#11902)
Add method ShardSpec.getType() to get name of shard spec type
List all names of shard spec types in the interface ShardSpec itself
for easy reference and maintenance
Add dimension partitioningType to metric segment/added/bytes
2021-11-11 18:34:27 +05:30
Gian Merlino fe2f7742f7
Fix incorrect comparison in RowSignature. (#11905)
PR #11882 introduced a type comparison using ==, but while it was in flight,
another PR #11713 changed the type enum to a class. So the comparison should
properly be done with "equals".
2021-11-11 04:30:42 -08:00
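The bug class behind #11905 is easy to reproduce in isolation: `==` is a safe comparison for enum constants, but once the type becomes an ordinary class, two equal instances are no longer guaranteed to be the same object. The `ColumnType` class below is a hypothetical stand-in, not Druid's type class.

```java
import java.util.Objects;

public class TypeEquality {
  /** Hypothetical value class standing in for a type that used to be an enum. */
  static final class ColumnType {
    private final String name;

    ColumnType(String name) {
      this.name = name;
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof ColumnType && name.equals(((ColumnType) o).name);
    }

    @Override
    public int hashCode() {
      return name.hashCode();
    }
  }

  // Reference comparison: correct for enum constants, wrong for value classes.
  static boolean sameByReference(ColumnType a, ColumnType b) {
    return a == b;
  }

  // Value comparison: null-safe and correct for both.
  static boolean sameByValue(ColumnType a, ColumnType b) {
    return Objects.equals(a, b);
  }

  public static void main(String[] args) {
    ColumnType left = new ColumnType("LONG");
    ColumnType right = new ColumnType("LONG");
    System.out.println(sameByReference(left, right)); // prints "false"
    System.out.println(sameByValue(left, right));     // prints "true"
  }
}
```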
Laksh Singla 57ed5127a7
Make subquery IDs more comprehensive (#11809)
There are 3 types of query IDs - id, subQueryId, sqlQueryId. Currently, whenever a query generates subqueries, the subquery's subQueryId is populated randomly. Also, subquery's Id is not set to the parent query Id. Therefore there is no way of linking the subqueries to the parent query, and one loses the ability to look at end to end view of the query.

This PR aims to implement following couple of things:

Populate the subqueries with its parent's id (and sqlQueryId if present)
Populate the subqueryId such that it forms a hierarchical relationship among themselves. For example, if there is a query which launches a subquery, which in turn launches a couple of subqueries, then the ids and subQueryIds should have the following structure.
2021-11-11 16:31:56 +05:30
Clint Wylie 5baa22148e
revert ColumnAnalysis type, add typeSignature and use it for DruidSchema (#11895)
* revert ColumnAnalysis type, add typeSignature and use it for DruidSchema

* review stuffs

* maybe null

* better maybe null

* Update docs/querying/segmentmetadataquery.md

* Update docs/querying/segmentmetadataquery.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* fix null right

* sad

* oops

* Update batch_hadoop_queries.json

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2021-11-10 18:46:29 -08:00
Gian Merlino 14b0b4aee2
RowBasedSegment: Use Sequence instead of Iterable. (#11886)
* RowBasedSegment: Use Sequence instead of Iterable.

The main reason this is good is that Sequences can include baggage that
must be closed after iteration is finished. This enables creating
RowBasedSegments on top of closeable sequences of rows.

To preserve the optimization that allows reversing a List without
copying it, this patch also makes SimpleSequence its own class and allows
extracting the Iterable that was used to create it.

* Fix tests.
2021-11-10 06:06:52 -08:00
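The motivation given in #11886 (a sequence can carry baggage that must be closed once iteration finishes) can be sketched as below. `BaggageSequence` is a hypothetical class for illustration only and is much simpler than Druid's `Sequence` abstraction.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

public class BaggageSequence<T> {
  private final Iterable<T> rows;
  private final Closeable baggage;

  BaggageSequence(Iterable<T> rows, Closeable baggage) {
    this.rows = rows;
    this.baggage = baggage;
  }

  /** Feeds every row to the consumer, then closes the baggage even if iteration fails. */
  void forEach(Consumer<? super T> consumer) {
    try {
      for (T row : rows) {
        consumer.accept(row);
      }
    } finally {
      try {
        baggage.close();
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
  }

  public static void main(String[] args) {
    AtomicBoolean closed = new AtomicBoolean(false);
    List<String> seen = new ArrayList<>();
    // The lambda stands in for a resource such as an open file behind the rows.
    new BaggageSequence<>(List.of("a", "b"), () -> closed.set(true)).forEach(seen::add);
    System.out.println(seen);         // prints "[a, b]"
    System.out.println(closed.get()); // prints "true"
  }
}
```

A plain `Iterable` has no such hook: the caller would have to know about the underlying resource and close it separately.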
Gian Merlino db4d157be6
Add Finalization option to RowSignature.addAggregators. (#11882)
* Add Finalization option to RowSignature.addAggregators.

This makes type signatures more useful when the caller knows whether it will
be reading aggregation results in their finalized or intermediate types.

* Fix call site.
2021-11-10 06:05:29 -08:00
Clint Wylie a8805ab60d
add missing json type for ListFilteredVirtualColumn (#11887)
* add missing json type for ListFilteredVirtualColumn, and tests to try to avoid this happening again

* fixes

* ugly, but maybe this

* oops

* too many mappers
2021-11-09 17:25:12 -08:00
Gian Merlino 6c196a5ea2
Remove StorageAdapter.getColumnTypeName. (#11893)
* Remove StorageAdapter.getColumnTypeName.

It was only used by SegmentAnalyzer, and isn't necessary anymore due to
the recent improvements to ColumnCapabilities.

Also: tidy ColumnDescriptor.read slightly by removing an instanceof
check, and moving the relevant logic into ComplexColumnPartSerde.

* Fix spellings.
2021-11-09 15:18:07 -08:00
Gian Merlino 324d4374f6
HashJoinEngine: Fix extraneous advance of left cursor. (#11890)
This could happen for right or full outer joins in certain cases. Tests
weren't catching this because existing Cursor implementations generally
ignore extraneous calls to "advance". So, to help catch this in tests,
extra state validations are also added to RowWalker, which is used by
RowBasedSegment.
2021-11-09 11:34:11 -08:00
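The kind of state validation mentioned in #11890 can be sketched as a walker that refuses to advance past the end instead of silently ignoring the call. `StrictRowWalker` is a hypothetical class written for this example, not Druid's `RowWalker`.

```java
import java.util.Iterator;
import java.util.List;

public class StrictRowWalker<T> {
  private final Iterator<T> iterator;
  private T current;
  private boolean done;

  StrictRowWalker(Iterable<T> rows) {
    this.iterator = rows.iterator();
    step();
  }

  private void step() {
    if (iterator.hasNext()) {
      current = iterator.next();
    } else {
      current = null;
      done = true;
    }
  }

  boolean isDone() {
    return done;
  }

  T currentRow() {
    if (done) {
      throw new IllegalStateException("no current row: walker is done");
    }
    return current;
  }

  /** Moves to the next row; an extraneous call after the end fails loudly. */
  void advance() {
    if (done) {
      throw new IllegalStateException("advance() called after the walker was done");
    }
    step();
  }

  public static void main(String[] args) {
    StrictRowWalker<String> walker = new StrictRowWalker<>(List.of("a"));
    System.out.println(walker.currentRow()); // prints "a"
    walker.advance();                        // moves past the last row
    System.out.println(walker.isDone());     // prints "true"
    try {
      walker.advance();                      // extraneous advance
    } catch (IllegalStateException e) {
      System.out.println("caught: " + e.getMessage());
    }
  }
}
```

A lenient walker would hide the extra `advance()` call, which is how such a bug can pass existing tests unnoticed.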
Gian Merlino babf00f8e3
Migrate File.mkdirs to FileUtils.mkdirp. (#11879)
* Migrate File.mkdirs to FileUtils.mkdirp.

* Remove unused imports.

* Fix LookupReferencesManager.

* Simplify.

* Also migrate usages of forceMkdir.

* Fix var name.

* Fix incorrect call.

* Update test.
2021-11-09 11:10:49 -08:00
Gian Merlino 945a341acd
RowBasedCursor: Add column-value-reuse optimization. (#11884)
* RowBasedCursor: Add column-value-reuse optimization.

Most of the logic is in RowBasedColumnSelectorFactory, although in this
patch its only user is RowBasedCursor. This improves performance of
features that use RowBasedSegment, like lookup and inline datasources.
It's especially helpful for inline datasources that contain lengthy
arrays, due to the fact that the transformed array can be reused.

* Changes from code review.

* Fixes for ColumnCapabilitiesImplTest.
2021-11-09 07:18:09 -08:00
Gian Merlino a5bd0b8cc0
RowAdapter: Add a default implementation for timestampFunction. (#11885)
Enables simpler implementations for adapters that want to treat the
timestamp as "just another column".
2021-11-08 10:25:13 -08:00