2838 Commits

Author SHA1 Message Date
Laksh Singla c1c7dff2ad
Using DruidExceptions in MSQ (changes related to the Broker) (#14534)
MSQ engine returns correct error codes for invalid user inputs in the query context. Also, using DruidExceptions for MSQ related errors happening in the Broker with improved error messages.
2023-07-13 19:08:49 +00:00
Abhishek Radhakrishnan f4ee58eaa8
Add `aggregatorMergeStrategy` property in SegmentMetadata queries (#14560)
* Add aggregatorMergeStrategy property to SegmentMetadataQuery.

- Adds a new property aggregatorMergeStrategy to segmentMetadata query.
aggregatorMergeStrategy currently supports three types of merge strategies -
the legacy strict and lenient strategies, and the new latest strategy.
- The latest strategy considers the latest aggregator from the latest segment
by time order when there's a conflict when merging aggregators from different
segments.
- Deprecate the lenientAggregatorMerge property. The API validates that the new
and old properties are not both set, and returns an exception if they are.
- When merging segments as part of segmentMetadata query, the segments have a more
elaborate id -- <datasource>_<interval>_merged_<partition_number> format, similar to
the name format that segments usually contain. Previously it was simply "merged".
- Adjust unit tests to test the latest strategy, to assert the returned complete
SegmentAnalysis object instead of just the aggregators for completeness.

* Don't explicitly set strict strategy in tests

* Apply suggestions from code review

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/querying/segmentmetadataquery.md

* Apply suggestions from code review

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

---------

Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>
2023-07-13 12:37:36 -04:00
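For the `aggregatorMergeStrategy` change above (#14560), a minimal sketch of what a query body using the new property might look like. The three strategy names and the both-properties-set rule come from the commit message; the helper name, the `wikipedia` datasource, and the exact error wording are illustrative assumptions, not Druid's own validation code.

```python
import json

# Strategy names as listed in the commit message.
VALID_STRATEGIES = {"strict", "lenient", "latest"}


def build_segment_metadata_query(data_source, strategy=None, lenient_aggregator_merge=None):
    """Build a segmentMetadata query body (hypothetical helper for illustration)."""
    # The commit says the API rejects queries that set both the new and the old property.
    if strategy is not None and lenient_aggregator_merge is not None:
        raise ValueError("Set aggregatorMergeStrategy or lenientAggregatorMerge, not both")
    if strategy is not None and strategy not in VALID_STRATEGIES:
        raise ValueError(f"Unknown aggregatorMergeStrategy: {strategy}")

    query = {
        "queryType": "segmentMetadata",
        "dataSource": data_source,
        "merge": True,
        "analysisTypes": ["aggregators"],
    }
    if strategy is not None:
        query["aggregatorMergeStrategy"] = strategy
    if lenient_aggregator_merge is not None:
        query["lenientAggregatorMerge"] = lenient_aggregator_merge
    return query


print(json.dumps(build_segment_metadata_query("wikipedia", strategy="latest"), indent=2))
```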
Sam Rash 0dcb19f7e3
Add Continuous Profiling to Unit Tests (#14506)
Uses a custom continuous JFR profiler.

Modifies the github actions for tests to do profiling only in the case
of jdk17, as the profiler requires jdk17+ to use the JFR streaming API
plus a few other language features in the code.

Continuous Profiling service is provided to the Apache Druid project
free of charge by Imply and any committer can request free access to
the UI.
2023-07-12 17:50:38 -07:00
imply-cheddar 65e1b27aa7
Fix a resource leak with Window processing (#14573)
* Fix a resource leak with Window processing

Additionally, in order to find the leak, there were
adjustments to the StupidPool to track leaks a bit better.
It would appear that the pool objects get GC'd during testing
for some reason which was causing some incorrect identification
of leaks from objects that had been returned but were GC'd along
with the pool.

* Suppress unused warning
2023-07-12 17:25:42 -05:00
Gian Merlino 3ff51487b7
Add ZooKeeper connection state alerts and metrics. (#14333)
* Add ZooKeeper connection state alerts and metrics.

- New metric "zk/connected" is an indicator showing 1 when connected,
  0 when disconnected.
- New metric "zk/disconnected/time" measures time spent disconnected.
- New alert when Curator connection state enters LOST or SUSPENDED.

* Use right GuardedBy.

* Test fixes, coverage.

* Adjustment.

* Fix tests.

* Fix ITs.

* Improved injection.

* Adjust metric name, add tests.
2023-07-12 09:34:28 -07:00
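For the ZooKeeper change above (#14333), a small sketch of how a connection state could map to the `zk/connected` indicator and to the alert condition. The state names follow Curator's `ConnectionState` enum; treating `READ_ONLY` as connected is an assumption here, and the function names are hypothetical.

```python
# Assumption: states Curator reports as connected count as 1 for the indicator.
CONNECTED_STATES = {"CONNECTED", "RECONNECTED", "READ_ONLY"}
# Per the commit message, entering LOST or SUSPENDED triggers an alert.
ALERT_STATES = {"LOST", "SUSPENDED"}


def zk_connected_metric(state: str) -> int:
    """Return 1 when the state counts as connected, 0 otherwise."""
    return 1 if state in CONNECTED_STATES else 0


def should_alert(state: str) -> bool:
    """Return True when the state should raise an alert."""
    return state in ALERT_STATES


for state in ("CONNECTED", "SUSPENDED", "LOST", "RECONNECTED"):
    print(state, zk_connected_metric(state), should_alert(state))
```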
Gian Merlino 3711c0d987
Reduce heap footprint of GenericIndexed. (#14563)
Two changes:

1) Intern DecompressingByteBufferObjectStrategy. Saves ~32 bytes per column.

2) Split GenericIndexed into GenericIndexed.V1 and GenericIndexed.V2. The
   major benefit here is isolating out the ByteBuffers that are only needed
   for V2. This saves ~80 bytes for V1 (one buffer instead of two).
2023-07-12 08:11:41 -07:00
Gian Merlino cc8b210e4c
AggregatorFactory: Use guessAggregatorHeapFootprint when factorizeWithSize is not implemented. (#14567)
There are two ways of estimating heap footprint of an Aggregator:

1) AggregatorFactory#guessAggregatorHeapFootprint
2) AggregatorFactory#factorizeWithSize + Aggregator#aggregateWithSize

When the second path is used, the default implementation of factorizeWithSize
is now updated to delegate to guessAggregatorHeapFootprint, making these equivalent.
The old logic used getMaxIntermediateSize, which is less accurate.

Also fixes a bug where, when using the second path, calling factorizeWithSize
on PassthroughAggregatorFactory would fail because getMaxIntermediateSize was
not implemented. (There is no buffer aggregator, so there would be no need.)
2023-07-12 07:33:27 -07:00
hqx871 7142b0c39e
Enable result level cache for GroupByStrategyV2 on broker (#11595)
The cache has been disabled for GroupByStrategyV2 on the Broker since PR #3820 [groupBy v2: Results not fully merged when caching is enabled on the broker]. However, we can enable the result-level cache on the Broker for GroupByStrategyV2 and keep the segment-level cache disabled.
2023-07-12 15:00:01 +05:30
Kashif Faraz 58a35bf07e
Deprecate EntryExistsException in Druid 27 and remove in Druid 28 (#14554)
Also deprecate UnknownSegmentIdsException.
2023-07-08 15:40:14 +05:30
Gian Merlino 63ee69b4e8
Claim full support for Java 17. (#14384)
* Claim full support for Java 17.

No production code has changed, except the startup scripts.

Changes:

1) Allow Java 17 without DRUID_SKIP_JAVA_CHECK.

2) Include the full list of opens and exports on both Java 11 and 17.

3) Document that Java 17 is both supported and preferred.

4) Switch some tests from Java 11 to 17 to get better coverage on the
   preferred version.

* Doc update.

* Update errorprone.

* Update docker_build_containers.sh.

* Update errorprone in licenses.yaml.

* Add some more run-javas.

* Additional run-javas.

* Update errorprone.

* Suppress new errorprone error.

* Add exports and opens in ForkingTaskRunner for Java 11+.

Test, doc changes.

* Additional errorprone updates.

* Update for errorprone.

* Restore old formatting in LdapCredentialsValidator.

* Copy bin/ too.

* Fix Java 15, 17 build line in docker_build_containers.sh.

* Update busybox image.

* One more java command.

* Fix interpolation.

* IT commandline refinements.

* Switch to busybox 1.34.1-glibc.

* POM adjustments, build and test one IT on 17.

* Additional debugging.

* Fix silly thing.

* Adjust command line.

* Add exports and opens one more place.

* Additional harmonization of strong encapsulation parameters.
2023-07-07 12:52:35 -07:00
Karan Kumar afa8c7b8ab
Adding Ability for MSQ to write select results to durable storage. (#14527)
One of the most requested features in Druid is the ability to download big result sets.
As part of #14416, we added the ability for MSQ to be queried via a query-friendly endpoint. This PR builds upon that work and adds the ability for MSQ to write select results to durable storage.

We write the results to the durable storage location <prefix>/results/<queryId> in the Druid frame format. This is exposed to users by
/v2/sql/statements/:queryId/results.
2023-07-07 20:49:48 +05:30
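For the durable-storage results change above (#14527), a sketch of how the storage location and the results endpoint described in the commit message fit together. Both helper names are hypothetical, and the URL-encoding of the query id is an assumption rather than documented behavior.

```python
from urllib.parse import quote


def results_location(prefix: str, query_id: str) -> str:
    """Durable storage location, following <prefix>/results/<queryId>."""
    return f"{prefix.rstrip('/')}/results/{query_id}"


def results_endpoint(query_id: str) -> str:
    """Path of the results endpoint named in the commit message."""
    # Assumption: the query id is percent-encoded when placed in the path.
    return f"/v2/sql/statements/{quote(query_id, safe='')}/results"


print(results_location("durable/", "query-123"))
print(results_endpoint("query-123"))
```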
Gian Merlino dd78e00dc5
Fix ColumnSignature error message and jdk17 test issue. (#14538)
* Fix ColumnSignature error message and jdk17 test issue.

On jdk17, the "problem" part of the error message could change from
NullPointerException to:

  Cannot invoke "String.length()" because "s" is null

Due to the new more-helpful NPEs in Java 17. This broke the expectation
and led to test failures on this case.

This patch fixes the problem by improving the error message so it isn't
a generic NullPointerException.

* Fix format.
2023-07-06 15:10:59 -07:00
imply-cheddar 5fc122a144
Add window-focused tests from Drill (#13773)
This commit borrows some test definitions from Drill's test suite
and tries to use them to flesh out the full validation of window
function capabilities.

In order to be able to run these tests, we also add the ability to
run a Scan operation against segments, which also meant an
implementation of RowsAndColumns for frames.
2023-07-06 09:20:32 -07:00
imply-cheddar 277b357256
Optimize IntervalIterator (#14530)
UniformGranularityTest has a test covering a large number of intervals
that runs through 10 years of 1-second intervals.  This pushes a lot of
work through IntervalIterator and makes it one of the hottest tests in
terms of runtime.  Most of the time goes to constructing Joda-Time
objects, because the iterator works with DateTime objects instead of
millis.  Change the calls to use millis instead and things go faster.
2023-07-06 14:44:23 +05:30
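For the IntervalIterator change above (#14530), a sketch of the general idea: step through fixed-length intervals with plain integer millisecond arithmetic rather than allocating a date-time object per step. This is not Druid's implementation; the function name is hypothetical and the approach only applies to fixed-length granularities such as seconds.

```python
from typing import Iterator, Tuple


def iter_intervals_millis(start_ms: int, end_ms: int, step_ms: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) pairs in epoch millis, clipping the last one at end_ms."""
    if step_ms <= 0:
        raise ValueError("step_ms must be positive")
    current = start_ms
    while current < end_ms:
        nxt = min(current + step_ms, end_ms)
        yield (current, nxt)
        current = nxt


# Three one-second intervals starting at the epoch.
print(list(iter_intervals_millis(0, 3000, 1000)))
```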
Kashif Faraz 87bb1b9709
Fix bug during initialization of HttpServerInventoryView (#14517)
If a server is removed during `HttpServerInventoryView.serverInventoryInitialized`,
the initialization gets stuck as this server is never synced. The method eventually times
out (default 250s).

Fix: Mark a server as stopped if it is removed. `serverInventoryInitialized` only waits for
non-stopped servers to sync.

Other changes:
- Add new metrics for better debugging of slow broker/coordinator startup
  - `segment/serverview/sync/healthy`: whether the server view is syncing properly with a server
  - `segment/serverview/sync/unstableTime`: time for which sync with a server has been unstable  
- Clean up logging in `HttpServerInventoryView` and `ChangeRequestHttpSyncer`
- Minor refactor for readability
- Add utility class `Stopwatch`
- Add tests and stubs
2023-07-06 13:04:53 +05:30
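For the HttpServerInventoryView fix above (#14517), a sketch of the waiting rule the commit describes: initialization only waits on servers that are not stopped, so a removed server can no longer block it. The class and function names are hypothetical stand-ins, not the Druid classes.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ServerSyncState:
    name: str
    synced: bool = False
    stopped: bool = False  # set when the server is removed


def inventory_initialized(servers: Iterable[ServerSyncState]) -> bool:
    """True once every non-stopped server has synced."""
    return all(s.synced for s in servers if not s.stopped)


servers = [ServerSyncState("historical-1", synced=True), ServerSyncState("historical-2")]
print(inventory_initialized(servers))  # historical-2 has not synced yet

servers[1].stopped = True  # historical-2 is removed during initialization
print(inventory_initialized(servers))
```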
Clint Wylie 277aaa5c57
remove druid.processing.columnCache.sizeBytes and CachingIndexed, combine string column implementations (#14500)
* combine string column implementations
changes:
* generic indexed, front-coded, and auto string columns now all share the same column and index supplier implementations
* remove CachingIndexed implementation, which I think is largely no longer needed by the switch of many things to directly using ByteBuffer, avoiding the cost of creating Strings
* remove ColumnConfig.columnCacheSizeBytes since CachingIndexed was the only user
2023-07-02 19:37:15 -07:00
Gian Merlino 67fbd8e7fc
Add "stringEncoding" parameter to DataSketches HLL. (#11201)
* Add "stringEncoding" parameter to DataSketches HLL.

Builds on the concept from #11172 and adds a way to feed HLL sketches
with UTF-8 bytes.

This must be an option rather than always-on, because prior to this
patch, HLL sketches used UTF-16LE encoding when hashing strings. To
remain compatible with sketch images created prior to this patch -- which
matters during rolling updates and when reading sketches that have been
written to segments -- we must keep UTF-16LE as the default.

Not currently documented, because I'm not yet sure how best to expose
this functionality to users. I think the first place would be in the SQL
layer: we could have it automatically select UTF-8 or UTF-16LE when
building sketches at query time. We need to be careful about this, though,
because UTF-8 isn't always faster. Sometimes, like for the results of
expressions, UTF-16LE is faster. I expect we will sort this out in
future patches.

* Fix benchmark.

* Fix style issues, improve test coverage.

* Put round back, to make IT updates easier.

* Fix test.

* Fix issue with filtered aggregators and add test.

* Use DS native update(ByteBuffer) method. Improve test coverage.

* Add another suppression.

* Fix ITAutoCompactionTest.

* Update benchmarks.

* Updates.

* Fix conflict.

* Adjustments.
2023-06-30 12:45:55 -07:00
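For the `stringEncoding` change above (#11201), a short illustration of why the encoding has to stay an explicit option: the same string produces different bytes under UTF-8 and UTF-16LE, so anything hashed from those bytes differs as well. SHA-256 is only a stand-in here; it is not the hash function that DataSketches uses.

```python
import hashlib


def hashed_input(value: str, encoding: str) -> str:
    """Hash the encoded bytes of a string (SHA-256 as a stand-in hash)."""
    return hashlib.sha256(value.encode(encoding)).hexdigest()


utf8_bytes = "druid".encode("utf-8")
utf16_bytes = "druid".encode("utf-16-le")

print(len(utf8_bytes), len(utf16_bytes))
# Different input bytes give different hashes, so values hashed under one
# encoding do not line up with values hashed under the other.
print(hashed_input("druid", "utf-8") == hashed_input("druid", "utf-16-le"))
```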
Gian Merlino e10e35aa2c
Add REGEXP_REPLACE function. (#14460)
* Add REGEXP_REPLACE function.

Replaces all instances of a pattern with a replacement string.

* Fixes.

* Improve test coverage.

* Adjust behavior.
2023-06-29 13:47:57 -07:00
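For the REGEXP_REPLACE change above (#14460), a Python analogy of "replaces all instances of a pattern with a replacement string". This uses Python's `re` module, whose pattern and replacement syntax differ in places from the Java regular expressions Druid uses, so treat it as an illustration of the semantics only.

```python
import re


def regexp_replace(text: str, pattern: str, replacement: str) -> str:
    """Replace every match of pattern in text (Python analogy, not Druid's function)."""
    return re.sub(pattern, replacement, text)


print(regexp_replace("a1b22c333", r"[0-9]+", "#"))
```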
Gian Merlino a6cabbe10f
SQL: Avoid "intervals" for non-table-based datasources. (#14336)
In these other cases, stick to plain "filter". This simplifies lots of
logic downstream, and doesn't hurt since we don't have intervals-specific
optimizations outside of tables.

Fixes an issue where we couldn't properly filter on a column from an
external datasource if it was named __time.
2023-06-29 09:57:11 +05:30
Gian Merlino 82fbb31c7c
Properly read SQL-compatible segments in default-value mode. (#14142)
* Properly read SQL-compatible segments in default-value mode.

Main changes:

1) Dictionary-encoded and front-coded string columns: in default-value
   mode, detect cases where a dictionary has the empty string in it, then
   either combine it with null (if null is present) or replace it with
   null (if null is not present).

2) Numeric nullable columns: in default-value mode, ignore the null
   value bitmap. This causes all null numbers to be read as zeroes.

Testing strategy:

1) Add a mmappedWithSqlCompatibleNulls case to BaseFilterTest that
   writes segments under SQL-compatible mode, and reads them under
   default-value mode.

2) Unit tests for the new wrapper classes (CombineFirstTwoEntriesIndexed,
   CombineFirstTwoValuesColumnarInts, CombineFirstTwoValuesColumnarMultiInts,
   CombineFirstTwoValuesIndexedInts).

* Fix a mistake, use more singlethreadedness.

* WIP

* Tests, improvements.

* Style.

* See Spot bug.

* Remove unused method.

* Address review comments.

1) Read bitmaps even if we don't retain them.
2) Combine StringFrontCodedDictionaryEncodedColumn and ScalarStringDictionaryEncodedColumn.

* Add missing tests.
2023-06-28 10:30:27 -07:00
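For the default-value mode change above (#14142), a sketch of the two reading rules the commit describes: the empty string in a dictionary is combined with null (or replaced by null when null is absent), and null numbers are read as zero. The function names and the list-based dictionary are simplifications for illustration, not the wrapper classes named in the commit.

```python
from typing import List, Optional, Tuple


def remap_dictionary(dictionary: List[Optional[str]]) -> Tuple[List[Optional[str]], List[int]]:
    """Return (new_dictionary, id_map), where id_map[old_id] gives the new id."""
    # Null and empty string both present: combine the first two entries.
    if len(dictionary) >= 2 and dictionary[0] is None and dictionary[1] == "":
        new_dictionary = [None] + dictionary[2:]
        id_map = [0] + [i - 1 for i in range(1, len(dictionary))]
        return new_dictionary, id_map
    # Only the empty string present: replace it with null, ids unchanged.
    if dictionary and dictionary[0] == "":
        return [None] + dictionary[1:], list(range(len(dictionary)))
    return list(dictionary), list(range(len(dictionary)))


def read_number_default_mode(value: Optional[float]) -> float:
    """Ignore nullness: a null number reads as zero."""
    return 0 if value is None else value


print(remap_dictionary([None, "", "a", "b"]))
print(read_number_default_mode(None))
```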
Karan Kumar cb3a9d2b57
Adding Interactive APIs for MSQ engine (#14416)
This PR exposes a new API,
`@Path("/druid/v2/sql/statements/")`, which takes the same payload as the current `/druid/v2/sql` endpoint and allows users to fetch results asynchronously.
2023-06-28 17:51:58 +05:30
imply-cheddar fd20bbd30e
Fix another infinite loop and remove Mockito usage (#14493)
* Fix another infinite loop and remove Mockito usage

The ConfigManager objects were `started()` without ever being
stopped.  This scheduled a poll call that never ended.  To make
matters worse, the poll interval was set to 0 ms, making an
infinite poll with 0 sleep, i.e. an infinite loop.

Also introduce test classes and remove usage of mocks

* Checkstyle
2023-06-27 21:49:27 -07:00
Adarsh Sanjeev 0335aaa279
Add query results directory and prevent the auto cleaner from cleaning it (#14446)
Adds support for automatic cleaning of a "query-results" directory in durable storage. This directory will be cleaned up only if the task id is not known to the overlord. This will allow the storage of query results after the task has finished running.
2023-06-28 10:14:04 +05:30
Abhishek Radhakrishnan 2cfb00b1de
Add missing `isNull()` implementation to `FilteredAggregator` (#14465) 2023-06-27 16:35:15 -07:00
Gian Merlino c78d885b80
Cache parsed expressions and binding analysis in more places. (#14124)
* Cache parsed expressions and binding analysis in more places.

Main changes:

1) Cache parsed and analyzed expressions within PlannerContext for a
   single SQL query.

2) Cache parsed expressions together with input binding analysis using
   a new class AnalyzeExpr.

This speeds up SQL planning, because SQL planning involves parsing and
analyzing the same expression strings over and over again.

* Fixes.

* Fix style.

* Fix test.

* Simplify: get rid of AnalyzedExpr, focus on caching.

* Rename parse -> parseExpression.
2023-06-27 13:40:35 -07:00
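For the expression-caching change above (#14124), a sketch of the underlying idea: when the same expression string is parsed repeatedly, memoizing the parse result avoids redoing the work. Python's `ast.parse` stands in for Druid's expression parser, and `functools.lru_cache` stands in for whatever cache the planner actually keeps.

```python
import ast
from functools import lru_cache


@lru_cache(maxsize=None)
def parse_expression(expression: str) -> ast.Expression:
    """Parse an expression string once; later calls return the cached tree."""
    return ast.parse(expression, mode="eval")


# Planning tends to see the same expression string many times.
for _ in range(3):
    parse_expression("x + 1")

info = parse_expression.cache_info()
print(info.misses, info.hits)
```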
imply-cheddar 2f0a43790c
Make GuavaUtilsTest use less CPU (#14487) 2023-06-26 21:45:29 -07:00
Clint Wylie 6ba10c8b6c
fix bug with json_value expression array extraction (#14461) 2023-06-26 21:02:44 -07:00
Laksh Singla f546cd64a9
MSQ: Ensure that the allocated segment aligns with the requested granularity (#14475)
Changes:
- Throw an `InsertCannotAllocateSegmentFault` if the allocated segment is not aligned with
the requested granularity.
- Tests to verify new behaviour
2023-06-27 09:25:32 +05:30
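For the segment-alignment change above (#14475), a sketch of what "aligned with the requested granularity" could mean for fixed-length granularities: the allocated interval starts on a granularity boundary and spans exactly one bucket. The exception class is a hypothetical Python stand-in for `InsertCannotAllocateSegmentFault`, and the check itself is an assumption about the rule, not Druid's code.

```python
HOUR_MS = 3_600_000
DAY_MS = 24 * HOUR_MS


class SegmentNotAlignedError(Exception):
    """Hypothetical stand-in for the fault raised by MSQ."""


def check_allocated_segment(start_ms: int, end_ms: int, granularity_ms: int) -> None:
    """Raise if the allocated interval is not exactly one granularity bucket."""
    aligned = start_ms % granularity_ms == 0 and end_ms - start_ms == granularity_ms
    if not aligned:
        raise SegmentNotAlignedError(
            f"Allocated interval [{start_ms}, {end_ms}) does not match granularity {granularity_ms} ms"
        )


check_allocated_segment(0, HOUR_MS, HOUR_MS)  # aligned: no exception
try:
    # A day-long segment was allocated although hourly granularity was requested.
    check_allocated_segment(0, DAY_MS, HOUR_MS)
except SegmentNotAlignedError as e:
    print(e)
```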
Laksh Singla 114380749d
MSQ: Improve the parse exception errors and the handling of null UTF characters in Strings in Frames (#14398) 2023-06-26 18:14:29 +05:30
Laksh Singla 1647d5f4a0
Limit the subquery results by memory usage (#13952)
Users can now add a guardrail to prevent a subquery's results from exceeding a set number of bytes by setting druid.server.http.maxSubqueryBytes in the Broker's config or maxSubqueryBytes in the query context. This feature is experimental for now and falls back to row-based limiting if it cannot get an accurate size of the results consumed by the query.
2023-06-26 18:12:28 +05:30
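For the subquery guardrail above (#13952), a sketch of the fallback behavior the commit describes: apply the byte-based limit when the result size can be estimated, and fall back to the row-based limit when it cannot. All names are hypothetical, and `sys.getsizeof` is only a placeholder size estimator.

```python
import sys


class SubqueryLimitExceeded(Exception):
    """Hypothetical error for a subquery that exceeds its guardrail."""


def enforce_subquery_limit(rows, max_rows, max_bytes=None, size_of=sys.getsizeof):
    """Return which limit was applied: 'bytes' or 'rows'."""
    if max_bytes is not None:
        try:
            total_bytes = sum(size_of(row) for row in rows)
        except Exception:
            total_bytes = None  # size unknown: fall back to row-based limiting
        if total_bytes is not None:
            if total_bytes > max_bytes:
                raise SubqueryLimitExceeded(f"{total_bytes} bytes exceeds {max_bytes}")
            return "bytes"
    if len(rows) > max_rows:
        raise SubqueryLimitExceeded(f"{len(rows)} rows exceeds {max_rows}")
    return "rows"


print(enforce_subquery_limit([("a", 1), ("b", 2)], max_rows=100, max_bytes=10_000))
```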
Gian Merlino 970288067a
Fix flaky HttpEmitterConfigTest and ParametrizedUriEmitterConfigTest. (#14481)
Recently, we have seen flakiness in these two tests, apparently due to
computations based on Runtime.getRuntime().maxMemory() differing during
static initialization and in the actual tests. I can't think of a reason
why this would be happening, but anyway, this patch switches the tests to
use the statics instead of recomputing Runtime.getRuntime().maxMemory().
2023-06-23 16:27:11 -07:00
imply-cheddar 7e2cf35d7b
Fix compatibility issue with SqlTaskResource (#14466)
* Fix compatibility issue with SqlTaskResource

The DruidException changes broke the response format
for errors coming back from the SqlTaskResource, so fix those
2023-06-23 01:15:32 -07:00
Clint Wylie 31b9d5695d
Extend InitializedNullHandlingTest instead of NullHandlingTest (#14467)
NullHandlingTest is an actual test, it shouldn't be used as a base class
2023-06-22 15:01:50 +05:30
Hardik Bajaj 1ea9158a50
Added new SysMonitorOshi v0 using Oshi library (#14359)
Added a new monitor SysMonitorOshi to replace SysMonitor. The new monitor has a wider support for different machine architectures including ARM instances. Please switch to SysMonitorOshi as SysMonitor is now deprecated and will be removed in future releases.
2023-06-20 20:57:58 +05:30
Kashif Faraz 50461c3bd5
Enable smartSegmentLoading on the Coordinator (#13197)
This commit does a complete revamp of the coordinator to address problem areas:
- Stability: Fix several bugs, add capabilities to prioritize and cancel load queue items
- Visibility: Add new metrics, improve logs, revamp `CoordinatorRunStats`
- Configuration: Add dynamic config `smartSegmentLoading` to automatically set
optimal values for all segment loading configs such as `maxSegmentsToMove`,
`replicationThrottleLimit` and `maxSegmentsInNodeLoadingQueue`.

Changed classes:
- Add `StrategicSegmentAssigner` to make assignment decisions for load, replicate and move
- Add `SegmentAction` to distinguish between load, replicate, drop and move operations
- Add `SegmentReplicationStatus` to capture current state of replication of all used segments
- Add `SegmentLoadingConfig` to contain recomputed dynamic config values
- Simplify classes `LoadRule`, `BroadcastRule`
- Simplify the `BalancerStrategy` and `CostBalancerStrategy`
- Add several new methods to `ServerHolder` to track loaded and queued segments
- Refactor `DruidCoordinator`

Impact:
- Enable `smartSegmentLoading` by default. With this enabled, none of the following
dynamic configs need to be set: `maxSegmentsToMove`, `replicationThrottleLimit`,
`maxSegmentsInNodeLoadingQueue`, `useRoundRobinSegmentAssignment`,
`emitBalancingStats` and `replicantLifetime`.
- Coordinator reports richer metrics and produces cleaner and more informative logs
- Coordinator uses an unlimited load queue for all servers, and makes better assignment decisions
2023-06-19 14:27:35 +05:30
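For the `smartSegmentLoading` change above (#13197), a sketch of how an existing Coordinator dynamic config could be simplified once the flag is enabled. The list of superseded keys comes from the commit message; the helper name and the sample input are illustrative assumptions.

```python
import json

# Dynamic configs that, per the commit message, no longer need to be set.
SUPERSEDED_CONFIGS = (
    "maxSegmentsToMove",
    "replicationThrottleLimit",
    "maxSegmentsInNodeLoadingQueue",
    "useRoundRobinSegmentAssignment",
    "emitBalancingStats",
    "replicantLifetime",
)


def with_smart_segment_loading(dynamic_config: dict) -> dict:
    """Drop superseded keys and enable smartSegmentLoading (hypothetical helper)."""
    cleaned = {k: v for k, v in dynamic_config.items() if k not in SUPERSEDED_CONFIGS}
    cleaned["smartSegmentLoading"] = True
    return cleaned


old_config = {"maxSegmentsToMove": 100, "millisToWaitBeforeDeleting": 900000}
print(json.dumps(with_smart_segment_loading(old_config), indent=2))
```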
imply-cheddar cfd07a95b7
Errors take 3 (#14004)
Introduce DruidException, an exception whose goal in life is to be delivered to a user.

DruidException itself has javadoc on it to describe how it should be used.  This commit both introduces the Exception and adjusts some of the places that are generating exceptions to generate DruidException objects instead, as a way to show how the Exception should be used.

This work was a 3rd iteration on top of work that was started by Paul Rogers.  I don't know if his name will survive the squash-and-merge, so I'm calling it out here and thanking him for starting on this.
2023-06-19 01:11:13 -07:00
Adarsh Sanjeev 128133fadc
Add replication_factor column to sys.segments table (#14403)
Description:
Druid allows a configuration of load rules that may cause a used segment to not be loaded
on any historical. This status is not tracked in the sys.segments table on the broker, which
makes it difficult to determine if the unavailability of a segment is expected and if we should
not wait for it to be loaded on a server after ingestion has finished.

Changes:
- Track replication factor in `SegmentReplicantLookup` during evaluation of load rules
- Update API `/druid/coordinator/v1/metadata/segments` to return replication factor
- Add column `replication_factor` to the sys.segments virtual table and populate it in
`MetadataSegmentView`
- If this column is 0, the segment is not assigned to any historical and will not be loaded.
2023-06-18 10:02:21 +05:30
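For the `replication_factor` column above (#14403), a sketch of a request body that asks for segments that will not be loaded on any historical. The column and table names come from the commit message; the helper name and the datasource filter are illustrative assumptions.

```python
import json


def unloaded_segments_request(datasource: str) -> str:
    """Build a JSON body for a SQL query listing segments with replication factor 0."""
    escaped = datasource.replace("'", "''")  # escape single quotes in the literal
    sql = (
        'SELECT "segment_id", "replication_factor" FROM sys.segments '
        f"WHERE \"datasource\" = '{escaped}' AND \"replication_factor\" = 0"
    )
    return json.dumps({"query": sql})


print(unloaded_segments_request("wikipedia"))
```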
George Shiqi Wu 64af9bfe5b
Add groupId to metrics (#14402)
* Add group id as a dimension

* Revert changes

* Add to forking task runner

* Add missing metrics

* Fix indenting

* revert metrics

* Fix indentation
2023-06-16 09:28:16 -07:00
Clint Wylie 359bd63cc9
allow expression "best effort" type determination to better handle mixed type arrays (#14438) 2023-06-16 00:02:43 -07:00
Clint Wylie ff5ae4db6c
fix kafka input format reader schema discovery and partial schema discovery (#14421)
* fix kafka input format reader schema discovery and partial schema discovery to actually work right, by re-using dimension filtering logic of MapInputRowParser
2023-06-15 00:11:04 -07:00
Clint Wylie ca116cf886
adjust broker parallel merge to help managed blocking be more well behaved (#14427) 2023-06-15 00:10:31 -07:00
Pranav e426d370ea
Start with solo accumulator and empty partition (#14426)
* Starting parallel merge with solo accumulator and empty partitions

* shut down pool in test
2023-06-14 16:20:48 -07:00
Clint Wylie 8454cc619a
auto columns fixes (#14422)
changes:
* auto columns no longer participate in generic 'null column' handling, this was a mistake to try to support and caused ingestion failures due to mismatched ColumnFormat, and will be replaced in the future with nested common format constant column functionality (not in this PR)
* fix bugs with auto columns which contain empty objects, empty arrays, or primitive types mixed with either of these empty constructs
* fix bug with bound filter when upper is null equivalent but is strict
2023-06-14 08:57:06 -07:00
Abhishek Radhakrishnan be5a6593a9
Reset `RuntimeInfo` to fix flaky test `ParametrizedUriEmitterConfigTest`. (#14405)
* Add injector so JVM settings are correctly set up and bound for the test.

* Add VisibleForTesting IDE annotation.

* spacing
2023-06-13 18:07:51 -07:00
Clint Wylie 61120dc49a
fix Kafka input format to throw ParseException if timestamp is missing (#14413) 2023-06-13 09:00:11 -07:00
Adarsh Sanjeev 267cbac6ff
Add logs for deleting files using storage connector (#14350)
* Add logs for deleting files using storage connector

* Address review comments

* Update log message format
2023-06-11 21:24:30 +05:30
Kashif Faraz 6e158704cb
Do not retry INSERT task into metadata if max_allowed_packet limit is violated (#14271)
Changes
- Add a `DruidException` which contains a user-facing error message, HTTP response code
- Make `EntryExistsException` extend `DruidException`
- If metadata store max_allowed_packet limit is violated while inserting a new task, throw
`DruidException` with response code 400 (bad request) to prevent retries
- Add `SQLMetadataConnector.isRootCausePacketTooBigException` with impl for MySQL
2023-06-10 12:15:44 +05:30
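For the retry change above (#14271), a sketch of the pattern the commit describes: transient failures are retried, while an error that carries a user-facing 400 response is raised immediately. The class and function names are hypothetical Python stand-ins, not the Druid classes.

```python
class UserFacingError(Exception):
    """Hypothetical stand-in for an exception with a message and HTTP status."""

    def __init__(self, message: str, status_code: int):
        super().__init__(message)
        self.status_code = status_code


def insert_with_retry(insert, max_attempts: int = 3):
    """Call insert(), retrying on generic failures but never on UserFacingError."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return insert()
        except UserFacingError:
            raise  # retrying cannot help, e.g. the payload is too large
        except Exception as e:
            last_error = e  # possibly transient: try again
    raise last_error


attempts = []


def oversized_insert():
    attempts.append(1)
    raise UserFacingError("Task payload exceeds max_allowed_packet", 400)


try:
    insert_with_retry(oversized_insert)
except UserFacingError as e:
    print(e.status_code, len(attempts))
```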
imply-cheddar 87149d5975
Remove AbstractIndex (#14388)
The class apparently only exists to add a toString()
method to Indexes, which basically just crashes any debugger
on any meaningfully sized index.  It's a pointless
abstract class that basically only causes pain.
2023-06-08 19:52:16 -07:00
Harini Rajendran 4ff6026d30
Adding SegmentMetadataEvent and publishing them via KafkaEmitter (#14281)
This PR enhances KafkaEmitter to emit metadata about published segments (SegmentMetadataEvent) into a Kafka topic. Downstream services can use this segment metadata to query Druid intelligently based on the segments published. The segment metadata is published to the Kafka topic as a JSON string, similar to other events.
2023-06-02 21:28:26 +05:30
zachjsh e75fb8e8e3
Account for data format and compression in MSQ auto taskAssignment (#14307)
### Description

This change allows the input format and compression to be considered when computing how to split the input files among available tasks in MSQ ingestion, when applying the `maxInputBytesPerWorker` query context parameter. This parameter lets users control the maximum number of bytes, at the granularity of an input file or object, that each ingestion task is assigned to ingest. With this change, the parameter denotes the estimated weighted size in bytes of the input to split on, taking input format and compression format into account, rather than the actual file size reported by the file system.

We assume uncompressed newline-delimited JSON as the baseline, with a scale factor of `1`. This means that an uncompressed JSON file counts toward input splitting at its actual size, 1:1. Testing found that gzip-compressed JSON and Parquet have scale factors of `4` and `8` respectively, meaning that each byte of data is weighted 4x and 8x respectively when computing input splits. This weighted byte scaling is currently only considered for MSQ ingestion that uses either LocalInputSource or CloudObjectInputSource. The default value of the `maxInputBytesPerWorker` query context parameter has been changed from 10 GiB to 512 MiB.
2023-06-01 12:53:49 -07:00
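For the weighted input sizing above (#14307), a worked sketch of the arithmetic. The scale factors (1, 4, 8) and the 512 MiB default come from the commit message; the format keys, the function names, and the greedy grouping are illustrative assumptions and not necessarily how MSQ assigns files.

```python
MIB = 1024 * 1024

# Scale factors from the commit message; the dictionary keys are made up.
SCALE_FACTORS = {"json": 1, "json.gz": 4, "parquet": 8}
DEFAULT_MAX_INPUT_BYTES_PER_WORKER = 512 * MIB


def weighted_size(size_bytes: int, fmt: str) -> int:
    """Weighted size = file size on disk times the format's scale factor."""
    return size_bytes * SCALE_FACTORS[fmt]


def split_inputs(files, max_bytes=DEFAULT_MAX_INPUT_BYTES_PER_WORKER):
    """Greedily group (name, size_bytes, fmt) files so each group stays within max_bytes."""
    groups, current, current_weight = [], [], 0
    for name, size_bytes, fmt in files:
        weight = weighted_size(size_bytes, fmt)
        if current and current_weight + weight > max_bytes:
            groups.append(current)
            current, current_weight = [], 0
        current.append(name)  # a single oversized file still gets its own group
        current_weight += weight
    if current:
        groups.append(current)
    return groups


files = [
    ("a.json", 256 * MIB, "json"),
    ("b.json", 256 * MIB, "json"),
    ("c.json.gz", 128 * MIB, "json.gz"),  # weighted as 512 MiB
]
print(split_inputs(files))
```

In this sketch a 128 MiB gzip file weighs as much as two 256 MiB uncompressed JSON files combined, so it fills a worker's budget on its own.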