Commit Graph

12018 Commits

Author SHA1 Message Date
AmatyaAvadhanula 9c6b9abcde
Use javaOptsArray provided in task context (#12326)
The `javaOpts` property is being read from task context but not `javaOptsArray`.
Changes:
- Read `javaOptsArray` from task context in `ForkingTaskRunner`.
- Add test to verify that `javaOptsArray` in task context takes precedence over `javaOpts`
2022-03-28 16:33:40 +05:30
dependabot[bot] ee44fe45c6
Bump java-dogstatsd-client from 2.13.0 to 4.0.0 (#12353)
* Bump java-dogstatsd-client from 2.13.0 to 4.0.0
Bumps [java-dogstatsd-client](https://github.com/DataDog/java-dogstatsd-client) from 2.13.0 to 4.0.0.
- [Release notes](https://github.com/DataDog/java-dogstatsd-client/releases)
- [Changelog](https://github.com/DataDog/java-dogstatsd-client/blob/master/CHANGELOG.md)
- [Commits](https://github.com/DataDog/java-dogstatsd-client/compare/v2.13.0...v4.0.0)

* migrate statsd-emitter tests from easymock to mockito
* add simple init test to make diff coverage happy

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xavier Léauté <xvrl@apache.org>
2022-03-26 16:25:13 -07:00
Maytas Monsereenusorn ea51d8a16c
Duties in Indexing group (such as Auto Compaction) does not report metrics (#12352)
* add impl

* add unit tests

* fix checkstyle

* address comments

* fix checkstyle
2022-03-23 18:18:28 -07:00
Jihoon Son b6eeef31e5
Store null columns in the segments (#12279)
* Store null columns in the segments

* fix test

* remove NullNumericColumn and unused dependency

* fix compile failure

* use guava instead of apache commons

* split new tests

* unused imports

* address comments
2022-03-23 16:54:04 -07:00
syacobovitz d7308e9290
Added support in urls, and grouped metrics (#12296) 2022-03-22 11:22:05 -07:00
Kashif Faraz 0867ca75e1
Fix OOM failures in dimension distribution phase of parallel indexing (#12331)
Parallel indexing with range partitioning can often cause OOM in the
`ParallelIndexSupervisorTask` during the dimension distribution phase.
This typically happens because of too many `StringSketch` objects
obtained from the different `partial_dimension_distribution` sub-tasks.

We need not keep any of the sketches in memory until we need to compute
the PartitionBoundaries for the respective interval.

Changes
- Extract `StringDistribution` from `DimensionDistributionReport`s when they are received
  and write to disk inside the task/temp/distributions
- After all the subtasks have finished, iterate over all the intervals one by one
- For each interval, read the distributions from disk, merge them and create `PartitionBoundaries`.
- Cleanup task/temp/distributions directory when all `PartitionBoundaries` have been determined
2022-03-22 19:28:15 +05:30
Adarsh Sanjeev ef45a1551e
Convert inQueryThreshold into query context parameter. (#12357)
Added Calcites InQueryThreshold as a query context parameter. Setting this parameter appropriately reduces the time taken for queries with large number of values in their IN conditions.
2022-03-22 18:33:57 +05:30
Xavier Léauté 1f0447e613
fix use of deprecated initMocks method (#12351)
follow-up to #12341
- fix use of deprecated initMocks methods and properly close mocks on teardown
2022-03-19 10:19:02 -07:00
Xavier Léauté c3377bf744
upgrade maven-pmd-plugin to fix warning (#12349)
we sometimes see warnings similar to the one mentioned
https://issues.apache.org/jira/browse/MPMD-325

Upgrading the plugin should hopefully reduce occurrence of those.
2022-03-19 10:18:26 -07:00
dependabot[bot] 4ed1abca94
Bump slf4j.version from 1.7.12 to 1.7.36 (#11594)
Bump slf4j.version from 1.7.12 to 1.7.36

- [Release notes](Release notes: https://www.slf4j.org/news.html)

Updates `jcl-over-slf4j` from 1.7.12 to 1.7.36
- [Commits](https://github.com/qos-ch/slf4j/compare/v_1.7.12...v_1.7.36)

Updates `slf4j-simple` from 1.7.12 to 1.7.36
- [Commits](https://github.com/qos-ch/slf4j/compare/v_1.7.12...v_1.7.36)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Suneet Saldanha <suneet@apache.org>
Co-authored-by: Xavier Léauté <xvrl@apache.org>
2022-03-18 13:45:44 -07:00
Maytas Monsereenusorn dbb9518f50
Fix auto compaction by adjusting compaction task's interval to align with segmentGranularity when segmentGranularity is set (#12334)
* add impl

* add ITs

* address comments

* address comments

* address comments

* fix failure

* fix checkstyle

* fix checkstyle
2022-03-18 12:46:16 -07:00
Xavier Léauté 6f0e5f25fa
update surefire plugin to 3.0.0-M4 (#12342)
stay on surefire 3.0.0-M4 until we can upgrade to 3.0.0-M6
with a fix for https://issues.apache.org/jira/browse/SUREFIRE-1815
causing issues in RetryUtilsTest.
2022-03-18 08:20:28 -07:00
Xavier Léauté c33fa11669
improve test compatibility with Java 17 and remove deprecated methods (#12341)
* remove use of reflection in EnvironmentVariableDynamicConfigProvider for Java 17 compatibility
* fix mocks mock objects not getting closed properly, causing issues with Java 17
* remove use of deprecated methods and rules in tests
2022-03-18 08:19:28 -07:00
Aurélien Dunand 8f3a631cbf
Fix missing conversionFactor in prometheus emitter (#12338)
query/node/ttfb metrics are in milliseconds.
2022-03-17 21:46:06 -07:00
Xavier Léauté 192e411249
fix build due to com.nimbusds:lang-tag update (#12348)
the version of com.nimbusds:oauth2-oidc-sdk we depend on does not
specific an exact version dependency for com.nimbusds:lang-tag, and
instead uses a version range (see
    https://search.maven.org/artifact/com.nimbusds/oauth2-oidc-sdk/6.5/jar)

Recently a new version of lang-tag was released requiring us to update
the license file accordingly.
2022-03-17 17:44:08 -07:00
dependabot[bot] a5dfb911de
Bump maven-site-plugin from 3.1 to 3.11.0 (#12310)
Bumps [maven-site-plugin](https://github.com/apache/maven-site-plugin) from 3.1 to 3.11.0.
- [Release notes](https://github.com/apache/maven-site-plugin/releases)
- [Commits](https://github.com/apache/maven-site-plugin/compare/maven-site-plugin-3.1...maven-site-plugin-3.11.0)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-site-plugin
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-17 15:17:29 +08:00
Jihoon Son 5e23674fe5
Fix a race condition in the '/tasks' Overlord API (#12330)
* finds complete and active tasks from the same snapshot

* overlord resource

* unit test

* integration test

* javadoc and cleanup

* more cleanup

* fix test and add more
2022-03-17 10:47:45 +09:00
Frank Chen d745d0b338
Add JDK 11 (#12333) 2022-03-16 15:03:04 -07:00
Dr. Sizzles 69f928f50e
Adding k8s support for human readable parsing (#12316)
* Adding k8s support for human readable parsing

* Update docs/configuration/human-readable-byte.md

Co-authored-by: Frank Chen <frankchen@apache.org>

* Update docs/configuration/human-readable-byte.md

Co-authored-by: Frank Chen <frankchen@apache.org>

* Update core/src/main/java/org/apache/druid/java/util/common/HumanReadableBytes.java

Co-authored-by: Frank Chen <frankchen@apache.org>

* Changes per review

Co-authored-by: Rahul Gidwani <r_gidwani@apple.com>
Co-authored-by: Frank Chen <frankchen@apache.org>
2022-03-16 11:18:47 +08:00
Xavier Léauté 5d02a91faa
upgrade Error Prone to 2.11 (requires Java 11) (#12306)
The latest version of Error Prone now requires Java 11. Upgrading means we can
remove a lot of the maven profile complexity required to run checks with Java 8.
This also requires switching our strict build to use Java 11.

* update error-prone to 2.11
* remove need for specific maven profiles for Java 8 and Java 15
* fix additional Error Prone warnings with Java 11
* update strict build to use Java 11
2022-03-14 19:40:48 -07:00
somu-imply b5195c5095
Graceful null handling and correctness in DoubleMean Aggregator (#12320)
* Adding null handling for double mean aggregator

* Updating code to handle nulls in DoubleMean aggregator

* oops last one should have checkstyle issues. fixed

* Updating some code and test cases

* Checking on object is null in case of numeric aggregator

* Adding one more test to improve coverage

* Changing one test as asked in the review

* Changing one test as asked in the review for nulls
2022-03-14 16:52:47 -07:00
mchades 3de1272926
bug fix: merge results of group by limit push down (#11969) 2022-03-11 09:04:34 -08:00
Kyle Larose db91961af7
kubernetes: restart watch on null response (#12233)
* kubernetes: restart watch on null response

Kubernetes watches allow a client to efficiently processes changes to
resources. However, they have some idiosyncrasies. In particular, they
can error out for various reasons leading to what would normally be seen
as an invalid result.

The Druid kubernetes node discovery subsystem does not handle a certain
case properly. The watch can return an item with a null object.  These
leads to a null pointer exception. When this happens, the provider needs
to restart the watch, because rerunning the watch from the same resource
version leads to the same result: yet another null pointer exception.

This commit changes the provider to handle null objects by restarting
the watch.

* review: add more coverage

This adds a bit more coverage to the K8sDruidNodeDiscoveryProvider watch
loop, and removes an unnecessay return.

* kubernetes: reduce logging verbosity

The log messages about items being NULL don't really deserve to be at a
level other than DEBUG since they are not actionable, particularly since
we automatically recover now. Move them to the DEBUG level.
2022-03-10 12:56:40 -08:00
Gian Merlino cb2b2b696d
Fix error message for groupByEnableMultiValueUnnesting. (#12325)
* Fix error message for groupByEnableMultiValueUnnesting.

It referred to the incorrect context parameter.

Also, create a dedicated exception class, to allow easier detection of this
specific error.

* Fix other test.

* More better error messages.

* Test getDimensionName method.
2022-03-10 11:37:24 -08:00
Parag Jain 2efb74ff1e
fix supervisor auto scaler config serde bug (#12317) 2022-03-09 16:17:12 -08:00
Jihoon Son d89d4ff588
Git hooks should fail on errors; pass args to git hooks (#12322)
* Git hooks should fail on errors

* don't set shell to pass args
2022-03-10 09:07:50 +09:00
Abhishek Agarwal 6346b9561d
Reuse the InputEntityReader in SettableByteEntityReader (#12269)
* Reuse the InputEntityReader in SettableByteEntityReader

* Fix logic

* Fix kafka streaming ingestion

* Add Tests for kafka input format change

* Address review comments
2022-03-09 14:38:31 -08:00
Clint Wylie 9cfb23935f
push value range and set index get operations into BitmapIndex (#12315)
* push value range and set index get operations into BitmapIndex

* fix bug

* oops, fix better

* better like, fix test, javadocs

* fix checkstyle

* simplify and fixes

* cache

* fix tests

* move indexOf into GenericIndexed

* oops

* fix tests
2022-03-09 13:30:58 -08:00
Rohan Garg 9f6a930462
Fix join query incase of filter explosion during CNF conversion (#12324) 2022-03-09 12:43:09 -08:00
Clint Wylie dc0372a28e
improve FileWriteOutBytes.readFully (#12323)
* improve FileWriteOutBytes.readFully

* no need to flush if out of bounds
2022-03-09 11:45:45 -08:00
AmatyaAvadhanula 7bf1d8c5c0
Facilitate lazy initialization of connections to mitigate overwhelming of Coordinator (#12298)
Add config for eager / lazy connection initialization in ResourcePool

Description
Currently, when multiple tasks are launched, each of them eagerly initializes a full pool's worth of connections to the coordinator.

While this is acceptable when the parameter for number of eagerConnections (== maxSize) is small, this can be problematic in environments where it's a large value (say 1000) and multiple tasks are launched simultaneously, which can cause a large number of connections to be created to the coordinator, thereby overwhelming it.

Patch
Nodes like the broker may require eager initialization of resources and do not create connections with the Coordinator.
It is unnecessary to do this with other types of nodes.

A config parameter eagerInitialization is added, which when set to true, initializes the max permissible connections when ResourcePool is initialized.

If set to false, lazy initialization of connection resources takes place.

NOTE: All nodes except the broker have this new parameter set to false in the quickstart as part of this PR

Algorithm
The current implementation relies on the creation of maxSize resources eagerly.

The new implementation's behaviour is as follows:

If a resource has been previously created and is available, lend it.
Else if the number of created resources is less than the allowed parameter, create and lend it.
Else, wait for one of the lent resources to be returned.
2022-03-09 23:17:43 +05:30
Rohan Garg 56fbd2af6f
Guard against exponential increase of filters during CNF conversion (#12314)
Currently, the CNF conversion of a filter is unbounded, which means that it can create as many filters as possible thereby also leading to OOMs in historical heap. We should throw an error or disable CNF conversion if the filter count starts getting out of hand. There are ways to do CNF conversion with linear increase in filters as well but that has been left out of the scope of this change since those algorithms add new variables in the predicate - which can be contentious.
2022-03-09 13:19:52 +05:30
Clint Wylie 0600772cce
use a non-concurrent map for lookups-cached-global unless incremental updates are actually required (#12293)
* use a non-concurrent map for lookups-cached-global unless incremental updates are actually required
* adjustments
* fix test
2022-03-08 21:54:25 -08:00
Agustin Gonzalez abe76ccb90
Batch ingestion replace (#12137)
* Tombstone support for replace functionality

* A used segment interval is the interval of a current used segment that overlaps any of the input intervals for the spec

* Update compaction test to match replace behavior

* Adapt ITAutoCompactionTest to work with tombstones rather than dropping segments. Add support for tombstones in the broker.

* Style plus simple queriableindex test

* Add segment cache loader tombstone test

* Add more tests

* Add a method to the LogicalSegment to test whether it has any data

* Test filter with some empty logical segments

* Refactor more compaction/dropexisting tests

* Code coverage

* Support for all empty segments

* Skip tombstones when looking-up broker's timeline. Discard changes made to tool chest to avoid empty segments since they will no longer have empty segments after lookup because we are skipping over them.

* Fix null ptr when segment does not have a queriable index

* Add support for empty replace interval (all input data has been filtered out)

* Fixed coverage & style

* Find tombstone versions from lock versions

* Test failures & style

* Interner was making this fail since the two segments were consider equal due to their id's being equal

* Cleanup tombstone version code

* Force timeChunkLock whenever replace (i.e. dropExisting=true) is being used

* Reject replace spec when input intervals are empty

* Documentation

* Style and unit test

* Restore test code deleted by mistake

* Allocate forces TIME_CHUNK locking and uses lock versions. TombstoneShardSpec added.

* Unused imports. Dead code. Test coverage.

* Coverage.

* Prevent killer from throwing an exception for tombstones. This is the killer used in the peon for killing segments.

* Fix OmniKiller + more test coverage.

* Tombstones are now marked using a shard spec

* Drop a segment factory.json in the segment cache for tombstones

* Style

* Style + coverage

* style

* Add TombstoneLoadSpec.class to mapper in test

* Update core/src/main/java/org/apache/druid/segment/loading/TombstoneLoadSpec.java

Typo

Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>

* Update docs/configuration/index.md

Missing

Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>

* Typo

* Integrated replace with an existing test since the replace part was redundant and more importantly, the test file was very close or exceeding the 10 min default "no output" CI Travis threshold.

* Range does not work with multi-dim

Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>
2022-03-08 20:07:02 -07:00
Clint Wylie dae53ae36a
adjust topn heap operation when string is dictionary encoded, but not uniquely (#12291)
* add topn heap optimization when string is dictionary encoded, but not uniquely

* use array instead

* is same

* fix javadoc

* fix

* Update StringTopNColumnAggregatesProcessor.java
2022-03-08 14:32:40 -08:00
Jihoon Son 0e097ead36
Add git hooks that can run multiple scripts (#12300)
* Add git hooks that can run multiple scripts

* scripts to install/uninstall hooks

* better message for uninstall; support pre-push params
2022-03-09 07:16:47 +09:00
Gian Merlino 875e0696e0
GroupBy: Cap dictionary-building selector memory usage. (#12309)
* GroupBy: Cap dictionary-building selector memory usage.

New context parameter "maxSelectorDictionarySize" controls when the
per-segment processing code should return early and trigger a trip
to the merge buffer.

Includes:

- Vectorized and nonvectorized implementations.
- Adjustments to GroupByQueryRunnerTest to exercise this code in
  the v2SmallDictionary suite. (Both the selector dictionary and
  the merging dictionary will be small in that suite.)
- Tests for the new config parameter.

* Fix issues from tests.

* Add "pre-existing" to dictionary.

* Simplify GroupByColumnSelectorStrategy interface by removing one of the writeToKeyBuffer methods.

* Adjustments from review comments.
2022-03-08 13:13:11 -08:00
Kashif Faraz baea3ec614
Break up parallel indexing unit test to reduce test times (#12313)
* Break up parallel indexing unit test to reduce test times

* Fix checkstyle
2022-03-07 16:26:24 -07:00
Gian Merlino 28f8bcce9b
Always reopen stream in FileUtils.copyLarge, RetryingInputStream. (#12307)
* Always reopen stream in FileUtils.copyLarge, RetryingInputStream.

When an InputStream throws an exception from one of its read methods,
we should assume it's bad and reopen it.

The main changes here are:

- In FileUtils.copyLarge, replace InputStream with InputStreamSupplier.
- In RetryingInputStream, collapse retryCondition and resetCondition
  into a single condition. Also, make it required, since every usage
  is passing in a specific condition anyway.

* Test fixes.

* Fix read impl.
2022-03-05 14:39:14 -08:00
Victoria Lim 903174de20
correct errors on compaction doc (#12308) 2022-03-04 15:33:35 -08:00
Gian Merlino 3b373114dc
Officially support Java 11. (#12232)
There aren't any changes in this patch that improve Java 11
compatibility; these changes have already been done separately. This
patch merely updates documentation and explicit Java version checks.

The log message adjustments in DruidProcessingConfig are there to make
things a little nicer when running in Java 11, where we can't measure
direct memory _directly_, and so we may auto-size processing buffers
incorrectly.
2022-03-04 14:15:45 -08:00
Gian Merlino ada3ae08df
Retain order in TaskReport. (#12005) 2022-03-04 08:06:20 -08:00
Sandeep 61e1ffc7f7
add a new query laning metrics to visualize lane assignment (#12111)
* add a new query laning metrics to visualize lane assignment

* fixes :spotbugs check

* Update docs/operations/metrics.md

Co-authored-by: Benedict Jin <asdf2014@apache.org>

* Update server/src/main/java/org/apache/druid/server/QueryScheduler.java

Co-authored-by: Benedict Jin <asdf2014@apache.org>

* Update server/src/main/java/org/apache/druid/server/QueryScheduler.java

Co-authored-by: Benedict Jin <asdf2014@apache.org>

Co-authored-by: Benedict Jin <asdf2014@apache.org>
2022-03-04 15:21:17 +08:00
Frank Chen 36bc41855d
Set Content-Type for String based response (#12295) 2022-03-04 15:17:03 +08:00
Samarth Jain 58d05d7014
Fix ci (#12304) 2022-03-03 23:05:50 -08:00
dependabot[bot] a1cdee2a3a
Bump jersey.version from 1.19.3 to 1.19.4 (#12290)
* Bump jersey.version from 1.19.3 to 1.19.4

Bumps `jersey.version` from 1.19.3 to 1.19.4.

Updates `jersey-client` from 1.19.3 to 1.19.4

Updates `jersey-core` from 1.19.3 to 1.19.4

Updates `jersey-grizzly2` from 1.19.3 to 1.19.4

Updates `jersey-guice` from 1.19.3 to 1.19.4

Updates `jersey-server` from 1.19.3 to 1.19.4

Updates `jersey-servlet` from 1.19.3 to 1.19.4

Updates `jersey-json` from 1.19.3 to 1.19.4

Updates `jersey-test-framework-core` from 1.19.3 to 1.19.4

Updates `jersey-test-framework-grizzly2` from 1.19.3 to 1.19.4

---
updated-dependencies:
- dependency-name: com.sun.jersey:jersey-client
  dependency-type: direct:development
  update-type: version-update:semver-patch
- dependency-name: com.sun.jersey:jersey-core
  dependency-type: direct:development
  update-type: version-update:semver-patch
- dependency-name: com.sun.jersey:jersey-grizzly2
  dependency-type: direct:development
  update-type: version-update:semver-patch
- dependency-name: com.sun.jersey.contribs:jersey-guice
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: com.sun.jersey:jersey-server
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: com.sun.jersey:jersey-servlet
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: com.sun.jersey:jersey-json
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: com.sun.jersey.jersey-test-framework:jersey-test-framework-core
  dependency-type: direct:development
  update-type: version-update:semver-patch
- dependency-name: com.sun.jersey.jersey-test-framework:jersey-test-framework-grizzly2
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update licenses.yaml

* Update licenses.yaml

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Clint Wylie <cwylie@apache.org>
2022-03-04 09:57:20 +08:00
Clint Wylie 1c004ea47e
use virtual columns for sql simple aggregators instead of inline expressions (#12251)
* use virtual columns for sql simple aggregators instead of inline expressions

* fixes

* always use virtual columns

* add more tests
2022-03-03 15:05:28 -08:00
Jason Koch 36193955b6
perf: eliminate expensive log construction in remote-task-runner shutdown (#12097) 2022-03-03 13:38:21 -08:00
Jason Koch f594e7ac24
perf: improve RemoteTaskRunner task assignment loop performance (#12096)
* perf: improve ZkWorker task lookup performance

This improves the performance of the ZkWorker task lookup loop by
eliminating repeat calls to getRunningTasks() in toImmutable(),
and reduces the work performed in isRunningTask() to stream-parse
the id field instead of entire JSON blob.
2022-03-02 09:38:32 -08:00
Tejaswini Bandlamudi 1af4c9c933
Display row stats for multiphase parallel indexing tasks (#12280)
Row stats are reported for single phase tasks in the `/liveReports` and `/rowStats` APIs
and are also a part of the overall task report. This commit adds changes to report
row stats for multiphase tasks too.

Changes:
- Add `TaskReport` in `GeneratedPartitionsReport` generated during hash and range partitioning
- Collect the reports for `index_generate` phase in `ParallelIndexSupervisorTask`
2022-03-02 10:10:31 +05:30