Commit Graph

423 Commits

Author SHA1 Message Date
Maytas Monsereenusorn ce4dd48bb8
Support custom coordinator duties (#11601)
* impl

* fix checkstyle

* fix checkstyle

* fix checkstyle

* add test

* add test

* add test

* add integration tests

* add integration tests

* add more docs

* address comments

* address comments

* address comments

* add test

* fix checkstyle

* fix test
2021-08-19 11:54:11 +07:00
Parag Jain c7b46671b3
option to use deep storage for storing shuffle data (#11507)
Fixes #11297.

Description

Description and design are in the proposal #11297.

Key changed/added classes in this PR

    * DataSegmentPusher
    * ShuffleClient
    * PartitionStat
    * PartitionLocation
    * IntermediaryDataManager
2021-08-13 16:40:25 -04:00
Maytas Monsereenusorn 06bae29979
Fix ingestion task failure when no input split to process (#11553)
* fix ingestion task failure when no input split to process

* add IT

* fix IT
2021-08-09 23:11:08 +07:00
dependabot[bot] 511bc964ff
Bump docker-java-transport-netty from 3.2.8 to 3.2.11 (#11532)
Bumps [docker-java-transport-netty](https://github.com/docker-java/docker-java) from 3.2.8 to 3.2.11.
- [Release notes](https://github.com/docker-java/docker-java/releases)
- [Changelog](https://github.com/docker-java/docker-java/blob/master/CHANGELOG.md)
- [Commits](https://github.com/docker-java/docker-java/compare/3.2.8...3.2.11)

---
updated-dependencies:
- dependency-name: com.github.docker-java:docker-java-transport-netty
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-03 17:53:22 -07:00
Jonathan Wei 676efb1b3f
Fix integration test credential resource path handling (#11487)
This PR fixes an issue with the integration test copy_resources.sh script.

The "install druid jars" portion was removing the $SHARED_DIR/docker directory, which wiped out the $SHARED_DIR/docker/extensions and $SHARED_DIR/docker/credentials directories created just before it. This led to failures later in the script when copying resources to the $SHARED_DIR/docker/credentials/ dir.
2021-07-27 12:32:34 +05:30
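The ordering problem described in #11487 can be reproduced in a few lines. This is a minimal sketch, not the script itself: `prepare_docker_dir` and its `clean_first` flag are hypothetical names, and cleaning before creating is only one possible ordering that avoids the problem; the actual change to copy_resources.sh may differ.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_docker_dir(shared_dir: Path, clean_first: bool) -> bool:
    """Create docker/extensions and docker/credentials under shared_dir.

    Returns True if docker/credentials still exists afterwards.
    (Hypothetical helper, for illustration only.)
    """
    docker = shared_dir / "docker"
    subdirs = [docker / "extensions", docker / "credentials"]
    if clean_first:
        # Safe order: wipe the parent first, then create the subdirectories.
        shutil.rmtree(docker, ignore_errors=True)
        for d in subdirs:
            d.mkdir(parents=True, exist_ok=True)
    else:
        # Problematic order: the subdirectories are created and then
        # removed together with their parent.
        for d in subdirs:
            d.mkdir(parents=True, exist_ok=True)
        shutil.rmtree(docker, ignore_errors=True)
    return (docker / "credentials").is_dir()


def demo() -> tuple:
    """Run both orderings in a throwaway temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        buggy = prepare_docker_dir(Path(tmp) / "buggy", clean_first=False)
        fixed = prepare_docker_dir(Path(tmp) / "fixed", clean_first=True)
    return buggy, fixed
```

With the problematic ordering, any later copy into docker/credentials fails because the directory no longer exists.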
Maytas Monsereenusorn 161f4dbc0e
Add integration tests for S3 Assume Role ingestion feature (#11472)
* add IT for S3 assume role

* fix checkstyle

* fix test

* fix pom

* fix test
2021-07-23 10:09:09 +07:00
Maytas Monsereenusorn d3e82b1114
speed up test (#11442)
2021-07-14 21:14:38 +07:00
Maytas Monsereenusorn 05d5dd9289
compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded (#11426)
* compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded

* compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded

* compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded

* fix test

* fix test
2021-07-13 09:48:06 +07:00
Maytas Monsereenusorn f5d53569ca
Supervisor metadata auto cleanup failing as missing Guice injection (#11424)
* Fix Supervisor metadata auto cleanup failing as missing Guice injection

* Fix Supervisor metadata auto cleanup failing as missing Guice injection

* fix IT

* fix IT

* Update services/src/main/java/org/apache/druid/cli/CliCoordinator.java

Co-authored-by: Clint Wylie <cjwylie@gmail.com>

* fix

* fix

* fix

* fix

* fix

* fix

* fix

Co-authored-by: Clint Wylie <cjwylie@gmail.com>
2021-07-13 09:47:49 +07:00
Abhishek Agarwal e228a84d91
Fix retry sleep when callable throws exception (#11430)
If the callable throws an exception, we neither increase the retry count nor sleep the thread.
2021-07-11 15:06:10 +05:30
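The message for #11430 reads as a description of the behavior before the fix: an exception from the callable skipped both the retry counter and the sleep. A minimal sketch of a loop that counts and sleeps on every failed attempt is shown below. The `retry` function, its parameters, and the injectable `sleep` argument are assumptions made for illustration and are not Druid's API.

```python
import time
from typing import Callable


def retry(task: Callable[[], bool], max_tries: int, delay: float,
          sleep: Callable[[float], None] = time.sleep) -> int:
    """Call task until it returns True; return the number of attempts used.

    An exception from task counts as a failed attempt: the counter is
    still incremented and the loop still sleeps before the next try.
    """
    tries = 0
    while tries < max_tries:
        tries += 1  # incremented before the call, so exceptions are counted
        try:
            if task():
                return tries
        except Exception:
            # Swallow and fall through to the sleep below instead of
            # looping again immediately.
            pass
        if tries < max_tries:
            sleep(delay)
    raise TimeoutError(f"task did not succeed after {max_tries} tries")
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a caller would normally leave the default.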
Clint Wylie 63fcd77c38
support using mariadb connector with mysql extensions (#11402)
* support using mariadb connector with mysql extensions

* cleanup and more tests

* fix test

* javadocs, more tests, etc

* style and more test

* more test more better

* missing pom

* more pom
2021-07-08 12:25:37 -07:00
Abhishek Agarwal 03a6a6d6e1
Replace Processing ExecutorService with QueryProcessingPool (#11382)
This PR refactors QueryRunnerFactory#mergeRunners to accept a new interface, QueryProcessingPool, instead of ExecutorService for concurrent execution of query runners. The interface lets custom extensions inject their own implementation for deciding which query runner to prioritize. The default implementation behaves as before and takes the query priority into account. QueryProcessingPool can also be used as a regular executor service. It has a dedicated method for accepting query execution work, so implementations can differentiate between regular async tasks and query execution tasks. This dedicated method also passes the QueryRunner object as part of the task information. This hook lets custom extensions carry any state from QuerySegmentWalker to QueryProcessingPool#mergeRunners, which is not possible currently.
2021-07-01 16:03:08 +05:30
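The idea in #11382 of a pool with a dedicated entry point for query work, ordered by priority, can be sketched roughly as follows. This is a single-threaded illustration under stated assumptions: `PriorityProcessingPool`, `submit_runner_task`, and `run_all` are hypothetical names and do not reflect the actual QueryProcessingPool interface, which is a Java executor service.

```python
import heapq
import itertools
from typing import Any, Callable, List


class PriorityProcessingPool:
    """Toy pool that runs query tasks in priority order (hypothetical)."""

    def __init__(self) -> None:
        self._queue: list = []
        # Monotonic counter: keeps FIFO order among equal priorities and
        # ensures heap entries never fall back to comparing runner objects.
        self._counter = itertools.count()

    def submit_runner_task(self, priority: int, runner: Any,
                           work: Callable[[Any], Any]) -> None:
        """Dedicated entry point for query work.

        The runner object travels with the task, so a custom pool could
        inspect it when deciding what to run first.
        """
        # heapq is a min-heap, so negate the priority to pop highest first.
        heapq.heappush(self._queue,
                       (-priority, next(self._counter), runner, work))

    def run_all(self) -> List[Any]:
        """Drain the queue, highest priority first, and collect results."""
        results = []
        while self._queue:
            _, _, runner, work = heapq.heappop(self._queue)
            results.append(work(runner))
        return results
```

A custom implementation could replace the ordering key with any policy derived from the runner, which is the kind of hook the commit message describes.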
frank chen 906a704c55
Eliminate ambiguities of KB/MB/GB in the doc (#11333)
* GB ---> GiB

* suppress spelling check

* MB --> MiB, KB --> KiB

* Use IEC binary prefix

* Add reference link

* Fix doc style
2021-06-30 13:42:45 -07:00
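The ambiguity that #11333 removes from the docs is the gap between decimal (SI) and binary (IEC) prefixes. The sketch below shows the arithmetic; the `to_bytes` helper is a hypothetical name introduced here and is not part of Druid.

```python
# Decimal (SI) prefixes are powers of 1000.
SI_UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9}
# Binary (IEC) prefixes are powers of 1024.
IEC_UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30}


def to_bytes(value: int, unit: str) -> int:
    """Convert a size in the given unit to bytes (hypothetical helper)."""
    factors = {**SI_UNITS, **IEC_UNITS}
    if unit not in factors:
        raise ValueError(f"unknown unit: {unit}")
    return value * factors[unit]


# Difference between one binary and one decimal gigabyte, in bytes.
GIB_MINUS_GB = to_bytes(1, "GiB") - to_bytes(1, "GB")
```

The gap is roughly 7% at the giga scale, which matters when a documented heap or buffer size is copied into a configuration that interprets the suffix differently.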
Xavier Léauté 3ad6a3d74f
switch to netty-bom instead of individual dependencies (#11356)
2021-06-29 12:52:12 -07:00
Kashif Faraz f0b105ec63
Temporarily skip compaction for locked intervals (#11190)
* Add overlord API /lockedIntervals. Skip compaction for locked intervals

* Revert formatting changes

* Add license info

* Fix checkstyle

* Remove invalid API invocation

* Fix checkstyle

* Add DatasourceIntervalsTest

* Fix checkstyle

* Remove LockedIntervalsResponse

* Add integration tests for lockedIntervals

* Add ITAutoCompactionLockContentionTest

* Add config druid.coordinator.compaction.skipLockedIntervals

* Add test for TaskQueue
2021-06-20 17:21:59 -07:00
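Skipping compaction for locked intervals, as #11190 describes, amounts to an interval-overlap filter. The sketch below assumes half-open intervals expressed as integer pairs; the function names and the `skip_locked_intervals` parameter are illustrative stand-ins (the latter mirrors the druid.coordinator.compaction.skipLockedIntervals config named in the commit), not the coordinator's actual code.

```python
from typing import List, Tuple

# Half-open interval [start, end), e.g. in epoch milliseconds.
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one instant."""
    return a[0] < b[1] and b[0] < a[1]


def compactible_intervals(candidates: List[Interval],
                          locked: List[Interval],
                          skip_locked_intervals: bool = True) -> List[Interval]:
    """Return the candidates that do not overlap any locked interval.

    With skip_locked_intervals=False, every candidate is returned unchanged.
    """
    if not skip_locked_intervals:
        return list(candidates)
    return [c for c in candidates
            if not any(overlaps(c, l) for l in locked)]
```

Because locks are released when the tasks holding them finish, an interval filtered out in one run would presumably become eligible again in a later run, which matches the "temporarily" in the commit title.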
dependabot[bot] 1e8b5360b3
Bump docker-java-transport-netty from 3.2.0 to 3.2.8 (#11337)
Bumps [docker-java-transport-netty](https://github.com/docker-java/docker-java) from 3.2.0 to 3.2.8.
- [Release notes](https://github.com/docker-java/docker-java/releases)
- [Changelog](https://github.com/docker-java/docker-java/blob/master/CHANGELOG.md)
- [Commits](https://github.com/docker-java/docker-java/compare/3.2.0...3.2.8)

---
updated-dependencies:
- dependency-name: com.github.docker-java:docker-java-transport-netty
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-06-07 18:58:38 -07:00
zachjsh 27f1b6cbf3
Fix Index hadoop failing with index.zip is not a valid DFS filename (#11316)
* * Fix bug

* * simplify class loading

* * fix example configs for integration tests

* Small classloader cleanup

Co-authored-by: jon-wei <jon.wei@imply.io>
2021-06-01 19:14:50 -04:00
Maytas Monsereenusorn e5633d7842
Fix bug: 502 bad gateway thrown when we edit/delete any auto compaction config created 0.21.0 or before (#11311)
* fix bug

* add test

* fix IT

* fix checkstyle

* address comments
2021-05-27 16:34:32 -07:00
Xavier Léauté b517c3339b
remove ZooKeeper 3.4 support + pass tests with Java 15 (#11073)
With this change, Druid will only support ZooKeeper 3.5.x and later.

In order to support Java 15 we need to switch to ZK 3.5.x client libraries and drop support for ZK 3.4.x
(see #10780 for the detailed reasons) 

* remove ZooKeeper 3.4.x compatibility
* exclude additional ZK 3.5.x netty dependencies to ensure we use our version
* keep ZooKeeper version used for integration tests in sync with client library version
* remove the need to specify ZK version at runtime for docker
* add support to run integration tests with JDK 15
* build and run unit tests with Java 15 in travis
2021-05-25 12:49:49 -07:00
fhan 82380b67e0
Improve IT job 79 ITNestedQueryPushDownTest integration test (#11268)
* improve occasional failure caused by resource competition

* adjust more configs in tiny-cluters.yaml

Co-authored-by: yfhanfei <yfhanfei@ZBMac-C02DW5SMMD6P.local>
2021-05-24 10:12:34 +08:00
Agustin Gonzalez 383daa4029
Catch exception inside ITRetryUtil to fix one of the causes for flaky integration tests (#11265)
* Do not stop retrying when an exception is encountered. Save & propagate last exception if retry count is exceeded.

* Add one more log message to help with debugging

* Limit schema registry heap to attempt to control OOMs
2021-05-19 13:56:02 -07:00
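The first bullet of #11265 (keep retrying after an exception, then propagate the last one once the retry count is exceeded) can be sketched with Python's exception chaining. `retry_keep_last_error` is a hypothetical name; the real ITRetryUtil is Java and its signature is not shown in this log.

```python
from typing import Callable, Optional


def retry_keep_last_error(task: Callable[[], bool], max_tries: int) -> int:
    """Retry task until it returns True; return the attempts used.

    Exceptions do not stop the loop. If every attempt fails, the most
    recent exception is attached as the cause of the final error, so the
    failure that actually occurred is visible in the traceback.
    """
    last_error: Optional[BaseException] = None
    for attempt in range(1, max_tries + 1):
        try:
            if task():
                return attempt
        except Exception as e:
            last_error = e  # remember it and keep retrying
    raise RuntimeError(f"gave up after {max_tries} tries") from last_error
```

Sleeping between attempts is omitted here to keep the focus on error propagation.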
Clint Wylie 933350d106
integration test runner xmx (#11273)
* integration test runner xmx

* smaller
2021-05-19 12:59:50 -07:00
Yi Yuan 3be8e29269
Add integration test for protobuf (#11126)
* add file test

* test

* for test

* bug fixed

* test

* test

* test

* bug fixed

* delete auto scaler

* add input format

* add extensions

* bug fixed

* bug fixed

* bug fixed

* revert

* add schema registry test

* bug fixed

* bug fixed

* delete desc

* delete change

* add desc

* bug fixed

* test inputformat

* bug fixed

* bug fixed

* bug fixed

* bug fixed

* delete io exception

* change builder not static

* change pom

* bug fixed

Co-authored-by: yuanyi <yuanyi@freewheel.tv>
2021-05-17 15:45:07 -07:00
Xavier Léauté 3b9dad4c9e
Consolidate the number of Dockerfiles (#11187)
* Consolidate the number of Dockerfiles

* add build-arguments to choose which Java base image to use at runtime
* default to building image with Java 11
* base k8s integration test image off of the default image: this ensures
  our docker image now gets tested as part of integration tests.

* upgrade maven help plugin to 3.2.0
2021-05-07 10:41:34 -07:00
zachjsh 99f39c7202
Hadoop segment index file rename (#11194)
* Do stuff

* Do more stuff

* * Do more stuff

* * Do more stuff

* * working

* * cleanup

* * more cleanup

* * more cleanup

* * add license header

* * Add unit tests

* * add java docs

* * add more unit tests

* * Cleanup test

* * Move removing of workingPath to index task rather than in hadoop job.

* * Address review comments

* * remove unused import

* * Address review comments

* Do not overwrite segment descriptor for segment if it already exists.

* * add comments to FileSystemHelper class

* * fix local hadoop integration test

* * Fix failing test failures when running with java11

* Revert "Revert "Adjust HadoopIndexTask temp segment renaming to avoid potential race conditions (#11075)" (#11151)"

This reverts commit 49a9c3ffb7.

* * remove JobHelperPowerMockTest

* * remove FileSystemHelper class
2021-05-04 20:22:18 -04:00
frank chen 204901a602
Fix Smile encoding for HTTP response (#10980)
* fix Smile encoding bug

Signed-off-by: frank chen <frank.chen021@outlook.com>

* Add unit tests

* Add IT for smile encoding

* Fix cases

* Update javadoc

Co-authored-by: Jihoon Son <jihoonson@apache.org>

* resolve comments

Co-authored-by: Jihoon Son <jihoonson@apache.org>
2021-05-03 22:43:47 -07:00
Xavier Léauté 0296f20551
upgrade Apache Kafka to 2.8.0 (#11139)
* upgrade to Apache Kafka 2.8.0 (release notes:
  https://downloads.apache.org/kafka/2.8.0/RELEASE_NOTES.html)
* pass Kafka version as a Docker argument in integration tests
  to keep in sync with maven version
* fix use of internal Kafka APIs in integration tests
2021-04-24 08:27:07 -07:00
Jonathan Wei 49a9c3ffb7
Revert "Adjust HadoopIndexTask temp segment renaming to avoid potential race conditions (#11075)" (#11151)
This reverts commit a2892d9c40.
2021-04-22 15:33:27 -07:00
zachjsh a2892d9c40
Adjust HadoopIndexTask temp segment renaming to avoid potential race conditions (#11075)
* Do stuff

* Do more stuff

* * Do more stuff

* * Do more stuff

* * working

* * cleanup

* * more cleanup

* * more cleanup

* * add license header

* * Add unit tests

* * add java docs

* * add more unit tests

* * Cleanup test

* * Move removing of workingPath to index task rather than in hadoop job.

* * Address review comments

* * remove unused import

* * Address review comments

* Do not overwrite segment descriptor for segment if it already exists.

* * add comments to FileSystemHelper class

* * fix local hadoop integration test
2021-04-21 12:24:31 -07:00
Yi Yuan d0a94a8c14
add avro stream input format (#11040)
* add avro stream input format

* bug fixed

* add document

* doc fix

* change doc

* add integration test

* bug fixed

* bug fixed

* add string as binary getter

Co-authored-by: yuanyi <yuanyi@freewheel.tv>
2021-04-12 21:53:41 -07:00
Jihoon Son a6a2758095
More unit tests for JsonParserIterator; Integration tests for query errors (#11091)
* unit tests for timeout exception in init

* integration tests

* run integration test on travis

* fix inspection
2021-04-12 15:08:50 -07:00
Jonathan Wei e7b2ecd0fd
Add retry around query loop in ITWikipediaQueryTest.testQueryLaningLaneIsLimited (#11077)
2021-04-09 10:54:34 -07:00
Maytas Monsereenusorn 4576152e4a
Make dropExisting flag for Compaction configurable and add warning documentations (#11070)
* Make dropExisting flag for Compaction configurable

* fix checkstyle

* fix checkstyle

* fix test

* add tests

* fix spelling

* fix docs

* add IT

* fix test

* fix doc

* fix doc
2021-04-09 00:12:28 -07:00
Lucas Capistrant 8264203cee
Allow client to configure batch ingestion task to wait to complete until segments are confirmed to be available by other (#10676)
* Add ability to wait for segment availability for batch jobs

* IT updates

* fix queries in legacy hadoop IT

* Fix broken indexing integration tests

* address an lgtm flag

* spell checker still flagging for hadoop doc. adding under that file header too

* fix compaction IT

* Updates to wait for availability method

* improve unit testing for patch

* fix bad indentation

* refactor waitForSegmentAvailability

* Fixes based off of review comments

* cleanup to get compile after merging with master

* fix failing test after previous logic update

* add back code that must have gotten deleted during conflict resolution

* update some logging code

* fixes to get compilation working after merge with master

* reset interrupt flag in catch block after code review pointed it out

* small changes following self-review

* fixup some issues brought on by merge with master

* small changes after review

* cleanup a little bit after merge with master

* Fix potential resource leak in AbstractBatchIndexTask

* syntax fix

* Add a Compaction TuningConfig type

* add docs stipulating the lack of support by Compaction tasks for the new config

* Fixup compilation errors after merge with master

* Remove erroneous newline
2021-04-08 21:03:00 -07:00
zhangyue19921010 de691808ce
[Bug]Kinesis-data-format IT can not work (#11071)
* start schema-registry and replace json template

* add docs

Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-04-08 15:50:04 -07:00
Xavier Léauté 15bdd6bc2f
Fix unit tests and GC settings for Java 15 (#11074)
* JavaScript script engine support was removed in JDK 15: skip those tests for JDKs without it
* Fix flaky HTTP client tests with Java 15
* Switch from CMS to G1GC in integration tests, since CMS is no longer available in JDK 15
2021-04-08 10:33:37 -07:00
Yi Yuan 053af6815d
bug fixed (#11066)
Co-authored-by: yuanyi <yuanyi@freewheel.tv>
2021-04-06 10:39:06 +08:00
Maytas Monsereenusorn d7f5293364
Add an option for ingestion task to drop (mark unused) all existing segments that are contained by interval in the ingestionSpec (#11025)
* Auto-Compaction can run indefinitely when segmentGranularity is changed from coarser to finer.

* Add option to drop segments after ingestion

* fix checkstyle

* add tests

* add tests

* add tests

* fix test

* add tests

* fix checkstyle

* fix checkstyle

* add docs

* fix docs

* address comments

* address comments

* fix spelling
2021-04-01 12:29:36 -07:00
Lasse Krogh Mammen 782a1d4e6c
Add Calcite Avatica protobuf handler (#10543)
2021-03-31 12:46:25 -07:00
Himadri Singh 74ae2eb71a
Fix Integration Tests (#11046)
2021-03-30 01:03:49 +05:30
Clint Wylie f160548231
maybe fix leadership integration test flakes (#11031)
2021-03-26 03:43:06 -07:00
Gian Merlino bf20f9e979
DruidInputSource: Fix issues in column projection, timestamp handling. (#10267)
* DruidInputSource: Fix issues in column projection, timestamp handling.

DruidInputSource, DruidSegmentReader changes:

1) Remove "dimensions" and "metrics". They are not necessary, because we
   can compute which columns we need to read based on what is going to
   be used by the timestamp, transform, dimensions, and metrics.
2) Start using ColumnsFilter (see below) to decide which columns we need
   to read.
3) Actually respect the "timestampSpec". Previously, it was ignored, and
   the timestamp of the returned InputRows was set to the `__time` column
   of the input datasource.

(1) and (2) together fix a bug in which the DruidInputSource would not
properly read columns that are used as inputs to a transformSpec.

(3) fixes a bug where the timestampSpec would be ignored if you attempted
to set the column to something other than `__time`.

(1) and (3) are breaking changes.

Web console changes:

1) Remove "Dimensions" and "Metrics" from the Druid input source.
2) Set timestampSpec to `{"column": "__time", "format": "millis"}` for
   compatibility with the new behavior.

Other changes:

1) Add ColumnsFilter, a new class that allows input readers to determine
   which columns they need to read. Currently, it's only used by the
   DruidInputSource, but it could be used by other columnar input sources
   in the future.
2) Add a ColumnsFilter to InputRowSchema.
3) Remove the metric names from InputRowSchema (they were unused).
4) Add InputRowSchemas.fromDataSchema method that computes the proper
   ColumnsFilter for given timestamp, dimensions, transform, and metrics.
5) Add "getRequiredColumns" method to TransformSpec to support the above.

* Various fixups.

* Uncomment incorrectly commented lines.

* Move TransformSpecTest to the proper module.

* Add druid.indexer.task.ignoreTimestampSpecForDruidInputSource setting.

* Fix.

* Fix build.

* Checkstyle.

* Misc fixes.

* Fix test.

* Move config.

* Fix imports.

* Fixup.

* Fix ShuffleResourceTest.

* Add import.

* Smarter exclusions.

* Fixes based on tests.

Also, add TIME_COLUMN constant in the web console.

* Adjustments for tests.

* Reorder test data.

* Update docs.

* Update docs to say Druid 0.22.0 instead of 0.21.0.

* Fix test.

* Fix ITAutoCompactionTest.

* Changes from review & from merging.
2021-03-25 10:32:21 -07:00
Maytas Monsereenusorn c87ac0823f
Auto-compaction with segment granularity retrieve incomplete segments from timeline when interval overlap (#11019)
* Fix Auto-compaction with segment granularity retrieve incomplete segments from timeline when interval overlap

* Fix Auto-compaction with segment granularity retrieve incomplete segments from timeline when interval overlap

* Fix Auto-compaction with segment granularity retrieve incomplete segments from timeline when interval overlap

* Fix Auto-compaction with segment granularity retrieve incomplete segments from timeline when interval overlap

* address comments
2021-03-24 11:37:29 -07:00
Jihoon Son 6aec8f0c1b
allow multiple ldap bootstrap files for integration tests (#11023)
2021-03-23 13:18:36 -07:00
Maytas Monsereenusorn 51d2c61f1c
Auto-compaction with segment granularity should skip segments that already have the configured segmentGranularity (#11009)
* Auto-compaction with segment granularity should skip segments that already have the configured segmentGranularity

* Auto-compaction with segment granularity should skip segments that already have the configured segmentGranularity

* Auto-compaction with segment granularity should skip segments that already have the configured segmentGranularity

* address comments

* address comments

* address comments

* address comments

* address comments
2021-03-19 17:38:28 -07:00
Maytas Monsereenusorn f19c2e9ce4
If ingested data has sparse columns, the ingested data with forceGuaranteedRollup=true can result in imperfect rollup and final dimension ordering can be different from dimensionSpec ordering in the ingestionSpec (#10948)
* add IT

* add IT

* add the fix

* fix checkstyle

* fix compile

* fix compile

* fix test

* fix test

* address comments
2021-03-18 17:04:28 -07:00
Maytas Monsereenusorn f37713dc6d
Fix auto compaction with mixed versions in the same time chunk based on new segment granularity (#11000)
2021-03-16 12:48:19 -07:00
Xavier Léauté 68781a0d20
update testing frameworks for Java 15 support (#10984)
* update jacoco to 0.8.6
* update easymock to 4.2
* update equalsverifier to 3.5.5
* update mockito to 3.8.0
* update powermock to 2.0.9
* update assertj-core to 3.19.0
* update testng to 7.3.0
  - fix DTD url security for testng 7.x
  - fix backwards incompatibility in testng 7.x
2021-03-12 20:18:13 -08:00
Maytas Monsereenusorn ed91a2bb38
Fix Kinesis should not increment throwAway count on EOS record (#10976)
* fix Kinesis increment throwAway on EOS record

* fix checkstyle

* fix IT

* fix test

* fix IT

* fix IT

* fix IT

* fix IT
2021-03-11 22:04:58 -08:00
Vyatcheslav Mogilevsky b0432be07a
Apache archive mirror (#10979)
* Ability to use mirror of archive.apache.org

* Ability to use mirror of archive.apache.org: documentation

* Ability to use mirror of archive.apache.org: fix int test Dockerfile: missing COPY instruction
2021-03-11 09:07:51 -08:00