druid

Commit Graph

Author	SHA1	Message	Date
Jihoon Son	5e23674fe5	Fix a race condition in the '/tasks' Overlord API (#12330 ) * finds complete and active tasks from the same snapshot * overlord resource * unit test * integration test * javadoc and cleanup * more cleanup * fix test and add more	2022-03-17 10:47:45 +09:00
Agustin Gonzalez	abe76ccb90	Batch ingestion replace (#12137 ) * Tombstone support for replace functionality * A used segment interval is the interval of a current used segment that overlaps any of the input intervals for the spec * Update compaction test to match replace behavior * Adapt ITAutoCompactionTest to work with tombstones rather than dropping segments. Add support for tombstones in the broker. * Style plus simple queriableindex test * Add segment cache loader tombstone test * Add more tests * Add a method to the LogicalSegment to test whether it has any data * Test filter with some empty logical segments * Refactor more compaction/dropexisting tests * Code coverage * Support for all empty segments * Skip tombstones when looking-up broker's timeline. Discard changes made to tool chest to avoid empty segments since they will no longer have empty segments after lookup because we are skipping over them. * Fix null ptr when segment does not have a queriable index * Add support for empty replace interval (all input data has been filtered out) * Fixed coverage & style * Find tombstone versions from lock versions * Test failures & style * Interner was making this fail since the two segments were consider equal due to their id's being equal * Cleanup tombstone version code * Force timeChunkLock whenever replace (i.e. dropExisting=true) is being used * Reject replace spec when input intervals are empty * Documentation * Style and unit test * Restore test code deleted by mistake * Allocate forces TIME_CHUNK locking and uses lock versions. TombstoneShardSpec added. * Unused imports. Dead code. Test coverage. * Coverage. * Prevent killer from throwing an exception for tombstones. This is the killer used in the peon for killing segments. * Fix OmniKiller + more test coverage. * Tombstones are now marked using a shard spec * Drop a segment factory.json in the segment cache for tombstones * Style * Style + coverage * style * Add TombstoneLoadSpec.class to mapper in test * Update core/src/main/java/org/apache/druid/segment/loading/TombstoneLoadSpec.java Typo Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com> * Update docs/configuration/index.md Missing Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com> * Typo * Integrated replace with an existing test since the replace part was redundant and more importantly, the test file was very close or exceeding the 10 min default "no output" CI Travis threshold. * Range does not work with multi-dim Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>	2022-03-08 20:07:02 -07:00
Xavier Léauté	1434197ee1	update airline dependency to 2.x (#12270 ) * upgrade Airline to Airline 2 https://github.com/airlift/airline is no longer maintained, updating to https://github.com/rvesse/airline (Airline 2) to use an actively maintained version, while minimizing breaking changes. Note, this is a backwards incompatible change, and extensions relying on the CliCommandCreator extension point will also need to be updated. * fix dependency checks where jakarta.inject is now resolved first instead of javax.inject, due to Airline 2 using jakarta	2022-02-27 15:19:28 -08:00
Xavier Léauté	4c61878f9c	Reduce use of mocking and simplify some tests (#12283 ) * remove use of mocks for ServiceMetricEvent * simplify KafkaEmitterTests by moving to Mockito * speed up KafkaEmitterTest by adjusting reporting frequency in tests * remove unnecessary easymock and JUnitParams dependencies	2022-02-26 17:23:09 -08:00
AmatyaAvadhanula	1ec57cb935	Improve kinesis task assignment after resharding (#12235 ) Problem: - When a kinesis stream is resharded, the original shards are closed. Any intermediate shard created in the process is eventually closed as well. - If a shard is closed before any record is put into it, it can be safely ignored for ingestion. - It is expensive to determine if a closed shard is empty, since it requires a call to the Kinesis cluster. Changes: - Maintain a cache of closed empty and closed non-empty shards in `KinesisSupervisor` - Add config `skipIgnorableShards` to `KinesisSupervisorTuningConfig` - The caches are used and updated only when `skipIgnorableShards = true`	2022-02-18 12:37:06 +05:30
Abhishek Agarwal	575874705f	Fix the flakiness in getLockedIntervals test (#12172 ) Fix the flakiness in getLockedIntervals test	2022-02-17 12:08:46 +05:30
Daniel Koepke	47153cd7bd	Increase retries for Kinesis sharding integration tests. (#12255 ) This fixes intermittent, spurious failures that we've observed in the Kinesis sharding integration tests due to Kinesis taking longer than the code expected to start a sharding operation. The method that's changed is part of the integration test suite and only used by the test cases that we've seen are flaky. Prior to this change, the tests expected a sharding operation to start in 9 seconds (30 retries * 300ms delay/retry). This change bumps the number of retries to 100, giving Kinesis 30 seconds to start the sharding. This PR also makes a small, clarifying change to the condition used to determine if sharding has started. Instead of checking if the number of shards has increased (which was technically correct even if the test is reducing the number of shards due to a Kinesis implementation detail), we now just check if the shard count has changed.	2022-02-14 23:33:13 -08:00
Jihoon Son	ab3d994a17	Lazy instantiation for segmentKillers, segmentMovers, and segmentArchivers (#12207 ) * working * Lazily load segmentKillers, segmentMovers, and segmentArchivers * more tests * test-jar plugin * more coverage * lazy client * clean up changes * checkstyle * i did not change the branch condition * adjust failure rate to run tests faster * javadocs * checkstyle	2022-02-08 13:02:06 -08:00
Suneet Saldanha	ced1389d4c	Enable auto kill segments by default (#12187 ) * Enable auto-kill by default * tests * wip * test * fix IT * fix it * remove from docs * make coverage bot happy	2022-02-07 06:57:54 -08:00
Maytas Monsereenusorn	2b8e7fc0b4	Add a flag to allow auto compaction task slot ratio to consider auto scaler slots (#12228 ) * add impl * fix checkstyle * add unit tests * checkstyle * add IT * fix IT * add comments * fix checkstyle	2022-02-06 20:46:05 -08:00
Suneet Saldanha	159f97dcb0	Update docs for druid.processing.numThreads in brokers (#12231 ) * Update docs for druid.processing.numThreads * error msg * one more reference	2022-02-04 17:34:21 -08:00
Jihoon Son	20347e0c86	Wait for datasource to be ready for SQL in integration tests (#12189 ) * Wait for datasource to be ready for SQL in integration tests * add limit to the check query	2022-01-25 10:14:26 -08:00
AmatyaAvadhanula	1f63b447c4	Mitigate Kinesis stream LimitExceededException by using listShards API (#12161 ) Makes kinesis ingestion resilient to `LimitExceededException` caused by resharding. Replace `describeStream` with `listShards` (recommended) to get shard related info. `describeStream` has a limit (100) to the number of shards returned per call and a low default TPS limit of 10. `listShards` returns the info for at most 1000 shards and has a higher TPS limit of 100 as well. Key changed/added classes in this PR * `KinesisRecordSupplier` * `KinesisAdminClient`	2022-01-21 10:15:51 +05:30
Jihoon Son	cacfcfcdab	ignore hadoop-gcs directory already exists error for integration tests (#12169 )	2022-01-19 09:35:50 -08:00
Maytas Monsereenusorn	bd7fe45da0	Support adding metrics in Auto Compaction (#12125 ) * add impl * add impl * add unit tests * add unit tests * add unit tests * add unit tests * add unit tests * add integration tests * add integration tests * fix LGTM * fix test * remove doc	2022-01-17 20:19:31 -08:00
Jihoon Son	58378aa967	Move gcs-connector from lib to hadoop-dependencies for integration test (#12144 )	2022-01-12 16:47:34 -08:00
Frank Chen	c8ddf60851	Upgrade RSA Key from 1024 bit to 4096 to eliminate warnings (#11743 ) * eliminate warnings * Change the keyStore type to PKCS12	2022-01-11 13:24:09 +08:00
Jihoon Son	4a74c5adcc	Use Druid's extension loading for integration test instead of maven (#12095 ) * Use Druid's extension loading for integration test instead of maven * fix maven command * override config path * load input format extensions and kafka by default; add prepopulated-data group * all docker-composes are overridable * fix s3 configs * override config for all * fix docker_compose_args * fix security tests * turn off debug logs for overlord api calls * clean up stuff * revert docker-compose.yml * fix override config for query error test; fix circular dependency in docker compose * add back some dependencies in docker compose * new maven profile for integration test * example file filter	2022-01-05 23:33:04 -08:00
Maytas Monsereenusorn	b53e7f4d12	Support overlapping segment intervals in auto compaction (#12062 ) * add impl * add impl * fix more bugs * add tests * fix checkstyle * address comments * address comments * fix test	2022-01-04 11:47:38 -08:00
Frank Chen	58245b4617	Support JsonPath functions in JsonPath expressions (#11722 ) * Add jsonPath functions support * Add jsonPath function test for Avro * Add jsonPath function length() to Orc * Add jsonPath function length() to Parquet * Add more tests to ORC format * update doc * Fix exception during ingestion * Add IT test case * Revert "Fix exception during ingestion" This reverts commit `5a5484b9ea`. * update IT test case * Add 'keys()' * Commit IT test case * Fix UT	2021-12-10 10:53:23 +08:00
Jihoon Son	fc9513b6cd	Make NodeRole available during binding; add support for dynamic registration of DruidService (#12012 ) * Make nodeRole available during binding; add support for dynamic registration of DruidService * fix checkstyle and test * fix customRole test * address comments * add more javadoc	2021-12-03 11:59:00 -08:00
Paul Rogers	a66f10eea1	Code cleanup from query profile project (#11822 ) * Code cleanup from query profile project * Fix spelling errors * Fix Javadoc formatting * Abstract out repeated test code * Reuse constants in place of some string literals * Fix up some parameterized types * Reduce warnings reported by Eclipse * Reverted change due to lack of tests	2021-11-30 11:35:38 -08:00
Frank Chen	98957be044	Return HTTP 404 instead of 400 for supervisor/task endpoints (#11724 ) * Use 404 instead of 400 * Use 404 instead of 400 * Add UT test cases * Add IT testcases * add UT for task resource filter Signed-off-by: frank chen <frank.chen021@outlook.com> * Using org.testing.Assert instead of org.junit.Assert * Resolve comments and fix test * Fix test * Fix tests * Resolve comments	2021-11-25 13:09:47 +08:00
Maytas Monsereenusorn	bb3d2a433a	Support filtering data in Auto Compaction (#11922 ) * add impl * fix checkstyle * add test * add test * add unit tests * fix unit tests * fix unit tests * fix unit tests * add IT * add IT * add comments * fix spelling	2021-11-24 10:56:38 -08:00
Frank Chen	cfd60f1222	Improve README for integration test (#11860 ) * Optimize IT readme * Resolve comments	2021-11-22 21:32:36 +08:00
Gian Merlino	b13f07a057	Harmonize local input sources; fix batch index integration test. (#11965 ) * Make LocalInputSource.files a List instead of Set and adjust wikipedia_index_task to use file list. Rationale: the behavior of wikipedia_index_task.json is order-dependent with regard to its input files; some orders produce 4 segments and some produce 5 segments. Some integration tests, like ITSystemTableBatchIndexTaskTest and ITAutoCompactionTest, are written assuming that the 4-segment case will always happen. Providing the file list in a specific order ensures that this will happen as expected by the tests. I didn't see a specific reason why the LocalInputSource.files parameter needed to be a Set, so changing it to a List was the simplest way to achieve the consistent ordering. I think it will also make the behavior make more sense if someone does specify the same input file multiple times in a spec: I think they'd expect it to be loaded multiple times instead of deduped. This is consistent with the behavior of other input sources like S3, GCS, HTTP. * Sort files in LocalFirehoseFactory.	2021-11-21 22:26:31 -08:00
Frank Chen	2e3767bef0	Use the last ip as docker host ip (#11742 )	2021-11-20 13:31:39 +08:00
TSFenwick	1487f558b1	Use a simple class to sanitize JDBC exceptions and also log them (#11843 ) * Use a simple class to sanitize sanitizable errors and log them The purpose of this is to sanitize JDBC errors, but can sanitize other errors if they implement SanitizableError Interface add a class to log errors and sanitize them added a simple test that tests out that the error gets sanitized add @NonNull annotation to serverconfig's ErrorResponseTransfromStrategy * return less information as part of too many connections, and instead only log specific details This is so an end user gets relevant information but not too much info since they might know how many brokers they have * return only runtime exceptions added new error types that need to be sanitized also sanitize deprecated and unsupported exceptions. * don't rewrite exceptions unless necessary for checked exceptions add docs avoid blanket turning all exceptions into runtime exceptions * address comments, to fix up docs. add more javadocs add support UOE sanitization * use try catch instead and sanitize at public methods * checkstyle fixes * throw noSuchStatement and NoSuchConnection as Avatica is affected by those * address comments. move log error back to druid meta clean up bad formatting and commented code. add missed catch for NoSuchStatementException clean up comments for error handler and add comment explaining not wanting to sanitize avatica exceptions * alter test to reflect new error message	2021-11-16 13:13:03 -08:00
Gian Merlino	6f6e88e02e	SQL: Add type headers to response formats. (#11914 ) This allows clients to interpret the results of SQL queries without having to guess types.	2021-11-13 11:30:57 +05:30
Clint Wylie	5baa22148e	revert ColumnAnalysis type, add typeSignature and use it for DruidSchema (#11895 ) * revert ColumnAnalysis type, add typeSignature and use it for DruidSchema * review stuffs * maybe null * better maybe null * Update docs/querying/segmentmetadataquery.md * Update docs/querying/segmentmetadataquery.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * fix null right * sad * oops * Update batch_hadoop_queries.json Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2021-11-10 18:46:29 -08:00
Jihoon Son	13bec7468a	Fix NPE for SQL queries when a query parameter is missing in the mid (#11900 ) * Fix NPE for SQL queries when a query parameter is missing in the mid * checkstyle * Throw SqlPlanningException instead of IAE	2021-11-10 10:02:26 -08:00
Maytas Monsereenusorn	ddc68c6a81	Support changing dimension schema in Auto Compaction (#11874 ) * add impl * add unit tests * fix checkstyle * add impl * add impl * add impl * add impl * add impl * add impl * fix test * add IT * add IT * fix docs * add test * address comments * fix conflict	2021-11-08 21:17:08 -08:00
Karan Kumar	cf27366b35	Fixing typos in docker build scripts (#11866 )	2021-11-02 23:50:52 +05:30
Maytas Monsereenusorn	ba2874ee1f	Support changing query granularity in Auto Compaction (#11856 ) * add queryGranularity * fix checkstyle * fix test	2021-11-01 15:18:44 -07:00
Karan Kumar	90640bb316	Support for hadoop 3 via maven profiles (#11794 ) Add support for hadoop 3 profiles . Most of the details are captured in #11791 . We use a combination of maven profiles and resource filtering to achieve this. Hadoop2 is supported by default and a new maven profile with the name hadoop3 is created. This will allow the user to choose the profile which is best suited for the use case.	2021-10-30 22:46:24 +05:30
Maytas Monsereenusorn	33d9d9bd74	Add rollup config to auto and manual compaction (#11850 ) * add rollup to auto and manual compaction * add unit tests * add unit tests * add IT * fix checkstyle	2021-10-29 10:22:25 -07:00
Kashif Faraz	abac9e39ed	Revert permission changes to Supervisor and Task APIs (#11819 ) * Revert "Require Datasource WRITE authorization for Supervisor and Task access (#11718)" This reverts commit `f2d6100124`. * Revert "Require DATASOURCE WRITE access in SupervisorResourceFilter and TaskResourceFilter (#11680)" This reverts commit `6779c4652d`. * Fix docs for the reverted commits * Fix and restore deleted tests * Fix and restore SystemSchemaTest	2021-10-25 14:50:38 +05:30
Agustin Gonzalez	887cecf29e	Simplify ITHttpInputSourceTest to mitigate flakiness (#11751 ) * Increment retry count to add more time for tests to pass * Re-enable ITHttpInputSourceTest * Restore original count * This test is about input source, hash partitioning takes longer and not required thus changing to dynamic * Further simplify by removing sketches	2021-10-12 11:51:27 -05:00
Kashif Faraz	f2d6100124	Require Datasource WRITE authorization for Supervisor and Task access (#11718 ) Follow up PR for #11680 Description Supervisor and Task APIs are related to ingestion and must always require Datasource WRITE authorization even if they are purely informative. Changes Check Datasource WRITE in SystemSchema for tables "supervisors" and "tasks" Check Datasource WRITE for APIs /supervisor/history and /supervisor/{id}/history Check Datasource for all Indexing Task APIs	2021-10-08 10:39:48 +05:30
Jihoon Son	1c0b76ba93	Add killAndRestart for container for integration tests (#11754 )	2021-09-30 13:47:57 -07:00
Clint Wylie	11017ef00a	support jdbc even if trailing / is missing (#11737 ) * support jdbc even if trailing / is missing * fix tests	2021-09-29 13:59:26 -07:00
Maytas Monsereenusorn	a04b08e45c	Add new config to filter internal Druid-related messages from Query API response (#11711 ) * add impl * add impl * add tests * add unit test * fix checkstyle * address comments * fix checkstyle * fix checkstyle * fix checkstyle * fix checkstyle * fix checkstyle * address comments * address comments * address comments * fix test * fix test * fix test * fix test * fix test * change config name * change config name * change config name * address comments * address comments * address comments * address comments * address comments * address comments * fix compile * fix compile * change config * add more tests * fix IT	2021-09-29 12:55:49 +07:00
Agustin Gonzalez	988623b7ae	ITHttpInputSourceTest instability blocking the development pipeline (#11749 )	2021-09-28 13:42:01 -07:00
Clint Wylie	3525c0b195	make authorization integration test more extensible (#11730 )	2021-09-22 08:15:30 -07:00
Clint Wylie	5de26cf6d9	add optional system schema authorization (#11720 ) * add optional system schema authorization * remove unused * adjust docs * doc fixes, missing ldap config change for integration tests * style	2021-09-21 13:28:26 -07:00
Lucas Capistrant	5c3f3da146	Add handoff wait time to IngestionStatsAndErrorsTaskReportData (#11090 ) * Add handoff wait time to ingestion stats report. Refactor some code for batch handoff * fix checkstyle * Add assertion to AbstractITBatchIndexTask to make sure report reflects wait for segments happened * add docs to the task reports section of doc	2021-09-20 22:48:44 -07:00
Clint Wylie	fe1d8c206a	bump version to 0.23.0-SNAPSHOT (#11670 )	2021-09-08 15:56:04 -07:00
Jihoon Son	82049bbf0a	Cancel API for sqls (#11643 ) * initial work * reduce lock in sqlLifecycle * Integration test for sql canceling * javadoc, cleanup, more tests * log level to debug * fix test * checkstyle * fix flaky test; address comments * rowTransformer * cancelled state * use lock * explode instead of noop * oops * unused import * less aggressive with state * fix calcite charset * don't emit metrics when you are not authorized	2021-09-05 10:57:45 -07:00
Jihoon Son	7e90d00cc0	Configurable maxStreamLength for doubles sketches (#11574 ) * Configurable maxStreamLength for doubles sketches * fix equals/hashcode and it test failure * fix test * fix it test * benchmark * doc * grouping key * fix comment * dependency check * Update docs/development/extensions-core/datasketches-quantiles.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/sql.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/sql.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/sql.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/sql.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/sql.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/sql.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/sql.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2021-08-31 14:56:37 -07:00
Maytas Monsereenusorn	fc86a7a97f	fix custom coordinator duty (#11641 )	2021-08-31 14:04:00 +07:00
Maytas Monsereenusorn	6997fd787d	Add functionality for integration test to run command inside the Docker container (#11640 ) * add run cmd for IT * fix checkstyle * fix checkstyle	2021-08-31 13:26:26 +07:00
Clint Wylie	a09688862e	fix integration tests (#11638 ) * Update Dockerfile * Update docker_build_containers.sh * Update Dockerfile	2021-08-30 13:53:13 -07:00
Maytas Monsereenusorn	ce4dd48bb8	Support custom coordinator duties (#11601 ) * impl * fix checkstyle * fix checkstyle * fix checkstyle * add test * add test * add test * add integration tests * add integration tests * add more docs * address comments * address comments * address comments * add test * fix checkstyle * fix test	2021-08-19 11:54:11 +07:00
Parag Jain	c7b46671b3	option to use deep storage for storing shuffle data (#11507 ) Fixes #11297. Description Description and design in the proposal #11297 Key changed/added classes in this PR DataSegmentPusher ShuffleClient PartitionStat PartitionLocation *IntermediaryDataManager	2021-08-13 16:40:25 -04:00
Maytas Monsereenusorn	06bae29979	Fix ingestion task failure when no input split to process (#11553 ) * fix ingestion task failure when no input split to process * add IT * fix IT	2021-08-09 23:11:08 +07:00
dependabot[bot]	511bc964ff	Bump docker-java-transport-netty from 3.2.8 to 3.2.11 (#11532 ) Bumps [docker-java-transport-netty](https://github.com/docker-java/docker-java) from 3.2.8 to 3.2.11. - [Release notes](https://github.com/docker-java/docker-java/releases) - [Changelog](https://github.com/docker-java/docker-java/blob/master/CHANGELOG.md) - [Commits](https://github.com/docker-java/docker-java/compare/3.2.8...3.2.11) --- updated-dependencies: - dependency-name: com.github.docker-java:docker-java-transport-netty dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-08-03 17:53:22 -07:00
Jonathan Wei	676efb1b3f	Fix integration test credential resource path handling (#11487 ) This PR fixes an issue with the integration test copy_resources.sh script. The "install druid jars" portion was removing the $SHARED_DIR/docker directory, which wipes out the $SHARED_DIR/docker/extensions and $SHARED_DIR/docker/credentials directories created just before, which leads to issues later in the script when copying resources to the $SHARED_DIR/docker/credentials/ dir.	2021-07-27 12:32:34 +05:30
Maytas Monsereenusorn	161f4dbc0e	Add integration tests for S3 Assume Role ingestion feature (#11472 ) * add IT for S3 assume role * fix checkstyle * fix test * fix pom * fix test	2021-07-23 10:09:09 +07:00
Maytas Monsereenusorn	d3e82b1114	speed up test (#11442 )	2021-07-14 21:14:38 +07:00
Maytas Monsereenusorn	05d5dd9289	compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded (#11426 ) * compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded * compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded * compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded * fix test * fix test	2021-07-13 09:48:06 +07:00
Maytas Monsereenusorn	f5d53569ca	Supervisor metadata auto cleanup failing as missing Guice injection (#11424 ) * Fix Supervisor metadata auto cleanup failing as missing Guice injection * Fix Supervisor metadata auto cleanup failing as missing Guice injection * fix IT * fix IT * Update services/src/main/java/org/apache/druid/cli/CliCoordinator.java Co-authored-by: Clint Wylie <cjwylie@gmail.com> * fix * fix * fix * fix * fix * fix * fix Co-authored-by: Clint Wylie <cjwylie@gmail.com>	2021-07-13 09:47:49 +07:00
Abhishek Agarwal	e228a84d91	Fix retry sleep when callable throws exception (#11430 ) If the callable throws an exception, we neither increase the retry count nor sleep the thread.	2021-07-11 15:06:10 +05:30
Clint Wylie	63fcd77c38	support using mariadb connector with mysql extensions (#11402 ) * support using mariadb connector with mysql extensions * cleanup and more tests * fix test * javadocs, more tests, etc * style and more test * more test more better * missing pom * more pom	2021-07-08 12:25:37 -07:00
Abhishek Agarwal	03a6a6d6e1	Replace Processing ExecutorService with QueryProcessingPool (#11382 ) This PR refactors the code for QueryRunnerFactory#mergeRunners to accept a new interface called QueryProcessingPool instead of ExecutorService for concurrent execution of query runners. This interface will let custom extensions inject their own implementation for deciding which query-runner to prioritize first. The default implementation is the same as today that takes the priority of query into account. QueryProcessingPool can also be used as a regular executor service. It has a dedicated method for accepting query execution work so implementations can differentiate between regular async tasks and query execution tasks. This dedicated method also passes the QueryRunner object as part of the task information. This hook will let custom extensions carry any state from QuerySegmentWalker to QueryProcessingPool#mergeRunners which is not possible currently.	2021-07-01 16:03:08 +05:30
frank chen	906a704c55	Eliminate ambiguities of KB/MB/GB in the doc (#11333 ) * GB ---> GiB * suppress spelling check * MB --> MiB, KB --> KiB * Use IEC binary prefix * Add reference link * Fix doc style	2021-06-30 13:42:45 -07:00
Xavier Léauté	3ad6a3d74f	switch to netty-bom instead of individual dependencies (#11356 )	2021-06-29 12:52:12 -07:00
Kashif Faraz	f0b105ec63	Temporarily skip compaction for locked intervals (#11190 ) * Add overlord API /lockedIntervals. Skip compaction for locked intervals * Revert formatting changes * Add license info * Fix checkstyle * Remove invalid API invocation * Fix checkstyle * Add DatasourceIntervalsTest * Fix checkstyle * Remove LockedIntervalsResponse * Add integration tests for lockedIntervals * Add ITAutoCompactionLockContentionTest * Add config druid.coordinator.compaction.skipLockedIntervals * Add test for TaskQueue	2021-06-20 17:21:59 -07:00
dependabot[bot]	1e8b5360b3	Bump docker-java-transport-netty from 3.2.0 to 3.2.8 (#11337 ) Bumps [docker-java-transport-netty](https://github.com/docker-java/docker-java) from 3.2.0 to 3.2.8. - [Release notes](https://github.com/docker-java/docker-java/releases) - [Changelog](https://github.com/docker-java/docker-java/blob/master/CHANGELOG.md) - [Commits](https://github.com/docker-java/docker-java/compare/3.2.0...3.2.8) --- updated-dependencies: - dependency-name: com.github.docker-java:docker-java-transport-netty dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-06-07 18:58:38 -07:00
zachjsh	27f1b6cbf3	Fix Index hadoop failing with index.zip is not a valid DFS filename (#11316 ) * * Fix bug * * simplify class loading * * fix example configs for integration tests * Small classloader cleanup Co-authored-by: jon-wei <jon.wei@imply.io>	2021-06-01 19:14:50 -04:00
Maytas Monsereenusorn	e5633d7842	Fix bug: 502 bad gateway thrown when we edit/delete any auto compaction config created 0.21.0 or before (#11311 ) * fix bug * add test * fix IT * fix checkstyle * address comments	2021-05-27 16:34:32 -07:00
Xavier Léauté	b517c3339b	remove ZooKeeper 3.4 support + pass tests with Java 15 (#11073 ) With this change, Druid will only support ZooKeeper 3.5.x and later. In order to support Java 15 we need to switch to ZK 3.5.x client libraries and drop support for ZK 3.4.x (see #10780 for the detailed reasons) * remove ZooKeeper 3.4.x compatibility * exclude additional ZK 3.5.x netty dependencies to ensure we use our version * keep ZooKeeper version used for integration tests in sync with client library version * remove the need to specify ZK version at runtime for docker * add support to run integration tests with JDK 15 * build and run unit tests with Java 15 in travis	2021-05-25 12:49:49 -07:00
fhan	82380b67e0	Improve IT job 79 ITNestedQueryPushDownTest integration test (#11268 ) * improve occasional failure caused by resource competition * adjust more configs in tiny-cluters.yaml Co-authored-by: yfhanfei <yfhanfei@ZBMac-C02DW5SMMD6P.local>	2021-05-24 10:12:34 +08:00
Agustin Gonzalez	383daa4029	Catch exception inside ITRetryUtil to fix one of the causes for flaky integration tests (#11265 ) * Do not stop retrying when an exception is encountered. Save & propagate last exception if retry count is exceeded. * Add one more log message to help with debugging * Limit schema registry heap to attempt to control OOMs	2021-05-19 13:56:02 -07:00
Clint Wylie	933350d106	integration test runner xmx (#11273 ) * integration test runner xmx * smaller	2021-05-19 12:59:50 -07:00
Yi Yuan	3be8e29269	Add integration test for protobuf (#11126 ) * add file test * test * for test * bug fixed * test * test * test * bug fixed * delete auto scaler * add input format * add extensions * bug fixed * bug fixed * bug fixed * revert * add schema registry test * bug fixed * bug fixed * delete desc * delete change * add desc * bug fixed * test inputformat * bug fixed * bug fixed * bug fixed * bug fixed * delete io exception * change builder not static * change pom * bug fixed Co-authored-by: yuanyi <yuanyi@freewheel.tv>	2021-05-17 15:45:07 -07:00
Xavier Léauté	3b9dad4c9e	Consolidate the number of Dockerfiles (#11187 ) * Consolidate the number of Dockerfiles * add build-arguments to choose which Java base image to use at runtime * default to building image with Java 11 * base k8s integration test image off of the default image: this ensures our docker image now gets tested as part of integration tests. * upgrade maven help plugin to 3.2.0	2021-05-07 10:41:34 -07:00
zachjsh	99f39c7202	Hadoop segment index file rename (#11194 ) * Do stuff * Do more stuff * * Do more stuff * * Do more stuff * * working * * cleanup * * more cleanup * * more cleanup * * add license header * * Add unit tests * * add java docs * * add more unit tests * * Cleanup test * * Move removing of workingPath to index task rather than in hadoop job. * * Address review comments * * remove unused import * * Address review comments * Do not overwrite segment descriptor for segment if it already exists. * * add comments to FileSystemHelper class * * fix local hadoop integration test * * Fix failing test failures when running with java11 * Revert "Revert "Adjust HadoopIndexTask temp segment renaming to avoid potential race conditions (#11075)" (#11151)" This reverts commit `49a9c3ffb7`. * * remove JobHelperPowerMockTest * * remove FileSystemHelper class	2021-05-04 20:22:18 -04:00
frank chen	204901a602	Fix Smile encoding for HTTP response (#10980 ) * fix Smile encoding bug Signed-off-by: frank chen <frank.chen021@outlook.com> * Add unit tests * Add IT for smile encoding * Fix cases * Update javadoc Co-authored-by: Jihoon Son <jihoonson@apache.org> * resolve comments Co-authored-by: Jihoon Son <jihoonson@apache.org>	2021-05-03 22:43:47 -07:00
Xavier Léauté	0296f20551	upgrade Apache Kafka to 2.8.0 (#11139 ) * upgrade to Apache Kafka 2.8.0 (release notes: https://downloads.apache.org/kafka/2.8.0/RELEASE_NOTES.html) * pass Kafka version as a Docker argument in integration tests to keep in sync with maven version * fix use of internal Kafka APIs in integration tests	2021-04-24 08:27:07 -07:00
Jonathan Wei	49a9c3ffb7	Revert "Adjust HadoopIndexTask temp segment renaming to avoid potential race conditions (#11075 )" (#11151 ) This reverts commit `a2892d9c40`.	2021-04-22 15:33:27 -07:00
zachjsh	a2892d9c40	Adjust HadoopIndexTask temp segment renaming to avoid potential race conditions (#11075 ) * Do stuff * Do more stuff * * Do more stuff * * Do more stuff * * working * * cleanup * * more cleanup * * more cleanup * * add license header * * Add unit tests * * add java docs * * add more unit tests * * Cleanup test * * Move removing of workingPath to index task rather than in hadoop job. * * Address review comments * * remove unused import * * Address review comments * Do not overwrite segment descriptor for segment if it already exists. * * add comments to FileSystemHelper class * * fix local hadoop integration test	2021-04-21 12:24:31 -07:00
Yi Yuan	d0a94a8c14	add avro stream input format (#11040 ) * add avro stream input format * bug fixed * add document * doc fix * change doc * add integretion test * bug fixed * bug fixed * add string as binary getter Co-authored-by: yuanyi <yuanyi@freewheel.tv>	2021-04-12 21:53:41 -07:00
Jihoon Son	a6a2758095	More unit tests for JsonParserIterator; Integration tests for query errors (#11091 ) * unit tests for timeout exception in init * integration tests * run integraion test on travis * fix inspection	2021-04-12 15:08:50 -07:00
Jonathan Wei	e7b2ecd0fd	Add retry around query loop in ITWikipediaQueryTest.testQueryLaningLaneIsLimited (#11077 )	2021-04-09 10:54:34 -07:00
Maytas Monsereenusorn	4576152e4a	Make dropExisting flag for Compaction configurable and add warning documentations (#11070 ) * Make dropExisting flag for Compaction configurable * fix checkstyle * fix checkstyle * fix test * add tests * fix spelling * fix docs * add IT * fix test * fix doc * fix doc	2021-04-09 00:12:28 -07:00
Lucas Capistrant	8264203cee	Allow client to configure batch ingestion task to wait to complete until segments are confirmed to be available by other (#10676 ) * Add ability to wait for segment availability for batch jobs * IT updates * fix queries in legacy hadoop IT * Fix broken indexing integration tests * address an lgtm flag * spell checker still flagging for hadoop doc. adding under that file header too * fix compaction IT * Updates to wait for availability method * improve unit testing for patch * fix bad indentation * refactor waitForSegmentAvailability * Fixes based off of review comments * cleanup to get compile after merging with master * fix failing test after previous logic update * add back code that must have gotten deleted during conflict resolution * update some logging code * fixes to get compilation working after merge with master * reset interrupt flag in catch block after code review pointed it out * small changes following self-review * fixup some issues brought on by merge with master * small changes after review * cleanup a little bit after merge with master * Fix potential resource leak in AbstractBatchIndexTask * syntax fix * Add a Compcation TuningConfig type * add docs stipulating the lack of support by Compaction tasks for the new config * Fixup compilation errors after merge with master * Remove erreneous newline	2021-04-08 21:03:00 -07:00
zhangyue19921010	de691808ce	[Bug]Kinesis-data-format IT can not work (#11071 ) * start schema-resgity and replace json template * add docs Co-authored-by: yuezhang <yuezhang@freewheel.tv>	2021-04-08 15:50:04 -07:00
Xavier Léauté	15bdd6bc2f	Fix unit tests and GC settings for Java 15 (#11074 ) * JavaScript script engine support was removed in JDK 15: skip those tests for JDKs without it * Fix flaky HTTP client tests with Java 15 * Switch from CMS to G1GC in integration tests, since CMS is no longer available in JDK 15	2021-04-08 10:33:37 -07:00
Yi Yuan	053af6815d	bug fixed (#11066 ) Co-authored-by: yuanyi <yuanyi@freewheel.tv>	2021-04-06 10:39:06 +08:00
Maytas Monsereenusorn	d7f5293364	Add an option for ingestion task to drop (mark unused) all existing segments that are contained by interval in the ingestionSpec (#11025 ) * Auto-Compaction can run indefinitely when segmentGranularity is changed from coarser to finer. * Add option to drop segments after ingestion * fix checkstyle * add tests * add tests * add tests * fix test * add tests * fix checkstyle * fix checkstyle * add docs * fix docs * address comments * address comments * fix spelling	2021-04-01 12:29:36 -07:00
Lasse Krogh Mammen	782a1d4e6c	Add Calcite Avatica protobuf handler (#10543 )	2021-03-31 12:46:25 -07:00
Himadri Singh	74ae2eb71a	Fix Integration Tests (#11046 )	2021-03-30 01:03:49 +05:30
Clint Wylie	f160548231	maybe fix leadership integration test flakes (#11031 )	2021-03-26 03:43:06 -07:00
Gian Merlino	bf20f9e979	DruidInputSource: Fix issues in column projection, timestamp handling. (#10267 ) * DruidInputSource: Fix issues in column projection, timestamp handling. DruidInputSource, DruidSegmentReader changes: 1) Remove "dimensions" and "metrics". They are not necessary, because we can compute which columns we need to read based on what is going to be used by the timestamp, transform, dimensions, and metrics. 2) Start using ColumnsFilter (see below) to decide which columns we need to read. 3) Actually respect the "timestampSpec". Previously, it was ignored, and the timestamp of the returned InputRows was set to the `__time` column of the input datasource. (1) and (2) together fix a bug in which the DruidInputSource would not properly read columns that are used as inputs to a transformSpec. (3) fixes a bug where the timestampSpec would be ignored if you attempted to set the column to something other than `__time`. (1) and (3) are breaking changes. Web console changes: 1) Remove "Dimensions" and "Metrics" from the Druid input source. 2) Set timestampSpec to `{"column": "__time", "format": "millis"}` for compatibility with the new behavior. Other changes: 1) Add ColumnsFilter, a new class that allows input readers to determine which columns they need to read. Currently, it's only used by the DruidInputSource, but it could be used by other columnar input sources in the future. 2) Add a ColumnsFilter to InputRowSchema. 3) Remove the metric names from InputRowSchema (they were unused). 4) Add InputRowSchemas.fromDataSchema method that computes the proper ColumnsFilter for given timestamp, dimensions, transform, and metrics. 5) Add "getRequiredColumns" method to TransformSpec to support the above. * Various fixups. * Uncomment incorrectly commented lines. * Move TransformSpecTest to the proper module. * Add druid.indexer.task.ignoreTimestampSpecForDruidInputSource setting. * Fix. * Fix build. * Checkstyle. * Misc fixes. * Fix test. * Move config. * Fix imports. * Fixup. * Fix ShuffleResourceTest. * Add import. * Smarter exclusions. * Fixes based on tests. Also, add TIME_COLUMN constant in the web console. * Adjustments for tests. * Reorder test data. * Update docs. * Update docs to say Druid 0.22.0 instead of 0.21.0. * Fix test. * Fix ITAutoCompactionTest. * Changes from review & from merging.	2021-03-25 10:32:21 -07:00
Maytas Monsereenusorn	c87ac0823f	Auto-compaction with segment granularity retrieve incomplete segments from timeline when interval overlap (#11019 ) * Fix Auto-compaction with segment granularity retrieve incomplete segments from timeline when interval overlap * Fix Auto-compaction with segment granularity retrieve incomplete segments from timeline when interval overlap * Fix Auto-compaction with segment granularity retrieve incomplete segments from timeline when interval overlap * Fix Auto-compaction with segment granularity retrieve incomplete segments from timeline when interval overlap * address comments	2021-03-24 11:37:29 -07:00
Jihoon Son	6aec8f0c1b	allow multiple ldap bootstrap files for integration tests (#11023 )	2021-03-23 13:18:36 -07:00
Maytas Monsereenusorn	51d2c61f1c	Auto-compaction with segment granularity should skip segments that already have the configured segmentGranularity (#11009 ) * Auto-compaction with segment granularity should skip segments that already have the configured segmentGranularity * Auto-compaction with segment granularity should skip segments that already have the configured segmentGranularity * Auto-compaction with segment granularity should skip segments that already have the configured segmentGranularity * address comments * address comments * address comments * address comments * address comments	2021-03-19 17:38:28 -07:00
Maytas Monsereenusorn	f19c2e9ce4	If ingested data has sparse columns, the ingested data with forceGuaranteedRollup=true can result in imperfect rollup and final dimension ordering can be different from dimensionSpec ordering in the ingestionSpec (#10948 ) * add IT * add IT * add the fix * fix checkstyle * fix compile * fix compile * fix test * fix test * address comments	2021-03-18 17:04:28 -07:00
Maytas Monsereenusorn	f37713dc6d	Fix auto compaction with mixed versions in the same time chunk based on new segment granularity (#11000 )	2021-03-16 12:48:19 -07:00
Xavier Léauté	68781a0d20	update testing frameworks for Java 15 support (#10984 ) * update jacoco to 0.8.6 * update easymock to 4.2 * update equalsverifier to 3.5.5 * update mockito to 3.8.0 * update powermock to 2.0.9 * update assertj-core to 3.19.0 * update testng to 7.3.0 - fix DTD url security for testng 7.x - fix backwards incompatibility in testng 7.x	2021-03-12 20:18:13 -08:00
Maytas Monsereenusorn	ed91a2bb38	Fix Kinesis should not increment throwAway count on EOS record (#10976 ) * fix Kinesis increament throwAway on EOS record * fix checkstyle * fix IT * fix test * fix IT * fix IT * fix IT * fix IT	2021-03-11 22:04:58 -08:00
Vyatcheslav Mogilevsky	b0432be07a	Apache archive mirror (#10979 ) * Ability to use mirror of archive.apache.org * Ability to use mirror of archive.apache.org: documentation * Ability to use mirror of archive.apache.org: fix int test Dockerfile: missing COPY instruction	2021-03-11 09:07:51 -08:00
frank chen	b79b7e6dfb	Improve exception handling in IT to reduce excessive stack trace messages (#10955 ) * Suppress logging for some exceptions to reduce excessive stack trace messages Signed-off-by: frank chen <frank.chen021@outlook.com> * log message for channel disconnected exception Signed-off-by: frank chen <frank.chen021@outlook.com>	2021-03-10 21:27:55 -08:00
Clint Wylie	96889cdebc	add avro + kafka + schema registry integration test (#10929 ) * add avro + schema registry integration test * style * retry init * maybe this * oops heh * this will fix it * review stuffs * fix comment	2021-03-08 08:12:12 -08:00
zhangyue19921010	bddacbb1c3	Dynamic auto scale Kafka-Stream ingest tasks (#10524 ) * druid task auto scale based on kafka lag * fix kafkaSupervisorIOConfig and KinesisSupervisorIOConfig * druid task auto scale based on kafka lag * fix kafkaSupervisorIOConfig and KinesisSupervisorIOConfig * test dynamic auto scale done * auto scale tasks tested on prd cluster * auto scale tasks tested on prd cluster * modify code style to solve 29055.10 29055.9 29055.17 29055.18 29055.19 29055.20 * rename test fiel function * change codes and add docs based on capistrant reviewed * midify test docs * modify docs * modify docs * modify docs * merge from master * Extract the autoScale logic out of SeekableStreamSupervisor to minimize putting more stuff inside there && Make autoscaling algorithm configurable and scalable. * fix ci failed * revert msic.xml * add uts to test autoscaler create && scale out/in and kafka ingest with scale enable * add more uts * fix inner class check * add IT for kafka ingestion with autoscaler * add new IT in groups=kafka-index named testKafkaIndexDataWithWithAutoscaler * review change * code review * remove unused imports * fix NLP * fix docs and UTs * revert misc.xml * use jackson to build autoScaleConfig with default values * add uts * use jackson to init AutoScalerConfig in IOConfig instead of Map<> * autoscalerConfig interface and provide a defaultAutoScalerConfig * modify uts * modify docs * fix checkstyle * revert misc.xml * modify uts * reviewed code change * reviewed code change * code reviewed * code review * log changed * do StringUtils.encodeForFormat when create allocationExec * code review && limit taskCountMax to partitionNumbers * modify docs * code review Co-authored-by: yuezhang <yuezhang@freewheel.tv>	2021-03-06 14:36:52 +05:30
Maytas Monsereenusorn	23333914c7	add javadoc and test (#10938 )	2021-03-03 11:34:00 +08:00
Maytas Monsereenusorn	b7b0ee8362	Add query granularity to compaction task (#10900 ) * add query granularity to compaction task * fix checkstyle * fix checkstyle * fix test * fix test * add tests * fix test * fix test * cleanup * rename class * fix test * fix test * add test * fix test	2021-03-02 11:23:52 -08:00
zachjsh	553f5c8570	Ldap integration tests (#10901 ) * Add integration tests for ldap extension * * refactor * * add ldap-security integration test to travis * * fix license error * * Fix failing other integration test * * break up large tests * refactor * address review comments * * fix intellij inspections failure * * remove dead code	2021-02-23 13:29:57 -08:00
Agustin Gonzalez	eabad0fb35	Keep query granularity of compacted segments after compaction (#10856 ) * Keep query granularity of compacted segments after compaction * Protect against null isRollup * Fix bugspot check RC_REF_COMPARISON_BAD_PRACTICE_BOOLEAN & edit an existing comment * Make sure that NONE is also included when comparing for the finer granularity * Update integration test check for segment size due to query granularity propagation affecting size * Minor code cleanup * Added functional test to verify queryGranlarity after compaction * Minor style fix * Update unit tests	2021-02-18 01:35:10 -08:00
Maytas Monsereenusorn	6541178c21	Support segmentGranularity for auto-compaction (#10843 ) * Support segmentGranularity for auto-compaction * Support segmentGranularity for auto-compaction * Support segmentGranularity for auto-compaction * Support segmentGranularity for auto-compaction * resolve conflict * Support segmentGranularity for auto-compaction * Support segmentGranularity for auto-compaction * fix tests * fix more tests * fix checkstyle * add unit tests * fix checkstyle * fix checkstyle * fix checkstyle * add unit tests * add integration tests * fix checkstyle * fix checkstyle * fix failing tests * address comments * address comments * fix tests * fix tests * fix test * fix test * fix test * fix test * fix test * fix test * fix test * fix test	2021-02-12 03:03:20 -08:00
zachjsh	64774037c1	Add config option to specify zk version in integration tests (#10870 ) * Update integration-tests README Updated the integration-tests README file to include instructions for setting the `ZK_VERSION` property which is now required to be set prior to executing any integration test. Also added a note about the importance of setting the test group parameter when running integration tests, even when running single tests. * * revert change made to DOCKER_IP doc * * Add default value for zk version * * update travis config to use new zk.version property when running integration tests * Remove doc about needing to set ZK_VERSION variable when running integration tests	2021-02-11 10:31:49 -08:00
Abhishek Agarwal	9526fd38db	Exclude redundant jars from integration-tests build (#10878 ) * Exclude redundant jars from integration-tests build * changes	2021-02-10 23:53:34 -08:00
Jihoon Son	397e7455ba	Increase heap to 64m for custom node (#10846 )	2021-02-03 16:23:19 -08:00
zhangyue19921010	77946f9264	K8s IT Test enhance (#10785 ) * do build and stop action in IT * change base dir from druidHome to druidHome/integration-tests * add env DRUID_HOME * bug fix * modify stop_sh * ready to test * bug fix * modify dir * tested on dev * modify dir * move DRUID_HOME env * done Co-authored-by: yuezhang <yuezhang@freewheel.tv>	2021-02-01 15:48:42 -08:00
Xavier Léauté	c346ce64b1	move integration tests from ZooKeeper 3.4.x to 3.5.x (#10786 ) * move integration tests from ZooKeeper 3.4.x to 3.5.x * run a subset of our integration tests with ZK 3.4 for backwards compatibility testing. * remove need to build separate docker-base image - use multi-stage build for the base image - use openjdk base image instead of building our own JDK base - workaround Debian not including MySQL by using MariaDB - download mysql connector directly instead of using distro version * fix incorrect openssl command failing on Debian * keep mysql connector version in sync with pom version	2021-01-31 08:35:39 -08:00
Maytas Monsereenusorn	a46d561bd7	Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead (#10740 ) * Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead * Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead * Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead * Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead * fix checkstyle * Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead * Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead * fix test * fix test * add log * Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead * address comments * fix checkstyle * fix checkstyle * add config to skip overhead memory calculation * add test for the skipBytesInMemoryOverheadCheck config * add docs * fix checkstyle * fix checkstyle * fix spelling * address comments * fix travis * address comments	2021-01-27 00:34:56 -08:00
Jihoon Son	95065bdf1a	Bump dev version to 0.22.0-SNAPSHOT (#10759 )	2021-01-15 13:16:23 -08:00
Jihoon Son	149306c9db	Tidy up HTTP status codes for query errors (#10746 ) * Tidy up query error codes * fix tests * Restore query exception type in JsonParserIterator * address review comments; add a comment explaining the ugly switch * fix test	2021-01-13 17:20:00 -08:00
zhangyue19921010	d5192640cb	remove extra comma (#10670 ) Co-authored-by: yuezhang <yuezhang@freewheel.tv>	2021-01-08 15:15:08 -08:00
Jonathan Wei	68bb038b31	Multiphase segment merge for IndexMergerV9 (#10689 ) * Multiphase merge for IndexMergerV9 * JSON fix * Cleanup temp files * Docs * Address logging and add IT * Fix spelling and test unloader datasource name	2021-01-05 22:19:09 -08:00
Himanshu	d2e6240cac	k8s-int-test-build: zk-less druid cluster and http based segment/task managment (#10686 ) * zk-less druid cluster in k8s build * attempt to fix build and use http based remote task management * mm/router logs for debugging * add default account k8s role and binding for pod, configMap access * fix issue * change router port to 8088 for common readinessProbe * break build_run_k8s_cluster.sh into separate scripts * revert changes to K8sDruidNodeAnnouncer.java * k8s extension doc update * add license to new file * address review comments * do not try to load lookups at startup to improve cluster startup time	2021-01-05 18:51:47 -08:00
Lucas Capistrant	26b911a384	Make some additions to IT suite to make Hadoop related testing more understandable (#10667 ) * Make some additions to IT suite to make Hadoop related testing more understandable * add start.hadoop.docker to mvn arg tips in doc * fix issues preventing ITIndexHadoopTest from running in local mode	2020-12-28 12:25:47 -06:00
Clint Wylie	74fbdd322d	refactor NodeRole so extensions can participate in disco and announcement (#10700 ) * refactor NodeRole so extensions can participate in disco and announcement * fixes, maybe * retries * javadoc * fix * spelling	2020-12-24 15:29:32 -08:00
Xavier Léauté	b7a16d08a6	Update Apache Kafka to 2.7.0 (#10701 ) - align scala versions to match Kafka	2020-12-22 13:56:00 -08:00
Maytas Monsereenusorn	5bd7924296	Fix kinesis integration test (#10696 ) * fix kinesis IT * fix checkstyle	2020-12-21 12:57:40 -08:00
Clint Wylie	92e5700e1e	fix integration test override config which requires environment variables before calling compose (#10694 )	2020-12-18 17:57:07 -08:00
Maytas Monsereenusorn	6f2ce8f0a5	fix Kinesis It (#10692 )	2020-12-18 13:47:00 -08:00
Clint Wylie	da0eabaa01	integration test for coordinator and overlord leadership client (#10680 ) * integration test for coordinator and overlord leadership, added sys.servers is_leader column * docs * remove not needed * fix comments * fix compile heh * oof * revert unintended * fix tests, split out docker-compose file selection from starting cluster, use docker-compose down to stop cluster * fixes * style * dang * heh * scripts are hard * fix spelling * fix thing that must not matter since was already wrong ip, log when test fails * needs more heap * fix merge * less aggro	2020-12-17 22:50:12 -08:00
zhangyue19921010	1884c35698	Do Integrate test for Druid base on K8s cluster (#10669 ) * add a travls job to do integrate test on K8s * revert build_run_cluster.sh * revert msic * run IT test * ready to test * modify before/after script * done * change mod for script * done * add env DRUID_OPERATOR_VERSION=0.0.3 * change version Co-authored-by: yuezhang <yuezhang@freewheel.tv>	2020-12-16 16:00:42 -08:00
Abhishek Agarwal	7a8e9bb156	Fix hadoop docker copy script (#10671 )	2020-12-14 23:08:50 -08:00
Clint Wylie	64f97e7003	fix DruidSchema incorrectly listing tables with no segments (#10660 ) * fix race condition with DruidSchema tables and dataSourcesNeedingRebuild * rework to see if it passes analysis * more better * maybe this * re-arrange and comments	2020-12-11 14:14:00 -08:00
Gian Merlino	96a387d972	Fixes and tests related to the Indexer process. (#10631 ) * Fixes and tests related to the Indexer process. Three bugs fixed: 1) Indexers would not announce themselves as segment servers if they did not have storage locations defined. This used to work, but was broken in #9971. Fixed this by adding an "isSegmentServer" method to ServerType and updating SegmentLoadDropHandler to always announce if this method returns true. 2) Certain batch task types were written in a way that assumed "isReady" would be called before "run", which is not guaranteed. In particular, they relied on it in order to initialize "taskLockHelper". Fixed this by updating AbstractBatchIndexTask to ensure "isReady" is called before "run" for these tasks. 3) UnifiedIndexerAppenderatorsManager did not properly handle complex datasources. Introduced DataSourceAnalysis in order to fix this. Test changes: 1) Add a new "docker-compose.cli-indexer.yml" config that spins up an Indexer instead of a MiddleManager. 2) Introduce a "USE_INDEXER" environment variable that determines if docker-compose will start up an Indexer or a MiddleManager. 3) Duplicate all the jdk8 tests and run them in both MiddleManager and Indexer mode. 4) Various adjustments to encourage fail-fast errors in the Docker build scripts. 5) Various adjustments to speed up integration tests and reduce memory usage. 6) Add another Mac-specific approach to determining a machine's own IP. This was useful on my development machine. 7) Update segment-count check in ITCompactionTaskTest to eliminate a race condition (it was looking for 6 segments, which only exist together briefly, until the older 4 are marked unused). Javadoc updates: 1) AbstractBatchIndexTask: Added javadocs to determineLockGranularityXXX that make it clear when taskLockHelper will be initialized as a side effect. (Related to the second bug above.) 2) Task: Clarified that "isReady" is not guaranteed to be called before "run". It was already implied, but now it's explicit. 3) ZkCoordinator: Clarified deprecation message. 4) DataSegmentServerAnnouncer: Clarified deprecation message. * Fix stop_cluster script. * Fix sanity check in script. * Fix hashbang lines. * Test and doc adjustments. * Additional tests, and adjustments for tests. * Split ITs back out. * Revert change to druid_coordinator_period_indexingPeriod. * Set Indexer capacity to match MM. * Bump up Historical memory. * Bump down coordinator, overlord memory. * Bump up Broker memory.	2020-12-08 16:02:26 -08:00
Vyatcheslav Mogilevsky	5324785eac	integration tests fix: update base image for hadoop containers to centos 7 (#10638 ) LGTM	2020-12-08 11:00:51 -08:00
Gian Merlino	b681861f05	Speed up integration tests in two ways. (#10648 ) 1) Accelerate coordinator runs to speed up segment load after publishing. 2) For streaming ingestion tests, Instead of waiting 3 minutes for data to load, wait until the expected number of rows is loaded. Also updates segment-count check in ITCompactionTaskTest to eliminate a race condition (it was looking for 6 segments, which only exist together briefly, until the older 4 are marked unused).	2020-12-07 10:59:29 -08:00
Gian Merlino	b7641f644c	Two fixes related to encoding of % symbols. (#10645 ) * Two fixes related to encoding of % symbols. 1) TaskResourceFilter: Don't double-decode task ids. request.getPathSegments() returns already-decoded strings. Applying StringUtils.urlDecode on top of that causes erroneous behavior with '%' characters. 2) Update various ThreadFactoryBuilder name formats to escape '%' characters. This fixes situations where substrings starting with '%' are erroneously treated as format specifiers. ITs are updated to include a '%' in extra.datasource.name.suffix. * Avoid String.replace. * Work around surefire bug. * Fix xml encoding. * Another try at the proper encoding. * Give up on the emojis. * Less ambitious testing. * Fix an additional problem. * Adjust encodeForFormat to return null if the input is null.	2020-12-06 22:35:11 -08:00
Jihoon Son	ae6c43de71	Add an integration test for HTTP inputSource (#10620 )	2020-12-03 15:51:56 -08:00
Suneet Saldanha	cd231d8511	Run integration test queries once (#10564 ) * Run integration test queries once * missed a few	2020-11-09 17:34:27 -08:00
Himanshu	ee136303bb	optionally disable all of hardcoded zookeeper use (#9507 ) * optionally disable all of hardcoded zookeeper use * fix DruidCoordinatorTest compilation * fix test in DruidCoordinatorTest * fix strict compilation Co-authored-by: Himanshu Gupta <fill email>	2020-10-26 22:35:59 -07:00
Maytas Monsereenusorn	1b9a8c4687	Fix compaction integration test CI timeout (#10517 ) * fix flaky IT Compaction test * fix flaky IT Compaction test * test * test * test * test * Fix compaction integration test CI timeout * address comments * test * test * Add print logs * add error msg * add taskId to logging	2020-10-21 22:38:11 -07:00
Maytas Monsereenusorn	3538abd5d0	Make sure all fields in sys.segments are JSON-serialized (#10481 ) * fix JSON format * Change all columns in sys segments to be JSON * Change all columns in sys segments to be JSON * add tests * fix failing tests * fix failing tests	2020-10-14 13:49:46 -07:00
Maytas Monsereenusorn	9056d113d0	Add docs and integration tests for Auto-compaction snapshot status API (#10510 ) * add docs and IT for Auto-compaction snapshot status API * fix spellings * fix test * address comments	2020-10-14 06:42:22 -07:00
Abhishek Agarwal	4d2a92f46a	Add caching support to join queries (#10366 ) * Proposed changes for making joins cacheable * Add unit tests * Fix tests * simplify logic * Pull empty byte array logic out of CachingQueryRunner * remove useless null check * Minor refactor * Fix tests * Fix segment caching on Broker * Move join cache key computation in Broker Move join cache key computation in Broker from ResultLevelCachingQueryRunner to CachingClusteredClient * Fix compilation * Review comments * Add more tests * Fix inspection errors * Pushed condition analysis to JoinableFactory * review comments * Disable join caching for broker and add prefix key to BroadcastSegmentIndexedTable * Remove commented lines * Fix populateCache * Disable caching for selective datasources Refactored the code so that we can decide at the data source level, whether to enable cache for broker or data nodes	2020-10-09 17:42:30 -07:00
Clint Wylie	307c1b0720	adjustments to Kafka integration tests to allow running against Azure Event Hubs streams (#10463 ) * adjustments to kafka integration tests to allow running against azure event hubs in kafka mode * oops * make better * more better	2020-10-05 08:54:29 -07:00
Jonathan Wei	65c0d64676	Update version to 0.21.0-SNAPSHOT (#10450 ) * [maven-release-plugin] prepare release druid-0.21.0 * [maven-release-plugin] prepare for next development iteration * Update web-console versions	2020-10-03 16:08:34 -07:00
Jihoon Son	0cc9eb4903	Store hash partition function in dataSegment and allow segment pruning only when hash partition function is provided (#10288 ) * Store hash partition function in dataSegment and allow segment pruning only when hash partition function is provided * query context * fix tests; add more test * javadoc * docs and more tests * remove default and hadoop tests * consistent name and fix javadoc * spelling and field name * default function for partitionsSpec * other comments * address comments * fix tests and spelling * test * doc	2020-09-24 16:32:56 -07:00
Jonathan Wei	cb30b1fe23	Automatically determine numShards for parallel ingestion hash partitioning (#10419 ) * Automatically determine numShards for parallel ingestion hash partitioning * Fix inspection, tests, coverage * Docs and some PR comments * Adjust locking * Use HllSketch instead of HyperLogLogCollector * Fix tests * Address some PR comments * Fix granularity bug * Small doc fix	2020-09-24 13:47:53 -07:00
Maytas Monsereenusorn	72f1b55f56	Add last_compaction_state to sys.segments table (#10413 ) * Add is_compacted to sys.segments table * change is_compacted to last_compaction_state * fix tests * fix tests * address comments	2020-09-23 15:29:36 -07:00
Atul Mohan	b6ad790dc7	Support combining inputsource for parallel ingestion (#10387 ) * Add combining inputsource * Fix documentation Co-authored-by: Atul Mohan <atulmohan@yahoo-inc.com>	2020-09-15 16:25:35 -07:00
Jihoon Son	8657b23ab2	Integration tests and docs for auto compaction with different partitioning (#10354 ) * Working * add test * doc * fix test * split other integration test * exclude other-index from other tests * doc anchor fix * adjust task slots and number of merge tasks * spell check * reduce maxNumConcurrentSubTasks to 1 * maxNumConcurrentSubtasks for range partitinoing * reduce memory for historical * change group name	2020-09-15 11:28:09 -07:00
Joy Kent	e5f0da30ae	Fix stringFirst/stringLast rollup during ingestion (#10332 ) * Add IndexMergerRollupTest This changelist adds a test to merge indexes with StringFirst/StringLast aggregator. * Fix StringFirstAggregateCombiner/StringLastAggregateCombiner The segment-level type for stringFirst/stringLast is SerializablePairLongString, not String. This changelist fixes it. * Fix EarliestLatestAnySqlAggregator to handle COMPLEX type This changelist allows EarliestLatestAnySqlAggregator to accept COMPLEX type as an operand. For its return type, we set it to VARCHAR, since COMPLEX column is only generated by stringFirst/stringLast during ingestion rollup. * Return value with smaller timestamp in StringFirstAggregatorFactory.combine function * Add integration tests for stringFirst/stringLast during ingestion * Use one EarliestLatestReturnTypeInference instance Co-authored-by: Joy Kent <joy@automonic.ai>	2020-09-08 17:36:04 -07:00


575 Commits