Makes kinesis ingestion resilient to `LimitExceededException` caused by resharding.
Replace `describeStream` with `listShards` (recommended) to get shard related info.
`describeStream` has a limit (100) to the number of shards returned per call and a low default TPS limit of 10.
`listShards` returns the info for at most 1000 shards and has a higher TPS limit of 100 as well.
Key changed/added classes in this PR
* `KinesisRecordSupplier`
* `KinesisAdminClient`
* Use Druid's extension loading for integration test instead of maven
* fix maven command
* override config path
* load input format extensions and kafka by default; add prepopulated-data group
* all docker-composes are overridable
* fix s3 configs
* override config for all
* fix docker_compose_args
* fix security tests
* turn off debug logs for overlord api calls
* clean up stuff
* revert docker-compose.yml
* fix override config for query error test; fix circular dependency in docker compose
* add back some dependencies in docker compose
* new maven profile for integration test
* example file filter
* Add jsonPath functions support
* Add jsonPath function test for Avro
* Add jsonPath function length() to Orc
* Add jsonPath function length() to Parquet
* Add more tests to ORC format
* update doc
* Fix exception during ingestion
* Add IT test case
* Revert "Fix exception during ingestion"
This reverts commit 5a5484b9ea.
* update IT test case
* Add 'keys()'
* Commit IT test case
* Fix UT
* Make nodeRole available during binding; add support for dynamic registration of DruidService
* fix checkstyle and test
* fix customRole test
* address comments
* add more javadoc
* Code cleanup from query profile project
* Fix spelling errors
* Fix Javadoc formatting
* Abstract out repeated test code
* Reuse constants in place of some string literals
* Fix up some parameterized types
* Reduce warnings reported by Eclipse
* Reverted change due to lack of tests
* Use 404 instead of 400
* Use 404 instead of 400
* Add UT test cases
* Add IT testcases
* add UT for task resource filter
Signed-off-by: frank chen <frank.chen021@outlook.com>
* Using org.testing.Assert instead of org.junit.Assert
* Resolve comments and fix test
* Fix test
* Fix tests
* Resolve comments
* add impl
* fix checkstyle
* add test
* add test
* add unit tests
* fix unit tests
* fix unit tests
* fix unit tests
* add IT
* add IT
* add comments
* fix spelling
* Make LocalInputSource.files a List instead of Set and adjust wikipedia_index_task to use file list.
Rationale: the behavior of wikipedia_index_task.json is order-dependent with regard to its input
files; some orders produce 4 segments and some produce 5 segments. Some integration tests, like
ITSystemTableBatchIndexTaskTest and ITAutoCompactionTest, are written assuming that the
4-segment case will always happen. Providing the file list in a specific order ensures that this
will happen as expected by the tests.
I didn't see a specific reason why the LocalInputSource.files parameter needed to be a Set, so
changing it to a List was the simplest way to achieve the consistent ordering. I think it will
also make the behavior make more sense if someone does specify the same input file multiple
times in a spec: I think they'd expect it to be loaded multiple times instead of deduped. This
is consistent with the behavior of other input sources like S3, GCS, HTTP.
* Sort files in LocalFirehoseFactory.
* Use a simple class to sanitize sanitizable errors and log them
The purpose of this is to sanitize JDBC errors, but can sanitize other errors
if they implement SanitizableError Interface
add a class to log errors and sanitize them
added a simple test that tests out that the error gets sanitized
add @NonNull annotation to serverconfig's ErrorResponseTransfromStrategy
* return less information as part of too many connections, and instead only log specific details
This is so an end user gets relevant information but not too much info since they might now how
many brokers they have
* return only runtime exceptions
added new error types that need to be sanitized
also sanitize deprecated and unsupported exceptions.
* dont reqrewite exceptions unless necessary for checked exceptions
add docs
avoid blanket turning all exceptions into runtime exceptions
* address comments, to fix up docs.
add more javadocs
add support UOE sanitization
* use try catch instead and sanitize at public methods
* checkstyle fixes
* throw noSuchStatement and NoSuchConnection as Avatica is affected by those
* address comments. move log error back to druid meta
clean up bad formatting and commented code. add missed catch for NoSuchStatementException
clean up comments for error handler and add comment explainging not wanting to santize avatica exceptions
* alter test to reflect new error message
* revert ColumnAnalysis type, add typeSignature and use it for DruidSchema
* review stuffs
* maybe null
* better maybe null
* Update docs/querying/segmentmetadataquery.md
* Update docs/querying/segmentmetadataquery.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* fix null right
* sad
* oops
* Update batch_hadoop_queries.json
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Add support for hadoop 3 profiles . Most of the details are captured in #11791 .
We use a combination of maven profiles and resource filtering to achieve this. Hadoop2 is supported by default and a new maven profile with the name hadoop3 is created. This will allow the user to choose the profile which is best suited for the use case.
* Revert "Require Datasource WRITE authorization for Supervisor and Task access (#11718)"
This reverts commit f2d6100124.
* Revert "Require DATASOURCE WRITE access in SupervisorResourceFilter and TaskResourceFilter (#11680)"
This reverts commit 6779c4652d.
* Fix docs for the reverted commits
* Fix and restore deleted tests
* Fix and restore SystemSchemaTest
* Increment retry count to add more time for tests to pass
* Re-enable ITHttpInputSourceTest
* Restore original count
* This test is about input source, hash partitioning takes longer and not required thus changing to dynamic
* Further simplify by removing sketches
Follow up PR for #11680
Description
Supervisor and Task APIs are related to ingestion and must always require Datasource WRITE
authorization even if they are purely informative.
Changes
Check Datasource WRITE in SystemSchema for tables "supervisors" and "tasks"
Check Datasource WRITE for APIs /supervisor/history and /supervisor/{id}/history
Check Datasource for all Indexing Task APIs
* Add handoff wait time to ingestion stats report. Refactor some code for batch handoff
* fix checkstyle
* Add assertion to AbstractITBatchIndexTask to make sure report reflects wait for segments happened
* add docs to the task reports section of doc
* initial work
* reduce lock in sqlLifecycle
* Integration test for sql canceling
* javadoc, cleanup, more tests
* log level to debug
* fix test
* checkstyle
* fix flaky test; address comments
* rowTransformer
* cancelled state
* use lock
* explode instead of noop
* oops
* unused import
* less aggressive with state
* fix calcite charset
* don't emit metrics when you are not authorized
* Configurable maxStreamLength for doubles sketches
* fix equals/hashcode and it test failure
* fix test
* fix it test
* benchmark
* doc
* grouping key
* fix comment
* dependency check
* Update docs/development/extensions-core/datasketches-quantiles.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/sql.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/sql.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/sql.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/sql.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/sql.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/sql.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/sql.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Fixes#11297.
Description
Description and design in the proposal #11297
Key changed/added classes in this PR
*DataSegmentPusher
*ShuffleClient
*PartitionStat
*PartitionLocation
*IntermediaryDataManager
This PR fixes an issue with the integration test copy_resources.sh script.
The "install druid jars" portion was removing the $SHARED_DIR/docker directory, which wipes out the $SHARED_DIR/docker/extensions and $SHARED_DIR/docker/credentials directories created just before, which leads to issues later in the script when copying resources to the $SHARED_DIR/docker/credentials/ dir.
* compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded
* compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded
* compaction/status API retains status for datasources that no longer existed causing in-memory used to grow unbounded
* fix test
* fix test
* support using mariadb connector with mysql extensions
* cleanup and more tests
* fix test
* javadocs, more tests, etc
* style and more test
* more test more better
* missing pom
* more pom
This PR refactors the code for QueryRunnerFactory#mergeRunners to accept a new interface called QueryProcessingPool instead of ExecutorService for concurrent execution of query runners. This interface will let custom extensions inject their own implementation for deciding which query-runner to prioritize first. The default implementation is the same as today that takes the priority of query into account. QueryProcessingPool can also be used as a regular executor service. It has a dedicated method for accepting query execution work so implementations can differentiate between regular async tasks and query execution tasks. This dedicated method also passes the QueryRunner object as part of the task information. This hook will let custom extensions carry any state from QuerySegmentWalker to QueryProcessingPool#mergeRunners which is not possible currently.
With this change, Druid will only support ZooKeeper 3.5.x and later.
In order to support Java 15 we need to switch to ZK 3.5.x client libraries and drop support for ZK 3.4.x
(see #10780 for the detailed reasons)
* remove ZooKeeper 3.4.x compatibility
* exclude additional ZK 3.5.x netty dependencies to ensure we use our version
* keep ZooKeeper version used for integration tests in sync with client library version
* remove the need to specify ZK version at runtime for docker
* add support to run integration tests with JDK 15
* build and run unit tests with Java 15 in travis
* Do not stop retrying when an exception is encountered. Save & propagate last exception if retry count is exceeded.
* Add one more log message to help with debugging
* Limit schema registry heap to attempt to control OOMs
* Consolidate the number of Dockerfiles
* add build-arguments to choose which Java base image to use at runtime
* default to building image with Java 11
* base k8s integration test image off of the default image: this ensures
our docker image now gets tested as part of integration tests.
* upgrade maven help plugin to 3.2.0
* Do stuff
* Do more stuff
* * Do more stuff
* * Do more stuff
* * working
* * cleanup
* * more cleanup
* * more cleanup
* * add license header
* * Add unit tests
* * add java docs
* * add more unit tests
* * Cleanup test
* * Move removing of workingPath to index task rather than in hadoop job.
* * Address review comments
* * remove unused import
* * Address review comments
* Do not overwrite segment descriptor for segment if it already exists.
* * add comments to FileSystemHelper class
* * fix local hadoop integration test
* * Fix failing test failures when running with java11
* Revert "Revert "Adjust HadoopIndexTask temp segment renaming to avoid potential race conditions (#11075)" (#11151)"
This reverts commit 49a9c3ffb7.
* * remove JobHelperPowerMockTest
* * remove FileSystemHelper class
* upgrade to Apache Kafka 2.8.0 (release notes:
https://downloads.apache.org/kafka/2.8.0/RELEASE_NOTES.html)
* pass Kafka version as a Docker argument in integration tests
to keep in sync with maven version
* fix use of internal Kafka APIs in integration tests
* Do stuff
* Do more stuff
* * Do more stuff
* * Do more stuff
* * working
* * cleanup
* * more cleanup
* * more cleanup
* * add license header
* * Add unit tests
* * add java docs
* * add more unit tests
* * Cleanup test
* * Move removing of workingPath to index task rather than in hadoop job.
* * Address review comments
* * remove unused import
* * Address review comments
* Do not overwrite segment descriptor for segment if it already exists.
* * add comments to FileSystemHelper class
* * fix local hadoop integration test
* Add ability to wait for segment availability for batch jobs
* IT updates
* fix queries in legacy hadoop IT
* Fix broken indexing integration tests
* address an lgtm flag
* spell checker still flagging for hadoop doc. adding under that file header too
* fix compaction IT
* Updates to wait for availability method
* improve unit testing for patch
* fix bad indentation
* refactor waitForSegmentAvailability
* Fixes based off of review comments
* cleanup to get compile after merging with master
* fix failing test after previous logic update
* add back code that must have gotten deleted during conflict resolution
* update some logging code
* fixes to get compilation working after merge with master
* reset interrupt flag in catch block after code review pointed it out
* small changes following self-review
* fixup some issues brought on by merge with master
* small changes after review
* cleanup a little bit after merge with master
* Fix potential resource leak in AbstractBatchIndexTask
* syntax fix
* Add a Compcation TuningConfig type
* add docs stipulating the lack of support by Compaction tasks for the new config
* Fixup compilation errors after merge with master
* Remove erreneous newline
* JavaScript script engine support was removed in JDK 15: skip those tests for JDKs without it
* Fix flaky HTTP client tests with Java 15
* Switch from CMS to G1GC in integration tests, since CMS is no longer available in JDK 15
* DruidInputSource: Fix issues in column projection, timestamp handling.
DruidInputSource, DruidSegmentReader changes:
1) Remove "dimensions" and "metrics". They are not necessary, because we
can compute which columns we need to read based on what is going to
be used by the timestamp, transform, dimensions, and metrics.
2) Start using ColumnsFilter (see below) to decide which columns we need
to read.
3) Actually respect the "timestampSpec". Previously, it was ignored, and
the timestamp of the returned InputRows was set to the `__time` column
of the input datasource.
(1) and (2) together fix a bug in which the DruidInputSource would not
properly read columns that are used as inputs to a transformSpec.
(3) fixes a bug where the timestampSpec would be ignored if you attempted
to set the column to something other than `__time`.
(1) and (3) are breaking changes.
Web console changes:
1) Remove "Dimensions" and "Metrics" from the Druid input source.
2) Set timestampSpec to `{"column": "__time", "format": "millis"}` for
compatibility with the new behavior.
Other changes:
1) Add ColumnsFilter, a new class that allows input readers to determine
which columns they need to read. Currently, it's only used by the
DruidInputSource, but it could be used by other columnar input sources
in the future.
2) Add a ColumnsFilter to InputRowSchema.
3) Remove the metric names from InputRowSchema (they were unused).
4) Add InputRowSchemas.fromDataSchema method that computes the proper
ColumnsFilter for given timestamp, dimensions, transform, and metrics.
5) Add "getRequiredColumns" method to TransformSpec to support the above.
* Various fixups.
* Uncomment incorrectly commented lines.
* Move TransformSpecTest to the proper module.
* Add druid.indexer.task.ignoreTimestampSpecForDruidInputSource setting.
* Fix.
* Fix build.
* Checkstyle.
* Misc fixes.
* Fix test.
* Move config.
* Fix imports.
* Fixup.
* Fix ShuffleResourceTest.
* Add import.
* Smarter exclusions.
* Fixes based on tests.
Also, add TIME_COLUMN constant in the web console.
* Adjustments for tests.
* Reorder test data.
* Update docs.
* Update docs to say Druid 0.22.0 instead of 0.21.0.
* Fix test.
* Fix ITAutoCompactionTest.
* Changes from review & from merging.
* Fix Auto-compaction with segment granularity retrieve incomplete segments from timeline when interval overlap
* Fix Auto-compaction with segment granularity retrieve incomplete segments from timeline when interval overlap
* Fix Auto-compaction with segment granularity retrieve incomplete segments from timeline when interval overlap
* Fix Auto-compaction with segment granularity retrieve incomplete segments from timeline when interval overlap
* address comments
* Auto-compaction with segment granularity should skip segments that already have the configured segmentGranularity
* Auto-compaction with segment granularity should skip segments that already have the configured segmentGranularity
* Auto-compaction with segment granularity should skip segments that already have the configured segmentGranularity
* address comments
* address comments
* address comments
* address comments
* address comments
* Ability to use mirror of archive.apache.org
* Ability to use mirror of archive.apache.org: documentation
* Ability to use mirror of archive.apache.org: fix int test Dockerfile: missing COPY instruction
* Suppress logging for some exceptions to reduce excessive stack trace messages
Signed-off-by: frank chen <frank.chen021@outlook.com>
* log message for channel disconnected exception
Signed-off-by: frank chen <frank.chen021@outlook.com>
* druid task auto scale based on kafka lag
* fix kafkaSupervisorIOConfig and KinesisSupervisorIOConfig
* druid task auto scale based on kafka lag
* fix kafkaSupervisorIOConfig and KinesisSupervisorIOConfig
* test dynamic auto scale done
* auto scale tasks tested on prd cluster
* auto scale tasks tested on prd cluster
* modify code style to solve 29055.10 29055.9 29055.17 29055.18 29055.19 29055.20
* rename test fiel function
* change codes and add docs based on capistrant reviewed
* midify test docs
* modify docs
* modify docs
* modify docs
* merge from master
* Extract the autoScale logic out of SeekableStreamSupervisor to minimize putting more stuff inside there && Make autoscaling algorithm configurable and scalable.
* fix ci failed
* revert msic.xml
* add uts to test autoscaler create && scale out/in and kafka ingest with scale enable
* add more uts
* fix inner class check
* add IT for kafka ingestion with autoscaler
* add new IT in groups=kafka-index named testKafkaIndexDataWithWithAutoscaler
* review change
* code review
* remove unused imports
* fix NLP
* fix docs and UTs
* revert misc.xml
* use jackson to build autoScaleConfig with default values
* add uts
* use jackson to init AutoScalerConfig in IOConfig instead of Map<>
* autoscalerConfig interface and provide a defaultAutoScalerConfig
* modify uts
* modify docs
* fix checkstyle
* revert misc.xml
* modify uts
* reviewed code change
* reviewed code change
* code reviewed
* code review
* log changed
* do StringUtils.encodeForFormat when create allocationExec
* code review && limit taskCountMax to partitionNumbers
* modify docs
* code review
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
* add query granularity to compaction task
* fix checkstyle
* fix checkstyle
* fix test
* fix test
* add tests
* fix test
* fix test
* cleanup
* rename class
* fix test
* fix test
* add test
* fix test
* Keep query granularity of compacted segments after compaction
* Protect against null isRollup
* Fix bugspot check RC_REF_COMPARISON_BAD_PRACTICE_BOOLEAN & edit an existing comment
* Make sure that NONE is also included when comparing for the finer granularity
* Update integration test check for segment size due to query granularity propagation affecting size
* Minor code cleanup
* Added functional test to verify queryGranlarity after compaction
* Minor style fix
* Update unit tests
* Support segmentGranularity for auto-compaction
* Support segmentGranularity for auto-compaction
* Support segmentGranularity for auto-compaction
* Support segmentGranularity for auto-compaction
* resolve conflict
* Support segmentGranularity for auto-compaction
* Support segmentGranularity for auto-compaction
* fix tests
* fix more tests
* fix checkstyle
* add unit tests
* fix checkstyle
* fix checkstyle
* fix checkstyle
* add unit tests
* add integration tests
* fix checkstyle
* fix checkstyle
* fix failing tests
* address comments
* address comments
* fix tests
* fix tests
* fix test
* fix test
* fix test
* fix test
* fix test
* fix test
* fix test
* fix test
* Update integration-tests README
Updated the integration-tests README file to include instructions
for setting the `ZK_VERSION` property which is now required to be
set prior to executing any integration test. Also added a note
about the importance of setting the test group parameter when
running integration tests, even when running single tests.
* * revert change made to DOCKER_IP doc
* * Add default value for zk version
* * update travis config to use new zk.version property when running
integration tests
* Remove doc about needing to set ZK_VERSION variable when running
integration tests