* expression filter support for vectorized query engines
* remove unused codes
* more tests
* refactor, more tests
* suppress
* more
* more
* more
* oops, i was wrong
* comment
* remove decorate, object dimension selector, more javadocs
* style
* logs more info when delete segments && add deleteSegments-related UT
* revert msic.xml
* code review
* use log.debugSegments
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
* Ability to use mirror of archive.apache.org
* Ability to use mirror of archive.apache.org: documentation
* Ability to use mirror of archive.apache.org: fix int test Dockerfile: missing COPY instruction
* Suppress logging for some exceptions to reduce excessive stack trace messages
Signed-off-by: frank chen <frank.chen021@outlook.com>
* log message for channel disconnected exception
Signed-off-by: frank chen <frank.chen021@outlook.com>
* license.yaml fixes for code introduced related to AWS RDS token based password provider in PR #9518
* add notice for commons-dbcp in license file
* add version and update NOTICE file
* fix SQL issue for group by queries with time filter that gets optimized to false
* short circuit always false in CombineAndSimplifyBounds
* adjust
* javadocs
* add preconditions for and/or filters to ensure they have children
* add comments, remove preconditions
* prometheus-emitter
* use existing jetty server to expose prometheus collection endpoint
* unused variables
* better variable names
* removed unused dependencies
* more metric definitions
* reorganize
* use prometheus HTTPServer instead of hooking into Jetty server
* temporary empty help string
* temporary non-empty help. fix incorrect dimension value in JSON (also updated statsd json)
* added full help text. added metric conversion factor for timers that are not using seconds. Correct metric dimension name in documentation
* added documentation for prometheus emitter
* safety for invalid labelNames
* fix travis checks
* Unit test and better sanitization of metrics names and label values
* add precondition to check namespace against regex
* use precompiled regex
* remove static imports. fix metric types
* better docs. fix possible NPE in PrometheusEmitterConfig. Guard against multiple calls to PrometheusEmitter.start()
* Update regex for label-value replacements to allow internal numeric values. Additional tests
* Adds missing license header
updates website/.spelling to add words used in prometheus-emitter docs.
updates docs/operations/metrics.md to correct the spelling of
bufferPoolName
* fixes version in extensions-contrib/prometheus-emitter
* fix style guide errors
* update import ordering
* add another word to website/.spelling
* remove unthrown declared exception
* remove unused import
* Pushgateway strategy for metrics
* typo
* Format fix and nullable strategy
* Update pom file for prometheus-emitter
* code review comments. Counter to gauge for cache metrics, periodical task to pushGateway
* Syntax fix
* Dimension label regex include numeric character back, fix previous commit
* bump prometheus-emitter pom dev version
* Remove scheduled task inside poen that push metrics
* Fix checkstyle
* Unit test coverage
* Unit test coverage
* Spelling
* Doc fix
* spelling
Co-authored-by: Michael Schiff <michael.schiff@tubemogul.com>
Co-authored-by: Michael Schiff <schiff.michael@gmail.com>
Co-authored-by: Tianxin Zhao <tianxin.zhao@tubemogul.com>
Co-authored-by: Tianxin Zhao <tizhao@adobe.com>
* Allow only HTTP and HTTPS protocols for the HTTP inputSource
* rename
* Update core/src/main/java/org/apache/druid/data/input/impl/HttpInputSource.java
Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
* fix http firehose and update doc
* HDFS inputSource
* add configs for allowed protocols
* fix checkstyle and doc
* more checkstyle
* remove stale doc
* remove more doc
* Apply doc suggestions from code review
Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
* update hdfs address in docs
* fix test
Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
* druid task auto scale based on kafka lag
* fix kafkaSupervisorIOConfig and KinesisSupervisorIOConfig
* druid task auto scale based on kafka lag
* fix kafkaSupervisorIOConfig and KinesisSupervisorIOConfig
* test dynamic auto scale done
* auto scale tasks tested on prd cluster
* auto scale tasks tested on prd cluster
* modify code style to solve 29055.10 29055.9 29055.17 29055.18 29055.19 29055.20
* rename test fiel function
* change codes and add docs based on capistrant reviewed
* midify test docs
* modify docs
* modify docs
* modify docs
* merge from master
* Extract the autoScale logic out of SeekableStreamSupervisor to minimize putting more stuff inside there && Make autoscaling algorithm configurable and scalable.
* fix ci failed
* revert msic.xml
* add uts to test autoscaler create && scale out/in and kafka ingest with scale enable
* add more uts
* fix inner class check
* add IT for kafka ingestion with autoscaler
* add new IT in groups=kafka-index named testKafkaIndexDataWithWithAutoscaler
* review change
* code review
* remove unused imports
* fix NLP
* fix docs and UTs
* revert misc.xml
* use jackson to build autoScaleConfig with default values
* add uts
* use jackson to init AutoScalerConfig in IOConfig instead of Map<>
* autoscalerConfig interface and provide a defaultAutoScalerConfig
* modify uts
* modify docs
* fix checkstyle
* revert misc.xml
* modify uts
* reviewed code change
* reviewed code change
* code reviewed
* code review
* log changed
* do StringUtils.encodeForFormat when create allocationExec
* code review && limit taskCountMax to partitionNumbers
* modify docs
* code review
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
* where filter left first draft
* Revert changes in calcite test
* Refactor a bit
* Fixing the Tests
* Changes
* Adding tests
* Add tests for correlated queries
* Add comment
* Fix typos
* Clarify docs
* Apply suggestions from code review
Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
* Fix runtime error when IndexedTableJoinMatcher matches long selector to unique string index.
The issue arises when matching against a long selector on the left-hand side to a string
typed Index on the right-hand side, and when that Index also returns true from areKeysUnique.
In this case, IndexedTableJoinMatcher would generate a ConditionMatcher that implements
matchSingleRow by calling findUniqueLong on the Index. This is inappropriate because the Index
is actually string typed. The fix is to check the type of the Index before deciding how to
implement the ConditionMatcher.
The patch adds "testMatchSingleRowToUniqueStringIndex" to IndexedTableJoinMatcherTest, which
explores this case.
* Update tests.
* add query granularity to compaction task
* fix checkstyle
* fix checkstyle
* fix test
* fix test
* add tests
* fix test
* fix test
* cleanup
* rename class
* fix test
* fix test
* add test
* fix test
* Add config and header support for confluent schema registry. (porting code from https://github.com/apache/druid/pull/9096)
* Add Eclipse Public License 2.0 to license check
* Update licenses.yaml, revert changes to check-licenses.py and dependencies for integration-tests
* Add spelling exception and remove unused dependency
* Use non-deprecated getSchemaById() and remove duplicated license entry
* Update docs/ingestion/data-formats.md
Co-authored-by: Clint Wylie <cjwylie@gmail.com>
* Added check for schema being null, as per Confluent code
* Missing imports and whitespace
* Updated unit tests with AvroSchema
Co-authored-by: Sergio Spinatelli <sergio.spinatelli.extern@7-tv.de>
Co-authored-by: Sergio Spinatelli <sergio.spinatelli.extern@joyn.de>
Co-authored-by: Clint Wylie <cjwylie@gmail.com>
* clarify security requirements around HTTPInputSource
* explicitly mention write/datasource in best practices. clarify that the ingestion task is the risk
* Update docs/operations/security-overview.md
Co-authored-by: Suneet Saldanha <suneet@apache.org>
Co-authored-by: Suneet Saldanha <suneet@apache.org>
* use the latest Apache DataSketches release 2.0.0
* updated datasketches version
Co-authored-by: AlexanderSaydakov <AlexanderSaydakov@users.noreply.github.com>
* Granularity: Introduce primitive-typed bucketStart, increment methods.
Saves creation of unnecessary DateTime objects in timestamp_floor and
timestamp_ceil expressions.
* Fix style.
* Amp up the test coverage.
* Improve Druid ldap auth documentation
Improved the ldap auth docs by clarifying that the object classes and
attributes noted are specific to Microsoft Active Directory, and could
be different depending on the specific ldap server being used. Also
emphasized the importance of the memberOf field and noted that the
step about adding users to roles is only needed in certain circumstances.
* * add another note
* Apply suggestions from code review
Co-authored-by: sthetland <steve.hetland@imply.io>
* * simplify
* * Address review comments
Co-authored-by: sthetland <steve.hetland@imply.io>
* add druid jdbc handler config for minimum number of rows per frame
* javadocs and docs adjustments
* spelling
* adjust docs per review with minor tweaks
* adjust more