* fix SQL issue for group by queries with time filter that gets optimized to false
* short circuit always false in CombineAndSimplifyBounds
* adjust
* javadocs
* add preconditions for and/or filters to ensure they have children
* add comments, remove preconditions
* prometheus-emitter
* use existing jetty server to expose prometheus collection endpoint
* unused variables
* better variable names
* removed unused dependencies
* more metric definitions
* reorganize
* use prometheus HTTPServer instead of hooking into Jetty server
* temporary empty help string
* temporary non-empty help. fix incorrect dimension value in JSON (also updated statsd json)
* added full help text. added metric conversion factor for timers that are not using seconds. Correct metric dimension name in documentation
* added documentation for prometheus emitter
* safety for invalid labelNames
* fix travis checks
* Unit test and better sanitization of metrics names and label values
* add precondition to check namespace against regex
* use precompiled regex
* remove static imports. fix metric types
* better docs. fix possible NPE in PrometheusEmitterConfig. Guard against multiple calls to PrometheusEmitter.start()
* Update regex for label-value replacements to allow internal numeric values. Additional tests
* Adds missing license header
updates website/.spelling to add words used in prometheus-emitter docs.
updates docs/operations/metrics.md to correct the spelling of
bufferPoolName
* fixes version in extensions-contrib/prometheus-emitter
* fix style guide errors
* update import ordering
* add another word to website/.spelling
* remove unthrown declared exception
* remove unused import
* Pushgateway strategy for metrics
* typo
* Format fix and nullable strategy
* Update pom file for prometheus-emitter
* code review comments. Counter to gauge for cache metrics, periodical task to pushGateway
* Syntax fix
* Dimension label regex include numeric character back, fix previous commit
* bump prometheus-emitter pom dev version
* Remove scheduled task inside poen that push metrics
* Fix checkstyle
* Unit test coverage
* Unit test coverage
* Spelling
* Doc fix
* spelling
Co-authored-by: Michael Schiff <michael.schiff@tubemogul.com>
Co-authored-by: Michael Schiff <schiff.michael@gmail.com>
Co-authored-by: Tianxin Zhao <tianxin.zhao@tubemogul.com>
Co-authored-by: Tianxin Zhao <tizhao@adobe.com>
* Allow only HTTP and HTTPS protocols for the HTTP inputSource
* rename
* Update core/src/main/java/org/apache/druid/data/input/impl/HttpInputSource.java
Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
* fix http firehose and update doc
* HDFS inputSource
* add configs for allowed protocols
* fix checkstyle and doc
* more checkstyle
* remove stale doc
* remove more doc
* Apply doc suggestions from code review
Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
* update hdfs address in docs
* fix test
Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
* druid task auto scale based on kafka lag
* fix kafkaSupervisorIOConfig and KinesisSupervisorIOConfig
* druid task auto scale based on kafka lag
* fix kafkaSupervisorIOConfig and KinesisSupervisorIOConfig
* test dynamic auto scale done
* auto scale tasks tested on prd cluster
* auto scale tasks tested on prd cluster
* modify code style to solve 29055.10 29055.9 29055.17 29055.18 29055.19 29055.20
* rename test fiel function
* change codes and add docs based on capistrant reviewed
* midify test docs
* modify docs
* modify docs
* modify docs
* merge from master
* Extract the autoScale logic out of SeekableStreamSupervisor to minimize putting more stuff inside there && Make autoscaling algorithm configurable and scalable.
* fix ci failed
* revert msic.xml
* add uts to test autoscaler create && scale out/in and kafka ingest with scale enable
* add more uts
* fix inner class check
* add IT for kafka ingestion with autoscaler
* add new IT in groups=kafka-index named testKafkaIndexDataWithWithAutoscaler
* review change
* code review
* remove unused imports
* fix NLP
* fix docs and UTs
* revert misc.xml
* use jackson to build autoScaleConfig with default values
* add uts
* use jackson to init AutoScalerConfig in IOConfig instead of Map<>
* autoscalerConfig interface and provide a defaultAutoScalerConfig
* modify uts
* modify docs
* fix checkstyle
* revert misc.xml
* modify uts
* reviewed code change
* reviewed code change
* code reviewed
* code review
* log changed
* do StringUtils.encodeForFormat when create allocationExec
* code review && limit taskCountMax to partitionNumbers
* modify docs
* code review
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
* where filter left first draft
* Revert changes in calcite test
* Refactor a bit
* Fixing the Tests
* Changes
* Adding tests
* Add tests for correlated queries
* Add comment
* Fix typos
* Clarify docs
* Apply suggestions from code review
Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
* Fix runtime error when IndexedTableJoinMatcher matches long selector to unique string index.
The issue arises when matching against a long selector on the left-hand side to a string
typed Index on the right-hand side, and when that Index also returns true from areKeysUnique.
In this case, IndexedTableJoinMatcher would generate a ConditionMatcher that implements
matchSingleRow by calling findUniqueLong on the Index. This is inappropriate because the Index
is actually string typed. The fix is to check the type of the Index before deciding how to
implement the ConditionMatcher.
The patch adds "testMatchSingleRowToUniqueStringIndex" to IndexedTableJoinMatcherTest, which
explores this case.
* Update tests.
* add query granularity to compaction task
* fix checkstyle
* fix checkstyle
* fix test
* fix test
* add tests
* fix test
* fix test
* cleanup
* rename class
* fix test
* fix test
* add test
* fix test
* Add config and header support for confluent schema registry. (porting code from https://github.com/apache/druid/pull/9096)
* Add Eclipse Public License 2.0 to license check
* Update licenses.yaml, revert changes to check-licenses.py and dependencies for integration-tests
* Add spelling exception and remove unused dependency
* Use non-deprecated getSchemaById() and remove duplicated license entry
* Update docs/ingestion/data-formats.md
Co-authored-by: Clint Wylie <cjwylie@gmail.com>
* Added check for schema being null, as per Confluent code
* Missing imports and whitespace
* Updated unit tests with AvroSchema
Co-authored-by: Sergio Spinatelli <sergio.spinatelli.extern@7-tv.de>
Co-authored-by: Sergio Spinatelli <sergio.spinatelli.extern@joyn.de>
Co-authored-by: Clint Wylie <cjwylie@gmail.com>
* clarify security requirements around HTTPInputSource
* explicitly mention write/datasource in best practices. clarify that the ingestion task is the risk
* Update docs/operations/security-overview.md
Co-authored-by: Suneet Saldanha <suneet@apache.org>
Co-authored-by: Suneet Saldanha <suneet@apache.org>
* use the latest Apache DataSketches release 2.0.0
* updated datasketches version
Co-authored-by: AlexanderSaydakov <AlexanderSaydakov@users.noreply.github.com>
* Granularity: Introduce primitive-typed bucketStart, increment methods.
Saves creation of unnecessary DateTime objects in timestamp_floor and
timestamp_ceil expressions.
* Fix style.
* Amp up the test coverage.
* Improve Druid ldap auth documentation
Improved the ldap auth docs by clarifying that the object classes and
attributes noted are specific to Microsoft Active Directory, and could
be different depending on the specific ldap server being used. Also
emphasized the importance of the memberOf field and noted that the
step about adding users to roles is only needed in certain circumstances.
* * add another note
* Apply suggestions from code review
Co-authored-by: sthetland <steve.hetland@imply.io>
* * simplify
* * Address review comments
Co-authored-by: sthetland <steve.hetland@imply.io>
* add druid jdbc handler config for minimum number of rows per frame
* javadocs and docs adjustments
* spelling
* adjust docs per review with minor tweaks
* adjust more
They all use Long.compare, but they don't need to. Changing to
regular comparisons simplifies the code and also removes branches.
(Internally, Long.compare has two branches.)
* Keep query granularity of compacted segments after compaction
* Protect against null isRollup
* Fix bugspot check RC_REF_COMPARISON_BAD_PRACTICE_BOOLEAN & edit an existing comment
* Make sure that NONE is also included when comparing for the finer granularity
* Update integration test check for segment size due to query granularity propagation affecting size
* Minor code cleanup
* Added functional test to verify queryGranlarity after compaction
* Minor style fix
* Update unit tests
* Segment timeline doesn't show results older than 3 months
* Adoption testing patch for web segment timeline view and also refactoring default time config
* Changing git-commit-id-plugin to use native git, shaving off 15% off build time
Co-authored-by: dev <dev@dev.minitoken.com>
* Support segmentGranularity for auto-compaction
* Support segmentGranularity for auto-compaction
* Support segmentGranularity for auto-compaction
* Support segmentGranularity for auto-compaction
* resolve conflict
* Support segmentGranularity for auto-compaction
* Support segmentGranularity for auto-compaction
* fix tests
* fix more tests
* fix checkstyle
* add unit tests
* fix checkstyle
* fix checkstyle
* fix checkstyle
* add unit tests
* add integration tests
* fix checkstyle
* fix checkstyle
* fix failing tests
* address comments
* address comments
* fix tests
* fix tests
* fix test
* fix test
* fix test
* fix test
* fix test
* fix test
* fix test
* fix test
* Update integration-tests README
Updated the integration-tests README file to include instructions
for setting the `ZK_VERSION` property which is now required to be
set prior to executing any integration test. Also added a note
about the importance of setting the test group parameter when
running integration tests, even when running single tests.
* * revert change made to DOCKER_IP doc
* * Add default value for zk version
* * update travis config to use new zk.version property when running
integration tests
* Remove doc about needing to set ZK_VERSION variable when running
integration tests