* lay the groundwork for throttling replicant loads per RunRules execution
* Add dynamic coordinator config to control new replicant threshold.
* remove redundant line
* add some unit tests
* fix checkstyle error
* add documentation for new dynamic config
* improve docs and logs
* Alter how null is handled for new config. If null, manually set as default
* Add feature to automatically remove rules based on retention period
* address comments
* DruidInputSource: Fix issues in column projection, timestamp handling.
DruidInputSource, DruidSegmentReader changes:
1) Remove "dimensions" and "metrics". They are not necessary, because we
can compute which columns we need to read based on what is going to
be used by the timestamp, transform, dimensions, and metrics.
2) Start using ColumnsFilter (see below) to decide which columns we need
to read.
3) Actually respect the "timestampSpec". Previously, it was ignored, and
the timestamp of the returned InputRows was set to the `__time` column
of the input datasource.
(1) and (2) together fix a bug in which the DruidInputSource would not
properly read columns that are used as inputs to a transformSpec.
(3) fixes a bug where the timestampSpec would be ignored if you attempted
to set the column to something other than `__time`.
(1) and (3) are breaking changes.
Web console changes:
1) Remove "Dimensions" and "Metrics" from the Druid input source.
2) Set timestampSpec to `{"column": "__time", "format": "millis"}` for
compatibility with the new behavior.
Other changes:
1) Add ColumnsFilter, a new class that allows input readers to determine
which columns they need to read. Currently, it's only used by the
DruidInputSource, but it could be used by other columnar input sources
in the future.
2) Add a ColumnsFilter to InputRowSchema.
3) Remove the metric names from InputRowSchema (they were unused).
4) Add InputRowSchemas.fromDataSchema method that computes the proper
ColumnsFilter for given timestamp, dimensions, transform, and metrics.
5) Add "getRequiredColumns" method to TransformSpec to support the above.
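A minimal sketch of the column-selection idea described above: the set of columns to read is the union of whatever the timestamp, dimensions, transforms, and metrics use. This is not the actual ColumnsFilter or InputRowSchemas code. The class name, method name, and parameter shapes below are hypothetical and only illustrate the computation.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not Druid's ColumnsFilter implementation.
final class RequiredColumnsSketch {

    // Returns every column the reader must load: the timestamp column plus
    // all inputs to dimensions, transforms, and metrics (duplicates removed).
    static Set<String> requiredColumns(
            String timestampColumn,
            List<String> dimensions,
            List<String> transformInputs,
            List<String> metricInputs) {
        Set<String> columns = new LinkedHashSet<>();
        columns.add(timestampColumn);
        columns.addAll(dimensions);
        columns.addAll(transformInputs);
        columns.addAll(metricInputs);
        return columns;
    }

    public static void main(String[] args) {
        Set<String> columns = requiredColumns(
                "__time",
                Arrays.asList("country"),
                Arrays.asList("bytesIn", "bytesOut"),
                Arrays.asList("bytesIn"));
        // A column used only as a transform input (bytesOut) is still read.
        System.out.println(columns);
    }
}
```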
* Various fixups.
* Uncomment incorrectly commented lines.
* Move TransformSpecTest to the proper module.
* Add druid.indexer.task.ignoreTimestampSpecForDruidInputSource setting.
* Fix.
* Fix build.
* Checkstyle.
* Misc fixes.
* Fix test.
* Move config.
* Fix imports.
* Fixup.
* Fix ShuffleResourceTest.
* Add import.
* Smarter exclusions.
* Fixes based on tests.
Also, add TIME_COLUMN constant in the web console.
* Adjustments for tests.
* Reorder test data.
* Update docs.
* Update docs to say Druid 0.22.0 instead of 0.21.0.
* Fix test.
* Fix ITAutoCompactionTest.
* Changes from review & from merging.
* first pass compaction refactor. includes updated behavior for queryGranularity. removes duplicated doc
* fix links, typos, some reorganization
* fix spelling. TBD still there for work in progress
* updates tutorial examples, adds more clarification around compaction use cases
* add granularity spec to automatic compaction config
* final edits
* spelling fixes
* apply suggestions from review
* updates from review
* last edits
* move note
* clarify null
* fix links & spelling
* latest review
* edits to auto-compaction config
* add back rollup
* fix links & spelling
* Update compaction.md
add granularityspec to example
* Allow only HTTP and HTTPS protocols for the HTTP inputSource
* rename
* Update core/src/main/java/org/apache/druid/data/input/impl/HttpInputSource.java
Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
* fix http firehose and update doc
* HDFS inputSource
* add configs for allowed protocols
* fix checkstyle and doc
* more checkstyle
* remove stale doc
* remove more doc
* Apply doc suggestions from code review
Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
* update hdfs address in docs
* fix test
Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
* add druid jdbc handler config for minimum number of rows per frame
* javadocs and docs adjustments
* spelling
* adjust docs per review with minor tweaks
* adjust more
* dynamic coord config adding more balancing control
add new dynamic coordinator config, maxSegmentsToConsiderPerMove. This
config caps the number of segments that are iterated over when selecting
a segment to move. The default value combined with current balancing
strategies will still iterate over all provided segments. However,
setting this value to something > 0 will cap the number of segments
visited. This could make sense in cases where a cluster has a very large
number of segments and the admins prefer fewer iterations vs a thorough
consideration of all segments provided.
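A rough sketch of the capping behaviour described above, assuming a size-1 reservoir sample over the candidate segments. It is not the ReservoirSegmentSampler code. The class, method, and parameter names are hypothetical, and treating a value <= 0 as "consider everything" is an assumption based on the description of the default.

```java
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not Druid's ReservoirSegmentSampler.
final class CappedReservoirSketch {

    // Picks one element uniformly at random from the elements visited.
    // If maxToConsider > 0, iteration stops after that many elements;
    // otherwise (assumed default) every element is visited.
    static <T> T pickOne(List<T> segments, int maxToConsider, Random random) {
        T chosen = null;
        int seen = 0;
        for (T segment : segments) {
            if (maxToConsider > 0 && seen >= maxToConsider) {
                break; // cap reached: stop iterating
            }
            seen++;
            // Size-1 reservoir: replace the current pick with probability 1/seen.
            if (random.nextInt(seen) == 0) {
                chosen = segment;
            }
        }
        return chosen; // null when the list is empty
    }
}
```

The trade-off is the one stated above: a cap shortens each selection pass, but segments past the cap are never candidates for that move.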
* fix checkstyle failure
* Make doc more detailed for admin to understand when/why to use new config
* refactor PR to use a % of segments instead of raw number
* update the docs
* remove bad doc line
* fix typo in name of new dynamic config
* update ReservoirSegmentSampler to gracefully deal with values > 100%
* add handler for <= 0 in ReservoirSegmentSampler
* fixup CoordinatorDynamicConfigTest naming and argument ordering
* fix items in docs after spellcheck flags
* Fix lgtm flag on missing space in string literal
* improve documentation for new config
* Add default value to config docs and add advice in cluster tuning doc
* Add percentOfSegmentsToConsiderPerMove to web console coord config dialog
* update jest snapshot after console change
* fix spell checker errors
* Improve debug logging in getRandomSegmentBalancerHolder to cover all bad inputs for % of segments to consider
* add new config back to web console module after merge with master
* fix ReservoirSegmentSamplerTest
* fix line breaks in coordinator console dialog
* Add a test that helps ensure no regressions for percentOfSegmentsToConsiderPerMove
* Make improvements based off of feedback in review
* additional cleanup coming from review
* Add a warning log if limit on segments to consider for move can't be calculated
* remove unused import
* fix tests for CoordinatorDynamicConfig
* remove precondition test that is redundant in CoordinatorDynamicConfig Builder class
* fix to allow custom storage location selector strategy
* add test cases to check instance of selector strategy
* update doc
* code format
* resolve code review comments
* inject StorageLocation
* fix CI
* fix mismatched license item reported by CI
* change property path from druid.segmentCache.locationSelectorStrategy.type to druid.segmentCache.locationSelector.strategy
* using a helper method to bind to correct property path
* Adding more dimensions to the audit log entry
* Making adding payload in audit metric optional
* Changing the name of the parameter to includePayloadAsDimensionInMetric. Adding a unit test
* Fixing the intellij code introspection issues
* Working
* add test
* doc
* fix test
* split other integration test
* exclude other-index from other tests
* doc anchor fix
* adjust task slots and number of merge tasks
* spell check
* reduce maxNumConcurrentSubTasks to 1
* maxNumConcurrentSubtasks for range partitioning
* reduce memory for historical
* change group name
* support unit suffix on byte-related properties
* add doc
* change default value of byte-related properties in example files
* fix coding style
* fix doc
* fix CI
* suppress spelling errors
* improve code according to comments
* rename Bytes to HumanReadableBytes
* add getBytesInInt to get value safely
* improve doc
* fix problem reported by CI
* fix problem reported by CI
* resolve code review comments
* improve error message
* improve code & doc according to comments
* fix CI problem
* improve doc
* suppress spelling check errors
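A small sketch of what unit suffixes on byte-related properties could look like, together with an overflow-checked int conversion in the spirit of getBytesInInt. This is not the HumanReadableBytes implementation. The class name, method names, and the suffix table (K/M/G as powers of 1000, KiB/MiB/GiB as powers of 1024) are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not Druid's HumanReadableBytes.
final class ByteSizeSketch {

    // Assumed suffix table; the real set of supported units may differ.
    private static final Map<String, Long> UNITS = new LinkedHashMap<>();
    static {
        UNITS.put("KiB", 1024L);
        UNITS.put("MiB", 1024L * 1024);
        UNITS.put("GiB", 1024L * 1024 * 1024);
        UNITS.put("K", 1_000L);
        UNITS.put("M", 1_000_000L);
        UNITS.put("G", 1_000_000_000L);
    }

    // Parses values such as "512", "1KiB", or "2M" into a byte count.
    // Throws NumberFormatException on malformed input and
    // ArithmeticException if the result overflows a long.
    static long parse(String text) {
        String s = text.trim();
        for (Map.Entry<String, Long> unit : UNITS.entrySet()) {
            if (s.endsWith(unit.getKey())) {
                String number = s.substring(0, s.length() - unit.getKey().length()).trim();
                return Math.multiplyExact(Long.parseLong(number), unit.getValue());
            }
        }
        return Long.parseLong(s); // no suffix: plain byte count
    }

    // Fails loudly instead of silently truncating values that do not fit in an int.
    static int parseAsInt(String text) {
        return Math.toIntExact(parse(text));
    }
}
```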
* Filter http requests by http method
Add a config that allows a user to specify which HTTP methods to allow
against their Druid server.
Druid will only accept http requests with the method: GET, PUT, POST, DELETE
and OPTIONS.
If a Druid admin wants to allow other methods, they can do so by using the
ServerConfig#allowedHttpMethods config.
If a Druid user would like to disallow OPTIONS, this can be done by changing
the AuthConfig#allowUnauthenticatedHttpOptions config
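A simplified sketch of the method filtering described above, written against plain Java rather than the servlet API. It is not the filter added in this change. The class and its constructor arguments are hypothetical: `extraAllowed` stands in for ServerConfig#allowedHttpMethods, and the `allowOptions` flag is a rough stand-in for AuthConfig#allowUnauthenticatedHttpOptions.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not Druid's actual request filter.
final class HttpMethodFilterSketch {

    // Methods accepted regardless of configuration.
    private static final Set<String> ALWAYS_ALLOWED =
            new HashSet<>(Arrays.asList("GET", "PUT", "POST", "DELETE"));

    private final Set<String> allowed = new HashSet<>(ALWAYS_ALLOWED);

    HttpMethodFilterSketch(Collection<String> extraAllowed, boolean allowOptions) {
        allowed.addAll(extraAllowed); // admin-configured additions
        if (allowOptions) {
            allowed.add("OPTIONS");
        }
    }

    // HTTP method names are case-sensitive, so this is an exact match.
    boolean isAllowed(String method) {
        return allowed.contains(method);
    }
}
```

A server would typically reject a request whose method fails this check before any further handling.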
* Exclude OPTIONS from always supported HTTP methods
Add HEAD as an allowed method for web console e2e tests
* fix docs
* fix security IT
* Actually fix the web console e2e tests
* Ignore code coverage for initialization classes
* code review
* change default number of segment loading threads
* fix docs
* missed file
* min -> max for segment loading threads
Co-authored-by: Dylan <dwylie@spotx.tv>