* Vectorize LongDeserializers.
Also, add many more tests.
* more faster
* more more faster
* more cleanup
* fixes
* forbidden
* benchmark style
* idk why
* adjust
* add preconditions for value >= 0 for writers
* add 64 bit exception
Co-authored-by: Gian Merlino <gian@imply.io>
* DruidInputSource: Fix issues in column projection, timestamp handling.
DruidInputSource, DruidSegmentReader changes:
1) Remove "dimensions" and "metrics". They are not necessary, because we
can compute which columns we need to read based on what is going to
be used by the timestamp, transform, dimensions, and metrics.
2) Start using ColumnsFilter (see below) to decide which columns we need
to read.
3) Actually respect the "timestampSpec". Previously, it was ignored, and
the timestamp of the returned InputRows was set to the `__time` column
of the input datasource.
(1) and (2) together fix a bug in which the DruidInputSource would not
properly read columns that are used as inputs to a transformSpec.
(3) fixes a bug where the timestampSpec would be ignored if you attempted
to set the column to something other than `__time`.
(1) and (3) are breaking changes.
Web console changes:
1) Remove "Dimensions" and "Metrics" from the Druid input source.
2) Set timestampSpec to `{"column": "__time", "format": "millis"}` for
compatibility with the new behavior.
Other changes:
1) Add ColumnsFilter, a new class that allows input readers to determine
which columns they need to read. Currently, it's only used by the
DruidInputSource, but it could be used by other columnar input sources
in the future.
2) Add a ColumnsFilter to InputRowSchema.
3) Remove the metric names from InputRowSchema (they were unused).
4) Add InputRowSchemas.fromDataSchema method that computes the proper
ColumnsFilter for given timestamp, dimensions, transform, and metrics.
5) Add "getRequiredColumns" method to TransformSpec to support the above.
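A minimal sketch of the idea behind ColumnsFilter, assuming an inclusion-only filter built from the timestamp column, dimensions, metric inputs, and transform inputs. The class name `ColumnsFilterSketch` and the method `fromSchema` are hypothetical and exist only for illustration; this is not the actual Druid class or the InputRowSchemas.fromDataSchema implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the column-filter idea; not Druid's ColumnsFilter.
final class ColumnsFilterSketch
{
  private final Set<String> required;

  private ColumnsFilterSketch(Set<String> required)
  {
    this.required = required;
  }

  // Collect every column the ingestion spec touches: the timestamp column,
  // the declared dimensions, the metric input fields, and the transform inputs.
  static ColumnsFilterSketch fromSchema(
      String timestampColumn,
      List<String> dimensions,
      List<String> metricInputs,
      List<String> transformInputs
  )
  {
    Set<String> columns = new LinkedHashSet<>();
    columns.add(timestampColumn);
    columns.addAll(dimensions);
    columns.addAll(metricInputs);
    columns.addAll(transformInputs);
    return new ColumnsFilterSketch(columns);
  }

  // Returns true if a reader should load this column.
  boolean apply(String column)
  {
    return required.contains(column);
  }

  public static void main(String[] args)
  {
    ColumnsFilterSketch filter = fromSchema(
        "__time",
        Arrays.asList("country"),
        Arrays.asList("bytes"),
        Arrays.asList("rawUrl")
    );
    System.out.println(filter.apply("rawUrl"));       // prints "true"
    System.out.println(filter.apply("unusedColumn")); // prints "false"
  }
}
```

In this sketch a column that is only an input to a transform (`rawUrl`) is still read, which is the case the description above says was previously broken.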
* Various fixups.
* Uncomment incorrectly commented lines.
* Move TransformSpecTest to the proper module.
* Add druid.indexer.task.ignoreTimestampSpecForDruidInputSource setting.
* Fix.
* Fix build.
* Checkstyle.
* Misc fixes.
* Fix test.
* Move config.
* Fix imports.
* Fixup.
* Fix ShuffleResourceTest.
* Add import.
* Smarter exclusions.
* Fixes based on tests.
Also, add TIME_COLUMN constant in the web console.
* Adjustments for tests.
* Reorder test data.
* Update docs.
* Update docs to say Druid 0.22.0 instead of 0.21.0.
* Fix test.
* Fix ITAutoCompactionTest.
* Changes from review & from merging.
* first pass compaction refactor. includes updated behavior for queryGranularity. removes duplicated doc
* fix links, typos, some reorganization
* fix spelling. TBD still there for work in progress
* updates tutorial examples, adds more clarification around compaction use cases
* add granularity spec to automatic compaction config
* final edits
* spelling fixes
* apply suggestions from review
* updates from review
* last edits
* move note
* clarify null
* fix links & spelling
* latest review
* edits to auto-compaction config
* add back rollup
* fix links & spelling
* Update compaction.md
add granularityspec to example
* Fix auto-compaction with segment granularity retrieving incomplete segments from the timeline when intervals overlap
* address comments
* Auto-compaction with segment granularity should skip segments that already have the configured segmentGranularity
* address comments
* Fix regression introduced by #11008
* Add back and tweak the check to not inspect resources for authorization when AllowAllAuthorizer is configured.
Add a unit test to validate that the change doesn't introduce new behavior.
Size HashMap and HashSet appropriately. Perf analysis of the queries
revealed that over 25% of the query time was spent in resizing HashMap and HashSet
collections. Also, prevent the need to examine and authorize all resources when
AllowAllAuthorizer is the configured authorizer.
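As an illustration of the sizing point above, here is a minimal sketch of presizing a HashMap and HashSet when the expected element count is known. It assumes the default load factor of 0.75; the class and method names (`PresizedCollections`, `capacityFor`, `indexByLength`) are hypothetical and are not taken from the Druid change.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PresizedCollections
{
  // Smallest initial capacity that holds expectedSize entries without a
  // resize, given the default load factor of 0.75.
  static int capacityFor(int expectedSize)
  {
    return (int) Math.ceil(expectedSize / 0.75d);
  }

  // Builds a map sized up front, so inserting the names triggers no rehashing.
  static Map<String, Integer> indexByLength(List<String> names)
  {
    Map<String, Integer> result = new HashMap<>(capacityFor(names.size()));
    for (String name : names) {
      result.put(name, name.length());
    }
    return result;
  }

  public static void main(String[] args)
  {
    List<String> resources = Arrays.asList("wikipedia", "metrics", "logs");
    Set<String> seen = new HashSet<>(capacityFor(resources.size()));
    seen.addAll(resources);

    System.out.println(capacityFor(100));                // prints "134"
    System.out.println(indexByLength(resources).size()); // prints "3"
    System.out.println(seen.size());                     // prints "3"
  }
}
```

Passing the raw element count as the initial capacity is not enough: the table grows once the size exceeds capacity times load factor, so the capacity has to be scaled up by the load factor as shown.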
* expression filter support for vectorized query engines
* remove unused code
* more tests
* refactor, more tests
* suppress
* more
* oops, i was wrong
* comment
* remove decorate, object dimension selector, more javadocs
* style
* log more info when deleting segments; add deleteSegments-related unit tests
* revert misc.xml
* code review
* use log.debugSegments
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
* Ability to use mirror of archive.apache.org
* Ability to use mirror of archive.apache.org: documentation
* Ability to use mirror of archive.apache.org: fix int test Dockerfile: missing COPY instruction
* Suppress logging for some exceptions to reduce excessive stack trace messages
Signed-off-by: frank chen <frank.chen021@outlook.com>
* log message for channel disconnected exception
Signed-off-by: frank chen <frank.chen021@outlook.com>
* license.yaml fixes for code introduced related to AWS RDS token based password provider in PR #9518
* add notice for commons-dbcp in license file
* add version and update NOTICE file
* fix SQL issue for group by queries with time filter that gets optimized to false
* short circuit always false in CombineAndSimplifyBounds
* adjust
* javadocs
* add preconditions for and/or filters to ensure they have children
* add comments, remove preconditions
* prometheus-emitter
* use existing jetty server to expose prometheus collection endpoint
* unused variables
* better variable names
* removed unused dependencies
* more metric definitions
* reorganize
* use prometheus HTTPServer instead of hooking into Jetty server
* temporary empty help string
* temporary non-empty help. fix incorrect dimension value in JSON (also updated statsd json)
* added full help text. added metric conversion factor for timers that are not using seconds. Correct metric dimension name in documentation
* added documentation for prometheus emitter
* safety for invalid labelNames
* fix travis checks
* Unit test and better sanitization of metrics names and label values
* add precondition to check namespace against regex
* use precompiled regex
* remove static imports. fix metric types
* better docs. fix possible NPE in PrometheusEmitterConfig. Guard against multiple calls to PrometheusEmitter.start()
* Update regex for label-value replacements to allow internal numeric values. Additional tests
* Adds missing license header
updates website/.spelling to add words used in prometheus-emitter docs.
updates docs/operations/metrics.md to correct the spelling of
bufferPoolName
* fixes version in extensions-contrib/prometheus-emitter
* fix style guide errors
* update import ordering
* add another word to website/.spelling
* remove unthrown declared exception
* remove unused import
* Pushgateway strategy for metrics
* typo
* Format fix and nullable strategy
* Update pom file for prometheus-emitter
* address code review comments: change cache metrics from counter to gauge, add periodic task to push to Pushgateway
* Syntax fix
* Allow numeric characters in the dimension label regex again, fixing the previous commit
* bump prometheus-emitter pom dev version
* Remove scheduled task inside peon that pushes metrics
* Fix checkstyle
* Unit test coverage
* Spelling
* Doc fix
* spelling
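The name-sanitization commits above (precompiled regex, precondition on the namespace) can be sketched roughly as follows. This is a minimal illustration that assumes the Prometheus metric-name rule `[a-zA-Z_:][a-zA-Z0-9_:]*`; the class `MetricNameSanitizer` and its methods are hypothetical and do not reproduce the emitter's actual code or its exact replacement rules.

```java
import java.util.regex.Pattern;

final class MetricNameSanitizer
{
  // Compiled once and reused, rather than recompiled on every emitted metric.
  private static final Pattern INVALID_CHARS = Pattern.compile("[^a-zA-Z0-9_:]");
  private static final Pattern VALID_NAME = Pattern.compile("[a-zA-Z_:][a-zA-Z0-9_:]*");

  // Replaces characters Prometheus does not accept with '_' and prefixes
  // names that are empty or start with a digit.
  static String sanitize(String name)
  {
    String replaced = INVALID_CHARS.matcher(name).replaceAll("_");
    if (replaced.isEmpty() || Character.isDigit(replaced.charAt(0))) {
      return "_" + replaced;
    }
    return replaced;
  }

  // Precondition-style check, e.g. for a configured namespace.
  static boolean isValid(String name)
  {
    return VALID_NAME.matcher(name).matches();
  }

  public static void main(String[] args)
  {
    System.out.println(sanitize("query/time"));  // prints "query_time"
    System.out.println(sanitize("2xx-count"));   // prints "_2xx_count"
    System.out.println(isValid("query/time"));   // prints "false"
  }
}
```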
Co-authored-by: Michael Schiff <michael.schiff@tubemogul.com>
Co-authored-by: Michael Schiff <schiff.michael@gmail.com>
Co-authored-by: Tianxin Zhao <tianxin.zhao@tubemogul.com>
Co-authored-by: Tianxin Zhao <tizhao@adobe.com>