* Optimization for expressions that hit a single long column.
There was previously a single-long-input optimization that applied only
to the time column. These have been combined together. Also adds
type-specific value caching to ExprEval, which allowed simplifying
the SingleLongInputCachingExpressionColumnValueSelector code.
* Add more benchmarks.
* Don't use LRU cache for __time.
* Simplify a bit.
* Let the cache grow.
* Prohibit some guava collection APIs and use JDK APIs directly
* reset files that changed by accident
* sort codestyle/druid-forbidden-apis.txt alphabetically
* add PrefixFilteredDimensionSpec for multi-value dimensions
* add docs for PrefixFilteredDimensionSpec
* remove unnecessary null handling
* add null check to the result of NullHandling
* Add optional `name` to top level of FilteredAggregatorFactory
* Add compat constructor for tests
* Address comments
* Add equals and hash code updates
* Rename test
* Fix imports and code style
This PR accumulates many refactorings and small improvements that I did while preparing the next change set of https://github.com/druid-io/druid/projects/2. I finally decided to make them a separate PR to minimize the volume of the main PR.
Some of the changes:
- Renamed confusing "Generic Column" term to "Numeric Column" (what it actually implies) in many class names.
- Generified `ComplexMetricExtractor`
* Added backpressure metric
* Updated channelReadable to AtomicBoolean and fixed broken test
* Moved backpressure metric logic to NettyHttpClient
* Fix placement of calculating backPressureDuration
Possibly related to https://github.com/apache/incubator-druid/issues/4937
--------
There is currently a race condition in IncrementalIndexStorageAdapter that can lead to exceptions like the following, when running queries with filters on String dimensions that hit realtime tasks:
```
org.apache.druid.java.util.common.ISE: id[5] >= maxId[5]
at org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector.lookupName(StringDimensionIndexer.java:591)
at org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector$2.matches(StringDimensionIndexer.java:562)
at org.apache.druid.segment.incremental.IncrementalIndexStorageAdapter$IncrementalIndexCursor.advance(IncrementalIndexStorageAdapter.java:284)
```
When the `filterMatcher` is created in the constructor of `IncrementalIndexStorageAdapter.IncrementalIndexCursor`, `StringDimensionIndexer.makeDimensionSelector` gets called eventually, which calls:
```
final int maxId = getCardinality();
...
@Override
public int getCardinality()
{
return dimLookup.size();
}
```
So `maxId` is set to the size of the dictionary at the time that the `filterMatcher` is created.
However, the `maxRowIndex` which is meant to prevent the Cursor from returning rows that were added after the Cursor was created (see https://github.com/apache/incubator-druid/pull/4049) is set after the `filterMatcher` is created.
If rows with new dictionary values are added after the `filterMatcher` is created but before `maxRowIndex` is set, then it is possible for the Cursor to return rows that contain the new values, which will have `id >= maxId`.
This PR sets `maxRowIndex` before creating the `filterMatcher` to prevent rows with unknown dictionary IDs from being passed to the `filterMatcher`.
-----------
The included test triggers the error with a custom Filter + DruidPredicateFactory.
The DimensionSelector for predicate-based filter matching is created here in `Filters.makeValueMatcher`:
```
public static ValueMatcher makeValueMatcher(
final ColumnSelectorFactory columnSelectorFactory,
final String columnName,
final DruidPredicateFactory predicateFactory
)
{
final ColumnCapabilities capabilities = columnSelectorFactory.getColumnCapabilities(columnName);
// This should be folded into the ValueMatcherColumnSelectorStrategy once that can handle LONG typed columns.
if (capabilities != null && capabilities.getType() == ValueType.LONG) {
return getLongPredicateMatcher(
columnSelectorFactory.makeColumnValueSelector(columnName),
predicateFactory.makeLongPredicate()
);
}
final ColumnSelectorPlus<ValueMatcherColumnSelectorStrategy> selector =
DimensionHandlerUtils.createColumnSelectorPlus(
ValueMatcherColumnSelectorStrategyFactory.instance(),
DefaultDimensionSpec.of(columnName),
columnSelectorFactory
);
return selector.getColumnSelectorStrategy().makeValueMatcher(selector.getSelector(), predicateFactory);
}
```
The test Filter adds a row to the IncrementalIndex in the test when the predicateFactory creates a new String predicate, after `DimensionHandlerUtils.createColumnSelectorPlus` is called.
* Broker backpressure.
Adds a new property "druid.broker.http.maxQueuedBytes" and a new context
parameter "maxQueuedBytes". Both represent a maximum number of bytes queued
per query before exerting backpressure on the channel to the data server.
Fixes#4933.
* Fix query context doc.
* make COMPLEX column filterable in Druid code
* Revert "make COMPLEX column filterable in Druid code"
This reverts commit 9fc6ec768c.
* complex columns can be optionally made filterable
* some types are always filterable
* add ColumnCapabilitiesImpl serde tests
* add SuppresedWarnings annotation
* Rename io.druid to org.apache.druid.
* Fix META-INF files and remove some benchmark results.
* MonitorsConfig update for metrics package migration.
* Reorder some dimensions in inner queries for some reason.
* Fix protobuf tests.
* add subtotalsSpec attribute to groupBy query
* dont sent subtotalsSpec to downstream nodes from broker and other updates
* address review comment
* fix checkstyle issues after merge to master
* add docs for subtotalsSpec feature
* address doc review comments
The bug was caused by makeExprEvalSelector returning a null object, which
it isn't supposed to do. Fixed this by renaming ConstantColumnValueSelector
to ConstantExprEvalSelector (it was only used for ExprEval anyway) and
putting logic in that class to make sure the selectors behave as expected.
* Fix all inspection errors currently reported.
TeamCity builds on master are reporting inspection errors, possibly
because there was a while where it was not running due to the Apache
migration, and there was some drift.
* Fix one more location.
* Fix tests.
* Another fix.
* Fix four bugs with numeric dimension output types.
This patch includes the following bug fixes:
- TopNColumnSelectorStrategyFactory: Cast dimension values to the output type
during dimExtractionScanAndAggregate instead of updateDimExtractionResults.
This fixes a bug where, for example, grouping on doubles-cast-to-longs would
fail to merge two doubles that should have been combined into the same long value.
- TopNQueryEngine: Use DimExtractionTopNAlgorithm when treating string columns
as numeric dimensions. This fixes a similar bug: grouping on string-cast-to-long
would fail to merge two strings that should have been combined.
- GroupByQuery: Cast numeric types to the expected output type before comparing them
in compareDimsForLimitPushDown. This fixes#6123.
- GroupByQueryQueryToolChest: Convert Jackson-deserialized dimension values into
the proper output type. This fixes an inconsistency between results that came
from cache vs. not-cache: for example, Jackson sometimes deserializes integers
as Integers and sometimes as Longs.
And the following code-cleanup changes, related to the fixes above:
- DimensionHandlerUtils: Introduce convertObjectToType, compareObjectsAsType,
and converterFromTypeToType to make it easier to handle casting operations.
- TopN in general: Rename various "dimName" variables to "dimValue" where they
actually represent dimension values. The old names were confusing.
* Remove unused imports.
* Cache: Add maxEntrySize config.
The idea is this makes it more feasible to cache query types that
can potentially generate large result sets, like groupBy and select,
without fear of writing too much to the cache per query.
Includes a refactor of cache population code in CachingQueryRunner and
CachingClusteredClient, such that they now use the same CachePopulator
interface with two implementations: one for foreground and one for
background.
The main reason for splitting the foreground / background impls is
that the foreground impl can have a more effective implementation of
maxEntrySize. It can stop retaining subvalues for the cache early.
* Add CachePopulatorStats.
* Fix whitespace.
* Fix docs.
* Fix various tests.
* Add tests.
* Fix tests.
* Better tests
* Remove conflict markers.
* Fix licenses.
* order using IncrementalIndexRowComparator at persist time when rollup is disabled, allowing increased effectiveness of dimension compression, resolves#6066
* fix stuff from review
* Optimize per-segment queries
* Always optimize, add unit test
* PR comments
* Only run IntervalDimFilter optimization on __time column
* PR comments
* Checkstyle fix
* Add test for non __time column
* Add lastString and firstString aggregators extension
* Remove duplicated class
* Move first-last-string doc page to extensions-contrib
* Fix ObjectStrategy compare method
* Fix doc bad aggregatos type name
* Create FoldingAggregatorFactory classes to fix SegmentMetadataQuery
* Add getMaxStringBytes() method to support JSON serialization
* Fix null pointer exception at segment creation phase when the string value is null
* Control the valueSelector object class on BufferAggregators
* Perform all improvements
* Add java doc on SerializablePairLongStringSerde
* Refactor ObjectStraty compare method
* Remove unused ;
* Add aggregateCombiner unit tests. Rename BufferAggregators unit tests
* Remove unused imports
* Add license header
* Add class name to java doc class serde
* Throw exception if value is unsupported class type
* Move first-last-string extension into druid core
* Update druid core docs
* Fix null pointer exception when pair->string is null
* Add null control unit tests
* Remove unused imports
* Add first/last string folding aggregator on AggregatorsModule to support segment metadata query
* Change SerializablePairLongString to extend SerializablePair
* Change vars from public to private
* Convert vars to primitive type
* Clarify compare comment
* Change IllegalStateException to ISE
* Remove TODO comments
* Control possible null pointer exception
* Add @Nullable annotation
* Remove empty line
* Remove unused parameter type
* Improve AggregatorCombiner javadocs
* Add filterNullValues option at StringLast and StringFirst aggregators
* Add filterNullValues option at agg documentation
* Fix checkstyle
* Update header license
* Fix StringFirstAggregatorFactory.VALUE_COMPARATOR
* Fix StringFirstAggregatorCombiner
* Fix if condition at StringFirstAggregateCombiner
* Remove filterNullValues from string first/last aggregators
* Add isReset flag in FirstAggregatorCombiner
* Change Arrays.asList to Collections.singletonList
* Fix 'auto' encoded longs + compression serializer
Fixes#6044
changes:
* Fixes `VSizeLongSerde` serializers to treat 'close' as 'flush' when used with `BlockLayoutColumnarLongsSerializer`, allowing unwritten values to be flushed to the buffer when the block is compressed
* Add exhaustive unit test that flexes a variety of value sizes, row counts, and compression strategies to catch issues such as these
:
* refactor LongSerializer close to be named flush instead
* revert and just make new serializers per block
* Various changes about druid-services module
* Patch improvements from reviewer
* Add ToArrayCallWithZeroLengthArrayArgument & ArraysAsListWithZeroOrOneArgument into inspection profile
* Fix ArraysAsListWithZeroOrOneArgument
* Fix conflict
* Fix ToArrayCallWithZeroLengthArrayArgument
* Fix AliEqualsAvoidNull
* Remove blank line
* Remove unused import clauses
* Fix code style in TopNQueryRunnerTest
* Fix conflict
* Don't use Collections.singletonList when converting the type of array type
* Add argLine into maven-surefire-plugin in druid-process module & increase the timeout value for testMoveSegment testcase
* Roll back the latest commit
* Add java.io.File#toURL() into druid-forbidden-apis
* Using Boolean.parseBoolean instead of Boolean.valueOf for CliCoordinator#isOverlord
* Add a new regexp element into stylecode xml file
* Fix style error for new regexp
* Set the level of ArraysAsListWithZeroOrOneArgument as WARNING
* Fix style error for new regexp
* Add option BY_LEVEL for ToArrayCallWithZeroLengthArrayArgument in inspection profile
* Roll back the level as ToArrayCallWithZeroLengthArrayArgument as ERROR
* Add toArray(new Object[0]) regexp into checkstyle config file & fix them
* Set the level of ArraysAsListWithZeroOrOneArgument as ERROR & Roll back the level of ToArrayCallWithZeroLengthArrayArgument as WARNING until Youtrack fix it
* Add a comment for string equals regexp in checkstyle config
* Fix code format
* Add RedundantTypeArguments as ERROR level inspection
* Fix cannot resolve symbol datasource
False failures on Travis due to spurious timeout (in turn due to noisy
neighbors) is a bigger problem than legitimate failures taking too long
to time out. So it makes sense to extend timeouts.
* Lazy-ify ValueMatcher BitSet optimization for string dimensions.
The idea is that if the prior evaluated filters are decently selective,
such that they mean we won't see all possible values of the later
filters, then the eager version of the optimization is too wasteful.
This involves checking an extra bitset, but the overhead is small even
if the lazy-ification is useless.
* Remove import.
* Minor transformation
* fixes#5814
changes:
* pass `StorageAdapter` to topn algorithms to get things like if column is 'sorted' or if query interval is smaller than segment granularity, instead of using `io.druid.segment.Capabilities`
* remove `io.druid.segment.Capabilities` since it had one purpose, supplying `dimensionValuesSorted` which is now provided directly by `StorageAdapter`.
* added test for topn optimization path checking
* add Capabilities back since StorageAdapter is marked PublicApi
* oops
* add javadoc, fix build i think
* correctly revert api changes
* fix intellij fail
* fix typo :(
* Revert "Consider waiting and pending compaction tasks as well as running tasks in DruidCoordinatorSegmentCompactor (#5704)"
This reverts commit c7a59394e0.
* Revert "Fix metrics for inserting segments (#5749)"
This reverts commit c9d645103b.
* Revert "Typo fix in historical doc (#5753)"
This reverts commit aa23fe6386.
* Revert "Use a bimap for reverse lookups on injective maps (#5681)"
This reverts commit e1277d306c.
* The check for maxBytesInMemory should be >= 0 instead of > 0
* if the default value is 0, the actual check could be skipped
* fix the message for persistReasons
* Address PR comments
* if maxBytes set -1, make is Long.MAX_VAL, so we do not need to check if it's 0 or -1
* set the maxBytesTuningconfig in AppenderatorImpl constructor to avoid duplicate code
* fix the failing test cases
* Address PR comments
* This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks
Currently a config called 'maxRowsInMemory' is present which affects how much memory gets
used for indexing.If this value is not optimal for your JVM heap size, it could lead
to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might
be bad for query performance and a higher value will limit number of persists but require
more jvm heap space and could lead to OOM.
'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes
kept in memory before persisting.
* The default value is 1/3(Runtime.maxMemory())
* To maintain the current behaviour set 'maxBytesInMemory' to -1
* If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them
will be respected i.e. the first one to go above threshold will trigger persist
* Fix check style and remove a comment
* Add overlord unsecured paths to coordinator when using combined service (#5579)
* Add overlord unsecured paths to coordinator when using combined service
* PR comment
* More error reporting and stats for ingestion tasks (#5418)
* Add more indexing task status and error reporting
* PR comments, add support in AppenderatorDriverRealtimeIndexTask
* Use TaskReport instead of metrics/context
* Fix tests
* Use TaskReport uploads
* Refactor fire department metrics retrieval
* Refactor input row serde in hadoop task
* Refactor hadoop task loader names
* Truncate error message in TaskStatus, add errorMsg to task report
* PR comments
* Allow getDomain to return disjointed intervals (#5570)
* Allow getDomain to return disjointed intervals
* Indentation issues
* Adding feature thetaSketchConstant to do some set operation in PostAgg (#5551)
* Adding feature thetaSketchConstant to do some set operation in PostAggregator
* Updated review comments for PR #5551 - Adding thetaSketchConstant
* Fixed CI build issue
* Updated review comments 2 for PR #5551 - Adding thetaSketchConstant
* Fix taskDuration docs for KafkaIndexingService (#5572)
* With incremental handoff the changed line is no longer true.
* Add doc for automatic pendingSegments (#5565)
* Add missing doc for automatic pendingSegments
* address comments
* Fix indexTask to respect forceExtendableShardSpecs (#5509)
* Fix indexTask to respect forceExtendableShardSpecs
* add comments
* Deprecate spark2 profile in pom.xml (#5581)
Deprecated due to https://github.com/druid-io/druid/pull/5382
* CompressionUtils: Add support for decompressing xz, bz2, zip. (#5586)
Also switch various firehoses to the new method.
Fixes#5585.
* This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks
Currently a config called 'maxRowsInMemory' is present which affects how much memory gets
used for indexing.If this value is not optimal for your JVM heap size, it could lead
to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might
be bad for query performance and a higher value will limit number of persists but require
more jvm heap space and could lead to OOM.
'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes
kept in memory before persisting.
* The default value is 1/3(Runtime.maxMemory())
* To maintain the current behaviour set 'maxBytesInMemory' to -1
* If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them
will be respected i.e. the first one to go above threshold will trigger persist
* Address code review comments
* Fix the coding style according to druid conventions
* Add more javadocs
* Rename some variables/methods
* Other minor issues
* Address more code review comments
* Some refactoring to put defaults in IndexTaskUtils
* Added check for maxBytesInMemory in AppenderatorImpl
* Decrement bytes in abandonSegment
* Test unit test for multiple sinks in single appenderator
* Fix some merge conflicts after rebase
* Fix some style checks
* Merge conflicts
* Fix failing tests
Add back check for 0 maxBytesInMemory in OnHeapIncrementalIndex
* Address PR comments
* Put defaults for maxRows and maxBytes in TuningConfig
* Change/add javadocs
* Refactoring and renaming some variables/methods
* Fix TeamCity inspection warnings
* Added maxBytesInMemory config to HadoopTuningConfig
* Updated the docs and examples
* Added maxBytesInMemory config in docs
* Removed references to maxRowsInMemory under tuningConfig in examples
* Set maxBytesInMemory to 0 until used
Set the maxBytesInMemory to 0 if user does not set it as part of tuningConfing
and set to part of max jvm memory when ingestion task starts
* Update toString in KafkaSupervisorTuningConfig
* Use correct maxBytesInMemory value in AppenderatorImpl
* Update DEFAULT_MAX_BYTES_IN_MEMORY to 1/6 max jvm memory
Experimenting with various defaults, 1/3 jvm memory causes OOM
* Update docs to correct maxBytesInMemory default value
* Minor to rename and add comment
* Add more details in docs
* Address new PR comments
* Address PR comments
* Fix spelling typo
* Use mergeBuffer instead of processingBuffer in parallelCombiner
* Fix test
* address comments
* fix test
* Fix test
* Update comment
* address comments
* fix build
* Fix test failure
* Adding feature thetaSketchConstant to do some set operation in PostAggregator
* Updated review comments for PR #5551 - Adding thetaSketchConstant
* Fixed CI build issue
* Updated review comments 2 for PR #5551 - Adding thetaSketchConstant
* Add more indexing task status and error reporting
* PR comments, add support in AppenderatorDriverRealtimeIndexTask
* Use TaskReport instead of metrics/context
* Fix tests
* Use TaskReport uploads
* Refactor fire department metrics retrieval
* Refactor input row serde in hadoop task
* Refactor hadoop task loader names
* Truncate error message in TaskStatus, add errorMsg to task report
* PR comments
* Use the official aws-sdk instead of jet3t
* fix compile and serde tests
* address comments and fix test
* add http version string
* remove redundant dependencies, fix potential NPE, and fix test
* resolve TODOs
* fix build
* downgrade jackson version to 2.6.7
* fix test
* resolve the last TODO
* support proxy and endpoint configurations
* fix build
* remove debugging log
* downgrade hadoop version to 2.8.3
* fix tests
* remove unused log
* fix it test
* revert KerberosAuthenticator change
* change hadoop-aws scope to provided in hdfs-storage
* address comments
* address comments
* Future-proof some Guava usage
* Use a java-util EmptyIterator instead of Guava's
* Change some of the guava future handling to do manual async
transforms. Guava changes transform into transformAsync by deprecating
transform in ONLY Guava 19. Then its gone in 20
* Use `Collections.emptyIterator()`
* Pretty formatting
* Make listenable future transforms a thing in default druid
* Format fix
* Add forbidden guava apis
* Make the ListenableFutrues.transformAsync have comments
* Undo intellij bad pattern matching in comments
* Futrues --> Futures
* Add empty iterators forbidding
* Fix extra `A`
* Correct method signature
* Address review comments
* Finish Gian review comments
* Proper syntax from https://github.com/policeman-tools/forbidden-apis/wiki/SignaturesSyntax
* Fix round robining in router.
Say that ten times fast.
For query endpoints, AsyncQueryForwardingServlet called hostFinder.getDefaultServer()
to set a default server, followed by hostFinder.getServer(inputQuery) to override it
with query-specific routing. Since hostFinder is round-robin, this skips a server.
When there are only two servers, one server is _always_ skipped and the router sends
all queries to the same broker.
* Adjust spacing.
* SegmentMetadataQuery: Fix default interval handling.
PR #4131 introduced a new copy builder for segmentMetadata that did
not retain the value of usingDefaultInterval. This led to it being
dropped and the default-interval handling not working as expected.
Instead of using the default 1 week history when intervals are not
provided, the segmentMetadata query would query _all_ segments,
incurring an unexpected performance hit.
This patch fixes the bug and adds a test for the copy builder.
* Intervals