Possibly related to https://github.com/apache/incubator-druid/issues/4937
--------
There is currently a race condition in IncrementalIndexStorageAdapter that can lead to exceptions like the following, when running queries with filters on String dimensions that hit realtime tasks:
```
org.apache.druid.java.util.common.ISE: id[5] >= maxId[5]
at org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector.lookupName(StringDimensionIndexer.java:591)
at org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector$2.matches(StringDimensionIndexer.java:562)
at org.apache.druid.segment.incremental.IncrementalIndexStorageAdapter$IncrementalIndexCursor.advance(IncrementalIndexStorageAdapter.java:284)
```
When the `filterMatcher` is created in the constructor of `IncrementalIndexStorageAdapter.IncrementalIndexCursor`, `StringDimensionIndexer.makeDimensionSelector` gets called eventually, which calls:
```
final int maxId = getCardinality();
...
@Override
public int getCardinality()
{
return dimLookup.size();
}
```
So `maxId` is set to the size of the dictionary at the time that the `filterMatcher` is created.
However, the `maxRowIndex` which is meant to prevent the Cursor from returning rows that were added after the Cursor was created (see https://github.com/apache/incubator-druid/pull/4049) is set after the `filterMatcher` is created.
If rows with new dictionary values are added after the `filterMatcher` is created but before `maxRowIndex` is set, then it is possible for the Cursor to return rows that contain the new values, which will have `id >= maxId`.
This PR sets `maxRowIndex` before creating the `filterMatcher` to prevent rows with unknown dictionary IDs from being passed to the `filterMatcher`.
-----------
The included test triggers the error with a custom Filter + DruidPredicateFactory.
The DimensionSelector for predicate-based filter matching is created here in `Filters.makeValueMatcher`:
```
public static ValueMatcher makeValueMatcher(
final ColumnSelectorFactory columnSelectorFactory,
final String columnName,
final DruidPredicateFactory predicateFactory
)
{
final ColumnCapabilities capabilities = columnSelectorFactory.getColumnCapabilities(columnName);
// This should be folded into the ValueMatcherColumnSelectorStrategy once that can handle LONG typed columns.
if (capabilities != null && capabilities.getType() == ValueType.LONG) {
return getLongPredicateMatcher(
columnSelectorFactory.makeColumnValueSelector(columnName),
predicateFactory.makeLongPredicate()
);
}
final ColumnSelectorPlus<ValueMatcherColumnSelectorStrategy> selector =
DimensionHandlerUtils.createColumnSelectorPlus(
ValueMatcherColumnSelectorStrategyFactory.instance(),
DefaultDimensionSpec.of(columnName),
columnSelectorFactory
);
return selector.getColumnSelectorStrategy().makeValueMatcher(selector.getSelector(), predicateFactory);
}
```
The test Filter adds a row to the IncrementalIndex in the test when the predicateFactory creates a new String predicate, after `DimensionHandlerUtils.createColumnSelectorPlus` is called.
* fix opentsdb emitter always be running
* check if emitter started
* add more details about consumeDelay in doc
* fix possible thread unsafe
* fix fail sending tags whose value contain colon
* 'suspend' and 'resume' support for kafka indexing service
changes:
* introduces `SuspendableSupervisorSpec` interface to describe supervisors which support suspend/resume functionality controlled through the `SupervisorManager`, which will gracefully shutdown the supervisor and it's tasks, update it's `SupervisorSpec` with either a suspended or running state, and update with the toggled spec. Spec updates are provided by `SuspendableSupervisorSpec.createSuspendedSpec` and `SuspendableSupervisorSpec.createRunningSpec` respectively.
* `KafkaSupervisorSpec` extends `SuspendableSupervisorSpec` and now supports suspend/resume functionality. The difference in behavior between 'running' and 'suspended' state is whether the supervisor will attempt to ensure that indexing tasks are or are not running respectively. Behavior is identical otherwise.
* `SupervisorResource` now provides `/druid/indexer/v1/supervisor/{id}/suspend` and `/druid/indexer/v1/supervisor/{id}/resume` which are used to suspend/resume suspendable supervisors
* Deprecated `/druid/indexer/v1/supervisor/{id}/shutdown` and moved it's functionality to `/druid/indexer/v1/supervisor/{id}/terminate` since 'shutdown' is ambiguous verbage for something that effectively stops a supervisor forever
* Added ability to get all supervisor specs from `/druid/indexer/v1/supervisor` by supplying the 'full' query parameter `/druid/indexer/v1/supervisor?full` which will return a list of json objects of the form `{"id":<id>, "spec":<SupervisorSpec>}`
* Updated overlord console ui to enable suspend/resume, and changed 'shutdown' to 'terminate'
* move overlord console status to own column in supervisor table so does not look like garbage
* spacing
* padding
* other kind of spacing
* fix rebase fail
* fix more better
* all supervisors now suspendable, updated materialized view supervisor to support suspend, more tests
* fix log
* allow 1 retry for failing tests idk if this is a good idea, but false failure rate due to flaky tests seems pretty bad lately
* try to fix retry issue with teardown
* Update pom.xml
* Update pom.xml
* ParseSpec: Remove default setting.
Having a default ParseSpec implementation is bad for users, because it masks
problems specifying the format. Two common problems masked by this are specifying
the "format" at the wrong level of the JSON, and specifying a format that
Druid doesn't support. In both cases, having a default implementation means that
users will get the delimited parser rather than an error, and then be confused
when, later on, their data failed to parse.
* Fix integration tests.
* Broker backpressure.
Adds a new property "druid.broker.http.maxQueuedBytes" and a new context
parameter "maxQueuedBytes". Both represent a maximum number of bytes queued
per query before exerting backpressure on the channel to the data server.
Fixes#4933.
* Fix query context doc.
* resolves#5898 by adding maxTotalRows to incremental publishing kafka index task and appenderator based realtime indexing task, as available in IndexTask
* address review comments
* changes due to review
* merge fail
* make COMPLEX column filterable in Druid code
* Revert "make COMPLEX column filterable in Druid code"
This reverts commit 9fc6ec768c.
* complex columns can be optionally made filterable
* some types are always filterable
* add ColumnCapabilitiesImpl serde tests
* add SuppresedWarnings annotation
* Rename io.druid to org.apache.druid.
* Fix META-INF files and remove some benchmark results.
* MonitorsConfig update for metrics package migration.
* Reorder some dimensions in inner queries for some reason.
* Fix protobuf tests.
* add subtotalsSpec attribute to groupBy query
* dont sent subtotalsSpec to downstream nodes from broker and other updates
* address review comment
* fix checkstyle issues after merge to master
* add docs for subtotalsSpec feature
* address doc review comments
The old code assumes that post-aggregator prefixes are one character
long followed by numbers. This isn't always true (we may pad with
underscores to avoid conflicts). Instead, the new code uses a different
base prefix for sort-project postaggregators ("s" instead of "p") and
uses the usual Calcites.findUnusedPrefix function to avoid conflicts.
The bug was caused by makeExprEvalSelector returning a null object, which
it isn't supposed to do. Fixed this by renaming ConstantColumnValueSelector
to ConstantExprEvalSelector (it was only used for ExprEval anyway) and
putting logic in that class to make sure the selectors behave as expected.
* SQL: Support more result formats, add columns header.
- Add result formats for line-based JSON and CSV.
- Add X-Druid-Sql-Columns header with a list of all columns that
the response will contain.
- Add more comprehensive documentation on what callers should expect
when making Druid SQL queries.
* Fix some tests.
* Adjust tests.
* Adjust trailer, add types header.
* Fix trailers.
* Fix all inspection errors currently reported.
TeamCity builds on master are reporting inspection errors, possibly
because there was a while where it was not running due to the Apache
migration, and there was some drift.
* Fix one more location.
* Fix tests.
* Another fix.
* Fix four bugs with numeric dimension output types.
This patch includes the following bug fixes:
- TopNColumnSelectorStrategyFactory: Cast dimension values to the output type
during dimExtractionScanAndAggregate instead of updateDimExtractionResults.
This fixes a bug where, for example, grouping on doubles-cast-to-longs would
fail to merge two doubles that should have been combined into the same long value.
- TopNQueryEngine: Use DimExtractionTopNAlgorithm when treating string columns
as numeric dimensions. This fixes a similar bug: grouping on string-cast-to-long
would fail to merge two strings that should have been combined.
- GroupByQuery: Cast numeric types to the expected output type before comparing them
in compareDimsForLimitPushDown. This fixes#6123.
- GroupByQueryQueryToolChest: Convert Jackson-deserialized dimension values into
the proper output type. This fixes an inconsistency between results that came
from cache vs. not-cache: for example, Jackson sometimes deserializes integers
as Integers and sometimes as Longs.
And the following code-cleanup changes, related to the fixes above:
- DimensionHandlerUtils: Introduce convertObjectToType, compareObjectsAsType,
and converterFromTypeToType to make it easier to handle casting operations.
- TopN in general: Rename various "dimName" variables to "dimValue" where they
actually represent dimension values. The old names were confusing.
* Remove unused imports.