The indexes introduced in #6348 were on the wrong table. The tests
did not catch them due to retries on the create table steps (the
first try created the table but not the bogus indexes; the second
try noticed that the table already existed and did nothing). This
patch doesn't fix the issue with the tests, since the best way to
do that would be to do the table and index creation in a
transaction; but, this is not supported by all of our supported
database engines.
* Adding licenses and enable apache-rat-plugi.
Change-Id: I4685a2d9f1e147855dba69329b286f2d5bee3c18
* restore the copywrite of demo_table and add it to the list of allowed ones
Change-Id: I2a9efde6f4b984bc1ac90483e90d98e71f818a14
* revirew comments
Change-Id: I0256c930b7f9a5bb09b44b5e7a149e6ec48cb0ca
* more fixup
Change-Id: I1355e8a2549e76cd44487abec142be79bec59de2
* align
Change-Id: I70bc47ecb577bdf6b91639dd91b6f5642aa6b02f
Possibly related to https://github.com/apache/incubator-druid/issues/4937
--------
There is currently a race condition in IncrementalIndexStorageAdapter that can lead to exceptions like the following, when running queries with filters on String dimensions that hit realtime tasks:
```
org.apache.druid.java.util.common.ISE: id[5] >= maxId[5]
at org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector.lookupName(StringDimensionIndexer.java:591)
at org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector$2.matches(StringDimensionIndexer.java:562)
at org.apache.druid.segment.incremental.IncrementalIndexStorageAdapter$IncrementalIndexCursor.advance(IncrementalIndexStorageAdapter.java:284)
```
When the `filterMatcher` is created in the constructor of `IncrementalIndexStorageAdapter.IncrementalIndexCursor`, `StringDimensionIndexer.makeDimensionSelector` gets called eventually, which calls:
```
final int maxId = getCardinality();
...
@Override
public int getCardinality()
{
return dimLookup.size();
}
```
So `maxId` is set to the size of the dictionary at the time that the `filterMatcher` is created.
However, the `maxRowIndex` which is meant to prevent the Cursor from returning rows that were added after the Cursor was created (see https://github.com/apache/incubator-druid/pull/4049) is set after the `filterMatcher` is created.
If rows with new dictionary values are added after the `filterMatcher` is created but before `maxRowIndex` is set, then it is possible for the Cursor to return rows that contain the new values, which will have `id >= maxId`.
This PR sets `maxRowIndex` before creating the `filterMatcher` to prevent rows with unknown dictionary IDs from being passed to the `filterMatcher`.
-----------
The included test triggers the error with a custom Filter + DruidPredicateFactory.
The DimensionSelector for predicate-based filter matching is created here in `Filters.makeValueMatcher`:
```
public static ValueMatcher makeValueMatcher(
final ColumnSelectorFactory columnSelectorFactory,
final String columnName,
final DruidPredicateFactory predicateFactory
)
{
final ColumnCapabilities capabilities = columnSelectorFactory.getColumnCapabilities(columnName);
// This should be folded into the ValueMatcherColumnSelectorStrategy once that can handle LONG typed columns.
if (capabilities != null && capabilities.getType() == ValueType.LONG) {
return getLongPredicateMatcher(
columnSelectorFactory.makeColumnValueSelector(columnName),
predicateFactory.makeLongPredicate()
);
}
final ColumnSelectorPlus<ValueMatcherColumnSelectorStrategy> selector =
DimensionHandlerUtils.createColumnSelectorPlus(
ValueMatcherColumnSelectorStrategyFactory.instance(),
DefaultDimensionSpec.of(columnName),
columnSelectorFactory
);
return selector.getColumnSelectorStrategy().makeValueMatcher(selector.getSelector(), predicateFactory);
}
```
The test Filter adds a row to the IncrementalIndex in the test when the predicateFactory creates a new String predicate, after `DimensionHandlerUtils.createColumnSelectorPlus` is called.
* fix opentsdb emitter always be running
* check if emitter started
* add more details about consumeDelay in doc
* fix possible thread unsafe
* fix fail sending tags whose value contain colon
* 'suspend' and 'resume' support for kafka indexing service
changes:
* introduces `SuspendableSupervisorSpec` interface to describe supervisors which support suspend/resume functionality controlled through the `SupervisorManager`, which will gracefully shutdown the supervisor and it's tasks, update it's `SupervisorSpec` with either a suspended or running state, and update with the toggled spec. Spec updates are provided by `SuspendableSupervisorSpec.createSuspendedSpec` and `SuspendableSupervisorSpec.createRunningSpec` respectively.
* `KafkaSupervisorSpec` extends `SuspendableSupervisorSpec` and now supports suspend/resume functionality. The difference in behavior between 'running' and 'suspended' state is whether the supervisor will attempt to ensure that indexing tasks are or are not running respectively. Behavior is identical otherwise.
* `SupervisorResource` now provides `/druid/indexer/v1/supervisor/{id}/suspend` and `/druid/indexer/v1/supervisor/{id}/resume` which are used to suspend/resume suspendable supervisors
* Deprecated `/druid/indexer/v1/supervisor/{id}/shutdown` and moved it's functionality to `/druid/indexer/v1/supervisor/{id}/terminate` since 'shutdown' is ambiguous verbage for something that effectively stops a supervisor forever
* Added ability to get all supervisor specs from `/druid/indexer/v1/supervisor` by supplying the 'full' query parameter `/druid/indexer/v1/supervisor?full` which will return a list of json objects of the form `{"id":<id>, "spec":<SupervisorSpec>}`
* Updated overlord console ui to enable suspend/resume, and changed 'shutdown' to 'terminate'
* move overlord console status to own column in supervisor table so does not look like garbage
* spacing
* padding
* other kind of spacing
* fix rebase fail
* fix more better
* all supervisors now suspendable, updated materialized view supervisor to support suspend, more tests
* fix log
* allow 1 retry for failing tests idk if this is a good idea, but false failure rate due to flaky tests seems pretty bad lately
* try to fix retry issue with teardown
* Update pom.xml
* Update pom.xml
* ParseSpec: Remove default setting.
Having a default ParseSpec implementation is bad for users, because it masks
problems specifying the format. Two common problems masked by this are specifying
the "format" at the wrong level of the JSON, and specifying a format that
Druid doesn't support. In both cases, having a default implementation means that
users will get the delimited parser rather than an error, and then be confused
when, later on, their data failed to parse.
* Fix integration tests.
* Broker backpressure.
Adds a new property "druid.broker.http.maxQueuedBytes" and a new context
parameter "maxQueuedBytes". Both represent a maximum number of bytes queued
per query before exerting backpressure on the channel to the data server.
Fixes#4933.
* Fix query context doc.
* resolves#5898 by adding maxTotalRows to incremental publishing kafka index task and appenderator based realtime indexing task, as available in IndexTask
* address review comments
* changes due to review
* merge fail
* make COMPLEX column filterable in Druid code
* Revert "make COMPLEX column filterable in Druid code"
This reverts commit 9fc6ec768c.
* complex columns can be optionally made filterable
* some types are always filterable
* add ColumnCapabilitiesImpl serde tests
* add SuppresedWarnings annotation