druid

Commit Graph

Author	SHA1	Message	Date
Jonathan Wei	609da01882	Fix dictionary ID race condition in IncrementalIndexStorageAdapter (#6340 ) Possibly related to https://github.com/apache/incubator-druid/issues/4937 -------- There is currently a race condition in IncrementalIndexStorageAdapter that can lead to exceptions like the following, when running queries with filters on String dimensions that hit realtime tasks: ``` org.apache.druid.java.util.common.ISE: id[5] >= maxId[5] at org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector.lookupName(StringDimensionIndexer.java:591) at org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector$2.matches(StringDimensionIndexer.java:562) at org.apache.druid.segment.incremental.IncrementalIndexStorageAdapter$IncrementalIndexCursor.advance(IncrementalIndexStorageAdapter.java:284) ``` When the `filterMatcher` is created in the constructor of `IncrementalIndexStorageAdapter.IncrementalIndexCursor`, `StringDimensionIndexer.makeDimensionSelector` gets called eventually, which calls: ``` final int maxId = getCardinality(); ... @Override public int getCardinality() { return dimLookup.size(); } ``` So `maxId` is set to the size of the dictionary at the time that the `filterMatcher` is created. However, the `maxRowIndex` which is meant to prevent the Cursor from returning rows that were added after the Cursor was created (see https://github.com/apache/incubator-druid/pull/4049) is set after the `filterMatcher` is created. If rows with new dictionary values are added after the `filterMatcher` is created but before `maxRowIndex` is set, then it is possible for the Cursor to return rows that contain the new values, which will have `id >= maxId`. This PR sets `maxRowIndex` before creating the `filterMatcher` to prevent rows with unknown dictionary IDs from being passed to the `filterMatcher`. ----------- The included test triggers the error with a custom Filter + DruidPredicateFactory. The DimensionSelector for predicate-based filter matching is created here in `Filters.makeValueMatcher`: ``` public static ValueMatcher makeValueMatcher( final ColumnSelectorFactory columnSelectorFactory, final String columnName, final DruidPredicateFactory predicateFactory ) { final ColumnCapabilities capabilities = columnSelectorFactory.getColumnCapabilities(columnName); // This should be folded into the ValueMatcherColumnSelectorStrategy once that can handle LONG typed columns. if (capabilities != null && capabilities.getType() == ValueType.LONG) { return getLongPredicateMatcher( columnSelectorFactory.makeColumnValueSelector(columnName), predicateFactory.makeLongPredicate() ); } final ColumnSelectorPlus<ValueMatcherColumnSelectorStrategy> selector = DimensionHandlerUtils.createColumnSelectorPlus( ValueMatcherColumnSelectorStrategyFactory.instance(), DefaultDimensionSpec.of(columnName), columnSelectorFactory ); return selector.getColumnSelectorStrategy().makeValueMatcher(selector.getSelector(), predicateFactory); } ``` The test Filter adds a row to the IncrementalIndex in the test when the predicateFactory creates a new String predicate, after `DimensionHandlerUtils.createColumnSelectorPlus` is called.	2018-09-18 10:43:29 +04:00
Dayue Gao	edf0c13807	add a sql option to force user to specify time condition (#6246 ) * add a sql option to force user to specify time condition * rename forceTimeCondition to requireTimeCondition, refine error message	2018-09-17 13:52:24 -07:00
Hongze Zhang	2fac6743d4	Add maxIdleTime option to EventReceiverFirehose (#5997 )	2018-09-17 13:50:56 -07:00
QiuMM	dabaf4caf8	fix NoClassDefFoundError when using SysMonitor (#6300 )	2018-09-14 14:47:15 -07:00
Roman Leventov	0c4bd2b57b	Prohibit some Random usage patterns (#6226 ) * Prohibit Random usage patterns * Fix FlattenJSONBenchmarkUtil	2018-09-14 13:35:51 -07:00
QiuMM	288aa4d504	Add missing metadata table information in docs (#6309 ) * Add missing metadata table information in doc file * address review comment	2018-09-14 12:17:05 -07:00
QiuMM	85391e9fb3	fix opentsdb emitter always be running and fail sending tags whose value contains colon (#6251 ) * fix opentsdb emitter always be running * check if emitter started * add more details about consumeDelay in doc * fix possible thread unsafe * fix fail sending tags whose value contain colon	2018-09-14 12:14:15 -07:00
QiuMM	87ccee05f7	Add ability to specify list of task ports and port range (#6263 ) * support specify list of task ports * fix typos * address comments * remove druid.indexer.runner.separateIngestionEndpoint config * tweak doc * fix doc * code cleanup * keep some useful comments	2018-09-13 19:36:04 -07:00
Roman Leventov	d50b69e6d4	Prohibit LinkedList (#6112 ) * Prohibit LinkedList * Fix tests * Fix * Remove unused import	2018-09-13 18:07:06 -07:00
Jonathan Wei	fd6786ac6c	Fix endpoint permissions section in basic-security docs (#6331 )	2018-09-13 15:23:41 -07:00
Clint Wylie	91a37c692d	'suspend' and 'resume' support for supervisors (kafka indexing service, materialized views) (#6234 ) * 'suspend' and 'resume' support for kafka indexing service changes: * introduces `SuspendableSupervisorSpec` interface to describe supervisors which support suspend/resume functionality controlled through the `SupervisorManager`, which will gracefully shutdown the supervisor and it's tasks, update it's `SupervisorSpec` with either a suspended or running state, and update with the toggled spec. Spec updates are provided by `SuspendableSupervisorSpec.createSuspendedSpec` and `SuspendableSupervisorSpec.createRunningSpec` respectively. * `KafkaSupervisorSpec` extends `SuspendableSupervisorSpec` and now supports suspend/resume functionality. The difference in behavior between 'running' and 'suspended' state is whether the supervisor will attempt to ensure that indexing tasks are or are not running respectively. Behavior is identical otherwise. * `SupervisorResource` now provides `/druid/indexer/v1/supervisor/{id}/suspend` and `/druid/indexer/v1/supervisor/{id}/resume` which are used to suspend/resume suspendable supervisors * Deprecated `/druid/indexer/v1/supervisor/{id}/shutdown` and moved it's functionality to `/druid/indexer/v1/supervisor/{id}/terminate` since 'shutdown' is ambiguous verbage for something that effectively stops a supervisor forever * Added ability to get all supervisor specs from `/druid/indexer/v1/supervisor` by supplying the 'full' query parameter `/druid/indexer/v1/supervisor?full` which will return a list of json objects of the form `{"id":<id>, "spec":<SupervisorSpec>}` * Updated overlord console ui to enable suspend/resume, and changed 'shutdown' to 'terminate' * move overlord console status to own column in supervisor table so does not look like garbage * spacing * padding * other kind of spacing * fix rebase fail * fix more better * all supervisors now suspendable, updated materialized view supervisor to support suspend, more tests * fix log	2018-09-13 14:42:18 -07:00
Clint Wylie	96a1076e23	allow 3 retries for failing tests (#6324 ) * allow 1 retry for failing tests idk if this is a good idea, but false failure rate due to flaky tests seems pretty bad lately * try to fix retry issue with teardown * Update pom.xml * Update pom.xml	2018-09-11 19:16:59 -07:00
Gian Merlino	7f3a0dae28	ParseSpec: Remove default setting. (#6310 ) * ParseSpec: Remove default setting. Having a default ParseSpec implementation is bad for users, because it masks problems specifying the format. Two common problems masked by this are specifying the "format" at the wrong level of the JSON, and specifying a format that Druid doesn't support. In both cases, having a default implementation means that users will get the delimited parser rather than an error, and then be confused when, later on, their data failed to parse. * Fix integration tests.	2018-09-11 19:16:19 -07:00
Gian Merlino	d6cbdf86c2	Broker backpressure. (#6313 ) * Broker backpressure. Adds a new property "druid.broker.http.maxQueuedBytes" and a new context parameter "maxQueuedBytes". Both represent a maximum number of bytes queued per query before exerting backpressure on the channel to the data server. Fixes #4933. * Fix query context doc.	2018-09-10 09:33:29 -07:00
Gian Merlino	4669f0878f	SQL: UNION ALL operator. (#6314 ) * SQL: UNION ALL operator. * Remove unused import.	2018-09-09 22:32:56 -07:00
Clint Wylie	e6e068ce60	Add support for 'maxTotalRows' to incremental publishing kafka indexing task and appenderator based realtime task (#6129 ) * resolves #5898 by adding maxTotalRows to incremental publishing kafka index task and appenderator based realtime indexing task, as available in IndexTask * address review comments * changes due to review * merge fail	2018-09-07 13:17:49 -07:00
Clint Wylie	e095f63e8e	fix coordinator console loading (#6276 )	2018-09-06 16:59:51 -07:00
Jonathan Wei	60cbc64472	Use PasswordProvider, fix info on initial passwords in basic security extension docs (#6303 ) * Fix info on initial passwords in basic security extension docs * Use PasswordProvider * Compile fix	2018-09-05 17:07:16 -07:00
Himanshu	d61f708ef5	make COMPLEX column optionally filterable in Druid code (#6223 ) * make COMPLEX column filterable in Druid code * Revert "make COMPLEX column filterable in Druid code" This reverts commit `9fc6ec768c`. * complex columns can be optionally made filterable * some types are always filterable * add ColumnCapabilitiesImpl serde tests * add SuppresedWarnings annotation	2018-09-05 12:28:49 -07:00
Gian Merlino	be6c901114	Like filter: Fix escapes escaping themselves. (#6295 ) Escapes should escape themselves.	2018-09-05 09:29:07 -07:00
Jonathan Wei	4caa61d8fa	Fix tutorial sample data filename, fix logger classname in metrics docs (#6299 )	2018-09-04 21:47:12 -07:00
QiuMM	84810f6358	correct metric name in emitter configuration files (#6290 )	2018-09-04 14:23:04 -07:00
adursun	71ac3ada21	Fix link related to metadata storage (#6294 )	2018-09-04 14:20:57 -07:00
Eyal Yurman	10ca290d64	Correct file name typo in Quickstart tutorial (#6297 ) Correct name wikipedia-2015-09-12-sampled.json.gz to wikiticker-2015-09-12-sampled.json.gz	2018-09-04 14:20:17 -07:00
Jonathan Wei	180e3ccfad	Docs consistency cleanup (#6259 )	2018-09-04 12:54:41 -07:00
Dayue Gao	743547fc3b	Unauthorized sql request should return 403 (#6279 )	2018-09-01 09:17:18 -07:00
Jonathan Wei	d0fb83760e	Fix PostgreSQLConnectorConfig binding (#6273 )	2018-08-31 14:18:29 -07:00
Dayue Gao	951b36e2bc	BytesFullResponseHandler should only consume readableBytes of ChannelBuffer (#6270 )	2018-08-30 20:22:08 -07:00
QiuMM	9b04846e6b	correct metric name in doc file (#6271 )	2018-08-30 10:57:35 -07:00
Gian Merlino	431d3d8497	Rename io.druid to org.apache.druid. (#6266 ) * Rename io.druid to org.apache.druid. * Fix META-INF files and remove some benchmark results. * MonitorsConfig update for metrics package migration. * Reorder some dimensions in inner queries for some reason. * Fix protobuf tests.	2018-08-30 09:56:26 -07:00
Himanshu	1fae6513e1	add "subtotalsSpec" attribute to groupBy query (#5280 ) * add subtotalsSpec attribute to groupBy query * dont sent subtotalsSpec to downstream nodes from broker and other updates * address review comment * fix checkstyle issues after merge to master * add docs for subtotalsSpec feature * address doc review comments	2018-08-28 17:46:38 -07:00
Dayue Gao	fcf8c8d53c	RowBasedKeySerde should use empty dictionary in constructor (#6256 )	2018-08-28 17:22:18 -07:00
Jonathan Wei	c9a27e3e8e	Don't let catch/finally suppress main exception in IncrementalPublishingKafkaIndexTaskRunner (#6258 )	2018-08-28 16:12:02 -07:00
Gian Merlino	80224df36a	SQL: Fix post-aggregator naming logic for sort-project. (#6250 ) The old code assumes that post-aggregator prefixes are one character long followed by numbers. This isn't always true (we may pad with underscores to avoid conflicts). Instead, the new code uses a different base prefix for sort-project postaggregators ("s" instead of "p") and uses the usual Calcites.findUnusedPrefix function to avoid conflicts.	2018-08-28 10:59:32 -07:00
Dayue Gao	a879022bc8	fix AssertionError of semi join query (#6244 )	2018-08-27 17:49:51 -07:00
Jim Slattery	d957295b98	spelling: storage (#6248 )	2018-08-27 16:35:31 -07:00
Dayue Gao	2325844a38	fix incorrect check of maxSemiJoinRowsInMemory (#6242 )	2018-08-27 16:28:29 -07:00
Gian Merlino	4a8b09b6a9	Fix NPE on constant null numeric expressions. (#6232 ) The bug was caused by makeExprEvalSelector returning a null object, which it isn't supposed to do. Fixed this by renaming ConstantColumnValueSelector to ConstantExprEvalSelector (it was only used for ExprEval anyway) and putting logic in that class to make sure the selectors behave as expected.	2018-08-27 15:30:56 -07:00
Gian Merlino	71c1a70ff6	FilteredBufferAggregator: Fix missing relocate, isNull methods. (#6233 )	2018-08-27 15:30:45 -07:00
Gian Merlino	157e75a1fe	Minor followup to #6220 . (#6231 ) Adjustments to comments and usage of generics.	2018-08-27 12:01:44 -05:00
Jihoon Son	bda5a8a95e	Fix NPE in KafkaSupervisor.checkpointTaskGroup (#6206 ) * Fix NPE in KafkaSupervisor.checkpointTaskGroup * address comments * address comment	2018-08-26 22:23:33 -07:00
Gian Merlino	0172326c62	SQL: Support more result formats, add columns header. (#6191 ) * SQL: Support more result formats, add columns header. - Add result formats for line-based JSON and CSV. - Add X-Druid-Sql-Columns header with a list of all columns that the response will contain. - Add more comprehensive documentation on what callers should expect when making Druid SQL queries. * Fix some tests. * Adjust tests. * Adjust trailer, add types header. * Fix trailers.	2018-08-26 23:00:14 -06:00
Jihoon Son	64d33eef7e	Fix timeout in KafkaSupervisorTest.testCheckpointForInactiveTaskGroup (#6207 ) * Fix timeout in KafkaSupervisorTest.testCheckpointForInactiveTaskGroup * fix npe * add taskRunner.getRunningTasks()	2018-08-26 19:59:01 -06:00
Gian Merlino	cb40b6d369	Fix all inspection errors currently reported. (#6236 ) * Fix all inspection errors currently reported. TeamCity builds on master are reporting inspection errors, possibly because there was a while where it was not running due to the Apache migration, and there was some drift. * Fix one more location. * Fix tests. * Another fix.	2018-08-26 18:36:01 -06:00
QiuMM	ef91fdbf03	Zstandard decompression support (#6224 )	2018-08-26 16:09:24 -07:00
Gian Merlino	23ba6f7ad7	Fix four bugs with numeric dimension output types. (#6220 ) * Fix four bugs with numeric dimension output types. This patch includes the following bug fixes: - TopNColumnSelectorStrategyFactory: Cast dimension values to the output type during dimExtractionScanAndAggregate instead of updateDimExtractionResults. This fixes a bug where, for example, grouping on doubles-cast-to-longs would fail to merge two doubles that should have been combined into the same long value. - TopNQueryEngine: Use DimExtractionTopNAlgorithm when treating string columns as numeric dimensions. This fixes a similar bug: grouping on string-cast-to-long would fail to merge two strings that should have been combined. - GroupByQuery: Cast numeric types to the expected output type before comparing them in compareDimsForLimitPushDown. This fixes #6123. - GroupByQueryQueryToolChest: Convert Jackson-deserialized dimension values into the proper output type. This fixes an inconsistency between results that came from cache vs. not-cache: for example, Jackson sometimes deserializes integers as Integers and sometimes as Longs. And the following code-cleanup changes, related to the fixes above: - DimensionHandlerUtils: Introduce convertObjectToType, compareObjectsAsType, and converterFromTypeToType to make it easier to handle casting operations. - TopN in general: Rename various "dimName" variables to "dimValue" where they actually represent dimension values. The old names were confusing. * Remove unused imports.	2018-08-25 14:31:46 -07:00
Himanshu	c3aaf8122d	fix TaskQueue-HRTR deadlock (#6212 ) * fix TaskQueue-HRTR deadlock causing https://github.com/apache/incubator-druid/issues/6201 * address review comments	2018-08-25 14:15:57 -07:00
Gian Merlino	28e6ae3664	SQL: Finalize aggregations for inner queries when necessary. (#6221 ) * SQL: Finalize aggregations for inner queries when necessary. Fixes #5779. * Fixed test method name.	2018-08-25 13:56:23 -07:00
QiuMM	9803ce954a	fix port conflict for druid peon (#6202 )	2018-08-23 19:05:13 -07:00
Himanshu	ddb26f2696	do not ignore ms in ruby time (#6217 )	2018-08-23 14:09:31 -07:00

8859 Commits