druid

Commit Graph

Author	SHA1	Message	Date
Mingming Qiu	849ba867b2	fix missing property in JsonTypeInfo of SegmentWriteOutMediumFactory (#6656 )	2018-11-27 15:59:58 -08:00
Roman Leventov	887c645675	Find duplicate lines with checkstyle; enable some duplicate inspections in IntelliJ (#6558 ) Not putting this to 0.13 milestone because the found bugs are not critical (one is a harmless DI config duplicate, and another is in a benchmark). Change in `DumpSegment` is just an indentation change.	2018-11-26 16:55:42 +01:00
Roman Leventov	87b96fb1fd	Add checkstyle rules about imports and empty lines between members (#6543 ) * Add checkstyle rules about imports and empty lines between members * Add suppressions * Update Eclipse import order * Add empty line * Fix StatsDEmitter	2018-11-20 12:42:15 +01:00
Gian Merlino	fe69da0d95	Expressions: Fix improper supplier reuse with missing columns. (#6600 ) * Expressions: Fix improper supplier reuse with missing columns. ExpressionSelectors has an optimization that skips building a Map when there is only one input supplier. However, this optimization should not be used in the case where there is one input supplier but more than one input identifier (which can happen when only one input identifier corresponds to an actual column). Fixes #6556. * Add underscores to statics.	2018-11-15 22:13:32 -08:00
David Lim	7b41e23cbb	remove backpressure time from DefaultQueryMetrics pending on-going discussion (#6631 )	2018-11-15 19:29:50 -07:00
Roman Leventov	8f3fe9cd02	Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies (#6607 ) * Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies * Fix bug * Replace checkstyle regexp with IntelliJ inspection	2018-11-15 13:21:34 -08:00
Jihoon Son	cdae2fe7b5	Deprecate IntervalChunkingQueryRunner (#6591 ) * Deprecate IntervalChunkingQueryRunner * add doc * deprecate metric * fix doc	2018-11-14 06:33:27 +08:00
Gian Merlino	52f6bdc1eb	Optimization for expressions that hit a single long column. (#6599 ) * Optimization for expressions that hit a single long column. There was previously a single-long-input optimization that applied only to the time column. These have been combined together. Also adds type-specific value caching to ExprEval, which allowed simplifying the SingleLongInputCachingExpressionColumnValueSelector code. * Add more benchmarks. * Don't use LRU cache for __time. * Simplify a bit. * Let the cache grow.	2018-11-13 09:36:32 -08:00
Roman Leventov	54351a5c75	Fix various bugs; Enable more IntelliJ inspections and update error-prone (#6490 ) * Fix various bugs; Enable more IntelliJ inspections and update error-prone * Fix NPE * Fix inspections * Remove unused imports	2018-11-06 14:38:08 -08:00
Roman Leventov	a2a1a1c2c9	Hide NullDimensionSelector from public (#6480 )	2018-11-02 04:38:21 -07:00
QiuMM	676f5e6d7f	Prohibit some guava collection APIs and use JDK collection APIs directly (#6511 ) * Prohibit some guava collection APIs and use JDK APIs directly * reset files that changed by accident * sort codestyle/druid-forbidden-apis.txt alphabetically	2018-10-29 13:02:43 +01:00
Samarth Jain	0a90b3d51a	Remove unused code (#6504 ) * Remove unused code * Remove usage of list in setDimensions and setAggregatorSpecs * Fix formatting to adhere to 120 character guideline	2018-10-26 11:31:10 -07:00
Roman Leventov	84ac18dc1b	Catch some incorrect method parameter or call argument formatting patterns with checkstyle (#6461 ) * Catch some incorrect method parameter or call argument formatting patterns with checkstyle * Fix DiscoveryModule * Inline parameters_and_arguments.txt * Fix a bug in PolyBind * Fix formatting	2018-10-23 07:17:38 -03:00
Samarth Jain	359576a80b	Implement force push down for nested group by query (#5471 ) * Force nested query push down * Code review changes	2018-10-22 13:43:47 -07:00
Roman Leventov	789c9a1dc7	Prohibit using Object|Long|Float|DoubleColumnSelector in instanceof statements (#6470 ) * Prohibit using Object|Long|Float|DoubleColumnSelector in instanceof statements * Doc fixes	2018-10-15 15:41:43 -07:00
robertervin	95ab1ea737	Fix Empty InDimFilter Failure (#6330 ) * fix empty InDimFilter failure (#6101) * Add test case for empty values input * Add documentation for empty values in InDimFilter	2018-10-14 20:43:16 -07:00
Clint Wylie	84598fba3b	combine druid-api, druid-common, java-util into druid-core (#6443 ) * combine druid-api, druid-common, java-util * spacing	2018-10-14 20:37:37 -07:00
dongyifeng	b06ac54a5e	add PrefixFilteredDimensionSpec for multi-value dimensions (#6307 ) * add PrefixFilteredDimensionSpec for multi-value dimensions * add docs for PrefixFilteredDimensionSpec * remove unnecessary null handling * add null check to the result of NullHandling	2018-10-12 17:51:09 -07:00
David Lim	20ab213ba6	change project versions to 0.13.0-incubating-SNAPSHOT (#6453 )	2018-10-11 19:28:01 -07:00
Charles Allen	c55b37d7ec	Add optional `name` to top level of FilteredAggregatorFactory (#6219 ) * Add optional `name` to top level of FilteredAggregatorFactory * Add compat constructor for tests * Address comments * Add equals and hash code updates * Rename test * Fix imports and code style	2018-10-11 11:56:53 -07:00
Clint Wylie	f7775d1db3	fixes for LookupReferencesManagerTest (#6444 ) * some fixes for LookupReferencesManagerTest * docs * formatting * more formatting fixes	2018-10-10 18:02:11 -07:00
Roman Leventov	09126c021a	Remove Aggregator.clone() methods (#6437 ) * Remove Aggregator.clone() methods * Remove CardinalityAggregator.name	2018-10-10 10:07:56 -03:00
QiuMM	0b8085aff7	Prohibit jackson ObjectMapper#reader methods which are deprecated (#6386 ) * Prohibit jackson ObjectMapper#reader methods which are deprecated * address comments	2018-10-03 17:55:20 -03:00
Roman Leventov	3ae563263a	Renamed 'Generic Column' -> 'Numeric Column'; Fixed a few resource leaks in processing; misc refinements (#5957 ) This PR accumulates many refactorings and small improvements that I did while preparing the next change set of https://github.com/druid-io/druid/projects/2. I finally decided to make them a separate PR to minimize the volume of the main PR. Some of the changes: - Renamed confusing "Generic Column" term to "Numeric Column" (what it actually implies) in many class names. - Generified `ComplexMetricExtractor`	2018-10-02 14:50:22 -03:00
Jihoon Son	cb14a43038	Remove ConvertSegmentTask, HadoopConverterTask, and ConvertSegmentBackwardsCompatibleTask (#6393 ) * Remove ConvertSegmentTask, HadoopConverterTask, and ConvertSegmentBackwardsCompatibleTask * update doc and remove auto conversion * remove remaining doc * fix teamcity	2018-10-01 12:03:35 -07:00
Shiv Toolsidass	5a894f830b	Added backpressure metric (#6335 ) * Added backpressure metric * Updated channelReadable to AtomicBoolean and fixed broken test * Moved backpressure metric logic to NettyHttpClient * Fix placement of calculating backPressureDuration	2018-09-29 14:24:04 -07:00
Jihoon Son	f09e718c68	Implement MapVirtualColumn.makeDimensionSelector properly (#6396 ) * Implement MapVirtualColumn.makeDimensionSelector properly * address comments	2018-09-29 14:13:05 -07:00
Jihoon Son	faf3f1e426	Fix cache keys of DefaultDimensionSpec and ExtractionDimensionSpec (#6390 )	2018-09-26 20:08:53 -07:00
Nishant Bangarwa	c9d281a2e9	Add ability to pass in Bloom filter from Hive Queries (#6222 ) * Bloom filter initial implementation fix checkstyle review comments Fix wierd failure review comments Revert "Fix wierd failure" This reverts commit a13a83ad7887e679f6d539191b52aeaaea85b613. * fix test * review comment	2018-09-26 16:04:26 -07:00
Jonathan Wei	00b0a156e9	Tweak isInvalidRows behavior in HadoopTuningConfig (#6339 ) * Tweak isInvalidRows behavior in HadoopTuningConfig * Fix tests	2018-09-24 16:13:13 -07:00
Alexander Saydakov	93345064b5	HllSketch module (#5712 ) * HllSketch module * updated license and imports * updated package name * implemented makeAggregateCombiner() * removed json marks * style fix * added module * removed unnecessary import, side effect of package renaming * use ThreadLocalRandom * addressing code review points, mostly formatting and comments * javadoc * natural order with nulls * typo * factored out raw input value extraction * singleton * style fix * style fix * use Collections.singletonList instead of Arrays.asList * suppress warning	2018-09-24 08:41:56 -07:00
Jonathan Wei	609da01882	Fix dictionary ID race condition in IncrementalIndexStorageAdapter (#6340 ) Possibly related to https://github.com/apache/incubator-druid/issues/4937 -------- There is currently a race condition in IncrementalIndexStorageAdapter that can lead to exceptions like the following, when running queries with filters on String dimensions that hit realtime tasks: ``` org.apache.druid.java.util.common.ISE: id[5] >= maxId[5] at org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector.lookupName(StringDimensionIndexer.java:591) at org.apache.druid.segment.StringDimensionIndexer$1IndexerDimensionSelector$2.matches(StringDimensionIndexer.java:562) at org.apache.druid.segment.incremental.IncrementalIndexStorageAdapter$IncrementalIndexCursor.advance(IncrementalIndexStorageAdapter.java:284) ``` When the `filterMatcher` is created in the constructor of `IncrementalIndexStorageAdapter.IncrementalIndexCursor`, `StringDimensionIndexer.makeDimensionSelector` gets called eventually, which calls: ``` final int maxId = getCardinality(); ... @Override public int getCardinality() { return dimLookup.size(); } ``` So `maxId` is set to the size of the dictionary at the time that the `filterMatcher` is created. However, the `maxRowIndex` which is meant to prevent the Cursor from returning rows that were added after the Cursor was created (see https://github.com/apache/incubator-druid/pull/4049) is set after the `filterMatcher` is created. If rows with new dictionary values are added after the `filterMatcher` is created but before `maxRowIndex` is set, then it is possible for the Cursor to return rows that contain the new values, which will have `id >= maxId`. This PR sets `maxRowIndex` before creating the `filterMatcher` to prevent rows with unknown dictionary IDs from being passed to the `filterMatcher`. ----------- The included test triggers the error with a custom Filter + DruidPredicateFactory. 
The DimensionSelector for predicate-based filter matching is created here in `Filters.makeValueMatcher`: ``` public static ValueMatcher makeValueMatcher( final ColumnSelectorFactory columnSelectorFactory, final String columnName, final DruidPredicateFactory predicateFactory ) { final ColumnCapabilities capabilities = columnSelectorFactory.getColumnCapabilities(columnName); // This should be folded into the ValueMatcherColumnSelectorStrategy once that can handle LONG typed columns. if (capabilities != null && capabilities.getType() == ValueType.LONG) { return getLongPredicateMatcher( columnSelectorFactory.makeColumnValueSelector(columnName), predicateFactory.makeLongPredicate() ); } final ColumnSelectorPlus<ValueMatcherColumnSelectorStrategy> selector = DimensionHandlerUtils.createColumnSelectorPlus( ValueMatcherColumnSelectorStrategyFactory.instance(), DefaultDimensionSpec.of(columnName), columnSelectorFactory ); return selector.getColumnSelectorStrategy().makeValueMatcher(selector.getSelector(), predicateFactory); } ``` The test Filter adds a row to the IncrementalIndex in the test when the predicateFactory creates a new String predicate, after `DimensionHandlerUtils.createColumnSelectorPlus` is called.	2018-09-18 10:43:29 +04:00
Roman Leventov	0c4bd2b57b	Prohibit some Random usage patterns (#6226 ) * Prohibit Random usage patterns * Fix FlattenJSONBenchmarkUtil	2018-09-14 13:35:51 -07:00
Roman Leventov	d50b69e6d4	Prohibit LinkedList (#6112 ) * Prohibit LinkedList * Fix tests * Fix * Remove unused import	2018-09-13 18:07:06 -07:00
Gian Merlino	d6cbdf86c2	Broker backpressure. (#6313 ) * Broker backpressure. Adds a new property "druid.broker.http.maxQueuedBytes" and a new context parameter "maxQueuedBytes". Both represent a maximum number of bytes queued per query before exerting backpressure on the channel to the data server. Fixes #4933. * Fix query context doc.	2018-09-10 09:33:29 -07:00
Himanshu	d61f708ef5	make COMPLEX column optionally filterable in Druid code (#6223 ) * make COMPLEX column filterable in Druid code * Revert "make COMPLEX column filterable in Druid code" This reverts commit `9fc6ec768c`. * complex columns can be optionally made filterable * some types are always filterable * add ColumnCapabilitiesImpl serde tests * add SuppressWarnings annotation	2018-09-05 12:28:49 -07:00
Gian Merlino	be6c901114	Like filter: Fix escapes escaping themselves. (#6295 ) Escapes should escape themselves.	2018-09-05 09:29:07 -07:00
Gian Merlino	431d3d8497	Rename io.druid to org.apache.druid. (#6266 ) * Rename io.druid to org.apache.druid. * Fix META-INF files and remove some benchmark results. * MonitorsConfig update for metrics package migration. * Reorder some dimensions in inner queries for some reason. * Fix protobuf tests.	2018-08-30 09:56:26 -07:00
Himanshu	1fae6513e1	add "subtotalsSpec" attribute to groupBy query (#5280 ) * add subtotalsSpec attribute to groupBy query * don't send subtotalsSpec to downstream nodes from broker and other updates * address review comment * fix checkstyle issues after merge to master * add docs for subtotalsSpec feature * address doc review comments	2018-08-28 17:46:38 -07:00
Dayue Gao	fcf8c8d53c	RowBasedKeySerde should use empty dictionary in constructor (#6256 )	2018-08-28 17:22:18 -07:00
Gian Merlino	4a8b09b6a9	Fix NPE on constant null numeric expressions. (#6232 ) The bug was caused by makeExprEvalSelector returning a null object, which it isn't supposed to do. Fixed this by renaming ConstantColumnValueSelector to ConstantExprEvalSelector (it was only used for ExprEval anyway) and putting logic in that class to make sure the selectors behave as expected.	2018-08-27 15:30:56 -07:00
Gian Merlino	71c1a70ff6	FilteredBufferAggregator: Fix missing relocate, isNull methods. (#6233 )	2018-08-27 15:30:45 -07:00
Gian Merlino	157e75a1fe	Minor followup to #6220 . (#6231 ) Adjustments to comments and usage of generics.	2018-08-27 12:01:44 -05:00
Gian Merlino	cb40b6d369	Fix all inspection errors currently reported. (#6236 ) * Fix all inspection errors currently reported. TeamCity builds on master are reporting inspection errors, possibly because there was a while where it was not running due to the Apache migration, and there was some drift. * Fix one more location. * Fix tests. * Another fix.	2018-08-26 18:36:01 -06:00
Gian Merlino	23ba6f7ad7	Fix four bugs with numeric dimension output types. (#6220 ) * Fix four bugs with numeric dimension output types. This patch includes the following bug fixes: - TopNColumnSelectorStrategyFactory: Cast dimension values to the output type during dimExtractionScanAndAggregate instead of updateDimExtractionResults. This fixes a bug where, for example, grouping on doubles-cast-to-longs would fail to merge two doubles that should have been combined into the same long value. - TopNQueryEngine: Use DimExtractionTopNAlgorithm when treating string columns as numeric dimensions. This fixes a similar bug: grouping on string-cast-to-long would fail to merge two strings that should have been combined. - GroupByQuery: Cast numeric types to the expected output type before comparing them in compareDimsForLimitPushDown. This fixes #6123. - GroupByQueryQueryToolChest: Convert Jackson-deserialized dimension values into the proper output type. This fixes an inconsistency between results that came from cache vs. not-cache: for example, Jackson sometimes deserializes integers as Integers and sometimes as Longs. And the following code-cleanup changes, related to the fixes above: - DimensionHandlerUtils: Introduce convertObjectToType, compareObjectsAsType, and converterFromTypeToType to make it easier to handle casting operations. - TopN in general: Rename various "dimName" variables to "dimValue" where they actually represent dimension values. The old names were confusing. * Remove unused imports.	2018-08-25 14:31:46 -07:00
Himanshu	a76bf9ab2a	add ability to do optional rollup in AggregationTestHelper (#6213 )	2018-08-22 16:38:36 -07:00
Benedict Jin	3647d4c94a	Make time-related variables more readable (#6158 ) * Make time-related variables more readable * Patch some improvements from the code reviewer * Remove unnecessary boxing of Long type variables	2018-08-21 15:29:40 -07:00
Kirill Kozlov	62e580050c	Use JUnit TemporaryFolder rule instead of system temp folder (#6070 ) * Use JUnit TemporaryFolder rule instead of system tmp folder * Allow forbidding APIs that are not present in all mvn modules	2018-08-16 11:05:45 -07:00
Jihoon Son	ecee3e0a24	Further optimize memory for Travis jobs (#6150 ) * Further optimize memory for Travis jobs * fix build * sudo false	2018-08-10 22:03:36 -07:00
Gian Merlino	3525d4059e	Cache: Add maxEntrySize config, make groupBy cacheable by default. (#5108 ) * Cache: Add maxEntrySize config. The idea is this makes it more feasible to cache query types that can potentially generate large result sets, like groupBy and select, without fear of writing too much to the cache per query. Includes a refactor of cache population code in CachingQueryRunner and CachingClusteredClient, such that they now use the same CachePopulator interface with two implementations: one for foreground and one for background. The main reason for splitting the foreground / background impls is that the foreground impl can have a more effective implementation of maxEntrySize. It can stop retaining subvalues for the cache early. * Add CachePopulatorStats. * Fix whitespace. * Fix docs. * Fix various tests. * Add tests. * Fix tests. * Better tests * Remove conflict markers. * Fix licenses.	2018-08-07 10:23:15 -07:00

2001 Commits